redis-developer
diff --git a/.github/ignore-notebooks.txt b/.github/ignore-notebooks.txt
Lines changed: 2 additions & 2 deletions
diff --git a/README.md b/README.md
Lines changed: 2 additions & 2 deletions
diff --git a/python-recipes/semantic-cache/semantic_caching_gemini.ipynb b/python-recipes/semantic-cache/00_semantic_caching_gemini.ipynb
python-recipes/semantic-cache/semantic_caching_gemini.ipynb renamed to python-recipes/semantic-cache/00_semantic_caching_gemini.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/python-recipes/semantic-cache/doc2cache_llama3_1.ipynb b/python-recipes/semantic-cache/01_doc2cache_llama3_1.ipynb
python-recipes/semantic-cache/doc2cache_llama3_1.ipynb renamed to python-recipes/semantic-cache/01_doc2cache_llama3_1.ipynb
diff --git a/python-recipes/semantic-cache/02_semantic_cache_optimization.ipynb b/python-recipes/semantic-cache/02_semantic_cache_optimization.ipynb
Lines changed: 315 additions & 0 deletions
@@ -1,5 +1,5 @@
 01_crewai_langgraph_redis
-doc2cache_llama3_1
-semantic_caching_gemini
+01_doc2cache_llama3_1
+00_semantic_caching_gemini
 01_collaborative_filtering
 05_nvidia_ai_rag_redis
@@ -77,8 +77,8 @@ An estimated 31% of LLM queries are potentially redundant ([source](https://arxi
 
 | Recipe | Description |
 | --- | --- |
-| [/semantic-cache/doc2cache_llama3_1.ipynb](python-recipes/semantic-cache/doc2cache_llama3_1.ipynb) | Build a semantic cache using the Doc2Cache framework and Llama3.1 |
-| [/semantic-cache/semantic_caching_gemini.ipynb](python-recipes/semantic-cache/semantic_caching_gemini.ipynb) | Build a semantic cache with Redis and Google Gemini |
+| [/semantic-cache/01_doc2cache_llama3_1.ipynb](python-recipes/semantic-cache/01_doc2cache_llama3_1.ipynb) | Build a semantic cache using the Doc2Cache framework and Llama3.1 |
+| [/semantic-cache/00_semantic_caching_gemini.ipynb](python-recipes/semantic-cache/00_semantic_caching_gemini.ipynb) | Build a semantic cache with Redis and Google Gemini |
 
 ### Semantic Routing
 Routing is a simple and effective way of preventing misuses with your AI application or for creating branching logic between data sources etc.
 
@@ -8,7 +8,7 @@
       "source": [
         "# Building a Semantic Cache with Redis and VertexAI Gemini Model\n",
         "\n",
-        "<a href=\"https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/semantic-cache/semantic_caching_gemini.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n"
+        "<a href=\"https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/semantic-cache/00_semantic_caching_gemini.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n"
       ]
     },
     {
 
@@ -0,0 +1,315 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Optimize semantic cache threshold with RedisVL\n",
+    "\n",
+    "> **Note:** Threshold optimization in RedisVL requires `python > 3.9`.\n",
+    "\n",
+    "<a href=\"https://colab.research.google.com/github/redis-developer/redis-ai-resources/blob/main/python-recipes/semantic-cache/02_semantic_cache_optimization.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# CacheThresholdOptimizer\n",
+    "\n",
+    "Suppose you set up the following semantic cache with a `distance_threshold` of `X` and store these entries:\n",
+    "\n",
+    "- prompt: `what is the capital of france?` response: `paris`\n",
+    "- prompt: `what is the capital of morocco?` response: `rabat`"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n",
+      "  warnings.warn(\n",
+      "/Users/robert.shelton/.pyenv/versions/3.11.9/lib/python3.11/site-packages/huggingface_hub/file_download.py:1142: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.\n",
+      "  warnings.warn(\n"
+     ]
+    }
+   ],
+   "source": [
+    "from redisvl.extensions.llmcache import SemanticCache\n",
+    "\n",
+    "sem_cache = SemanticCache(\n",
+    "    name=\"sem_cache\",                    # underlying search index name\n",
+    "    redis_url=\"redis://localhost:6379\",  # redis connection url string\n",
+    "    distance_threshold=0.5               # semantic cache distance threshold\n",
+    ")\n",
+    "\n",
+    "paris_key = sem_cache.store(prompt=\"what is the capital of france?\", response=\"paris\")\n",
+    "rabat_key = sem_cache.store(prompt=\"what is the capital of morocco?\", response=\"rabat\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This works, but the cache should only return hits for appropriate questions. Testing it with a question that should not get a cached response shows that the current `distance_threshold` of 0.5 is too high: the query below matches the France entry at a vector distance of about 0.42."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',\n",
+       "  'prompt': 'what is the capital of france?',\n",
+       "  'response': 'paris',\n",
+       "  'vector_distance': 0.421104669571,\n",
+       "  'inserted_at': 1741039231.99,\n",
+       "  'updated_at': 1741039231.99,\n",
+       "  'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sem_cache.check(\"what's the capital of britain?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Define test_data and optimize\n",
+    "\n",
+    "With the `CacheThresholdOptimizer` you can quickly tune the distance threshold by providing some test data in the form:\n",
+    "\n",
+    "```python\n",
+    "[\n",
+    "    {\n",
+    "        \"query\": \"What's the capital of Britain?\",\n",
+    "        \"query_match\": \"\"\n",
+    "    },\n",
+    "    {\n",
+    "        \"query\": \"What's the capital of France??\",\n",
+    "        \"query_match\": paris_key\n",
+    "    },\n",
+    "    {\n",
+    "        \"query\": \"What's the capital city of Morocco?\",\n",
+    "        \"query_match\": rabat_key\n",
+    "    },\n",
+    "]\n",
+    "```\n",
+    "\n",
+    "An empty `query_match` marks a query that should not hit the cache. The threshold optimizer then evaluates and scores different thresholds against the entries currently in your cache and automatically updates the cache's threshold to the best-scoring value."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Distance threshold before: 0.5 \n",
+      "\n",
+      "Distance threshold after: 0.13050847457627118 \n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from redisvl.utils.optimize import CacheThresholdOptimizer\n",
+    "\n",
+    "test_data = [\n",
+    "    {\n",
+    "        \"query\": \"What's the capital of Britain?\",\n",
+    "        \"query_match\": \"\"\n",
+    "    },\n",
+    "    {\n",
+    "        \"query\": \"What's the capital of France??\",\n",
+    "        \"query_match\": paris_key\n",
+    "    },\n",
+    "    {\n",
+    "        \"query\": \"What's the capital city of Morocco?\",\n",
+    "        \"query_match\": rabat_key\n",
+    "    },\n",
+    "]\n",
+    "\n",
+    "print(f\"Distance threshold before: {sem_cache.distance_threshold} \\n\")\n",
+    "optimizer = CacheThresholdOptimizer(sem_cache, test_data)\n",
+    "optimizer.optimize()\n",
+    "print(f\"Distance threshold after: {sem_cache.distance_threshold} \\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We can also see that we no longer match on the incorrect example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[]"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sem_cache.check(\"what's the capital of britain?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "But still match on highly relevant prompts:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[{'entry_id': 'c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3',\n",
+       "  'prompt': 'what is the capital of france?',\n",
+       "  'response': 'paris',\n",
+       "  'vector_distance': 0.0835866332054,\n",
+       "  'inserted_at': 1741039231.99,\n",
+       "  'updated_at': 1741039231.99,\n",
+       "  'key': 'sem_cache:c990cc06e5e77570e5f03360426d2b7f947cbb5a67daa8af8164bfe0b3e24fe3'}]"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "sem_cache.check(\"what's the capital city of france?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Additional configuration\n",
+    "\n",
+    "By default, threshold optimization selects the threshold with the highest `F1` score. It can instead rank thresholds by `precision` or `recall` when you pass the `eval_metric` keyword argument."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Distance threshold before: 0.5 \n",
+      "\n"
+     ]
+    },
+    {
+     "ename": "NameError",
+     "evalue": "name 'CacheThresholdOptimizer' is not defined",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
+      "Cell \u001b[0;32mIn[2], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDistance threshold before: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00msem_cache\u001b[38;5;241m.\u001b[39mdistance_threshold\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m \u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m----> 2\u001b[0m optimizer \u001b[38;5;241m=\u001b[39m \u001b[43mCacheThresholdOptimizer\u001b[49m(sem_cache, test_data, eval_metric\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mprecision\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m      3\u001b[0m optimizer\u001b[38;5;241m.\u001b[39moptimize()\n\u001b[1;32m      4\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDistance threshold after: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00msem_cache\u001b[38;5;241m.\u001b[39mdistance_threshold\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m \u001b[39m\u001b[38;5;130;01m\\n\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n",
+      "\u001b[0;31mNameError\u001b[0m: name 'CacheThresholdOptimizer' is not defined"
+     ]
+    }
+   ],
+   "source": [
+    "print(f\"Distance threshold before: {sem_cache.distance_threshold} \\n\")\n",
+    "optimizer = CacheThresholdOptimizer(sem_cache, test_data, eval_metric=\"precision\")\n",
+    "optimizer.optimize()\n",
+    "print(f\"Distance threshold after: {sem_cache.distance_threshold} \\n\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(f\"Distance threshold before: {sem_cache.distance_threshold} \\n\")\n",
+    "optimizer = CacheThresholdOptimizer(sem_cache, test_data, eval_metric=\"recall\")\n",
+    "optimizer.optimize()\n",
+    "print(f\"Distance threshold after: {sem_cache.distance_threshold} \\n\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Note**: The `CacheThresholdOptimizer` class also accepts an optional `opt_fn` argument for defining custom optimization logic. See the [source code](https://github.com/redis/redis-vl-python/blob/18ff1008c5a40353c97c176d3d30028a87ff777a/redisvl/utils/optimize/cache.py#L48-L49) for reference."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Cleanup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sem_cache.delete()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  },
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
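The threshold search that `02_semantic_cache_optimization.ipynb` describes can be pictured as a small grid search over candidate thresholds, scored against labeled test queries. The sketch below is not RedisVL's implementation and does not call RedisVL or Redis. The helper names `score` and `best_threshold`, the 0.01 grid step, the `distance <= threshold` hit rule, and the tie-break toward the smallest threshold are all assumptions made for illustration. The sample distances are illustrative too: 0.421 and 0.084 are rounded from the `vector_distance` values recorded in the notebook outputs, and 0.125 is made up.

```python
from typing import List, Tuple

# (distance from the query to its nearest cached entry, should this query hit the cache?)
Sample = Tuple[float, bool]


def score(samples: List[Sample], threshold: float, metric: str = "f1") -> float:
    """Score one candidate threshold, treating distance <= threshold as a cache hit."""
    tp = sum(1 for d, want in samples if want and d <= threshold)      # correct hits
    fp = sum(1 for d, want in samples if not want and d <= threshold)  # unwanted hits
    fn = sum(1 for d, want in samples if want and d > threshold)       # missed hits
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if metric == "precision":
        return precision
    if metric == "recall":
        return recall
    if metric != "f1":
        raise ValueError(f"unknown metric: {metric}")
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def best_threshold(samples: List[Sample], metric: str = "f1", steps: int = 100) -> float:
    """Grid-search thresholds in (0, 1]; max() keeps the smallest threshold on ties."""
    grid = [i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda t: score(samples, t, metric))


# Illustrative data only (see the note above about where these numbers come from).
samples = [
    (0.421, False),  # "capital of Britain" should NOT hit the France entry
    (0.084, True),   # a France paraphrase should hit
    (0.125, True),   # hypothetical distance for a Morocco paraphrase
]

for metric in ("f1", "precision", "recall"):
    print(metric, best_threshold(samples, metric))
```

With these samples, any threshold between the largest wanted distance and the smallest unwanted one separates the two groups perfectly, which is why an optimized threshold can land well below the starting value of 0.5. Ranking by precision alone tends to favor tighter thresholds, while ranking by recall only requires that every wanted query is covered.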