diff --git a/docs/api/query.rst b/docs/api/query.rst index b8e1c875f..4602f4f75 100644 --- a/docs/api/query.rst +++ b/docs/api/query.rst @@ -251,3 +251,13 @@ SQLQuery SQLQuery translates SQL SELECT statements into Redis FT.SEARCH or FT.AGGREGATE commands. The SQL syntax supports WHERE clauses, field selection, ordering, and parameterized queries for vector similarity searches. + +.. note:: + SQLQuery accepts a ``sql_redis_options`` dictionary that is passed through to + ``sql-redis`` executor creation. The most common option is + ``schema_cache_strategy``: + + - ``"lazy"`` (default) loads schemas on demand, which keeps one-off or + narrow queries cheaper. + - ``"load_all"`` eagerly loads all schemas up front, which can help when + running many SQL queries across many indexes. diff --git a/docs/concepts/queries.md b/docs/concepts/queries.md index cb4873e6f..05adfe25e 100644 --- a/docs/concepts/queries.md +++ b/docs/concepts/queries.md @@ -180,6 +180,44 @@ query = SQLQuery(""" results = index.query(query) ``` +`SQLQuery` also accepts `sql_redis_options`, which are forwarded to the +underlying `sql-redis` executor. This is mainly useful for tuning schema +caching behavior. + +```python +query = SQLQuery( + """ + SELECT title, price, category + FROM products + WHERE category = 'electronics' AND price < 100 + """, + sql_redis_options={"schema_cache_strategy": "lazy"}, +) +``` + +- `"lazy"` (default) loads schemas only when a query touches an index, which + keeps startup and one-off queries cheaper. +- `"load_all"` preloads all schemas up front, which can help repeated query + workloads that span many indexes. 
+ +For TEXT fields with `sql-redis >= 0.4.0`: + +- `=` performs exact phrase or exact-term matching +- `LIKE` performs prefix/suffix/contains matching using SQL `%` wildcards +- `fuzzy(field, 'term')` performs typo-tolerant matching +- `fulltext(field, 'query')` performs tokenized search + +```python +query = SQLQuery("SELECT * FROM products WHERE title = 'gaming laptop'") +query = SQLQuery("SELECT * FROM products WHERE title LIKE 'lap%'") +query = SQLQuery("SELECT * FROM products WHERE fuzzy(title, 'laptap')") +query = SQLQuery("SELECT * FROM products WHERE fulltext(title, 'laptop OR tablet')") +``` + +Use `=` when you want an exact phrase, `LIKE` for prefix/suffix/contains +patterns, `fuzzy()` for typo-tolerant lookup, and `fulltext()` for tokenized +search operators such as `OR`, optional terms, or proximity. + **Aggregations and grouping:** ```python diff --git a/docs/user_guide/12_sql_to_redis_queries.ipynb b/docs/user_guide/12_sql_to_redis_queries.ipynb index 2d738fcb7..014474b69 100644 --- a/docs/user_guide/12_sql_to_redis_queries.ipynb +++ b/docs/user_guide/12_sql_to_redis_queries.ipynb @@ -1,1823 +1,1845 @@ { - "cells": [ - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Write SQL Queries for Redis\n", - "\n", - "While Redis does not natively support SQL, RedisVL provides a `SQLQuery` class that translates SQL-like queries into Redis queries.\n", - "\n", - "The `SQLQuery` class wraps the [`sql-redis`](https://pypi.org/project/sql-redis/) package. 
This package is not installed by default, so install it with:\n", - "\n", - "```bash\n", - "pip install redisvl[sql-redis]\n", - "```\n", - "\n", - "## Prerequisites\n", - "\n", - "Before you begin, ensure you have:\n", - "- Installed RedisVL with SQL support: `pip install redisvl[sql-redis]`\n", - "- A running Redis instance ([Redis 8+](https://redis.io/downloads/) or [Redis Cloud](https://redis.io/cloud))\n", - "\n", - "## What You'll Learn\n", - "\n", - "By the end of this guide, you will be able to:\n", - "- Write SQL-like queries for Redis using `SQLQuery`\n", - "- Translate SELECT, WHERE, and ORDER BY clauses to Redis queries\n", - "- Combine SQL queries with vector search\n", - "- Use aggregate functions and grouping\n", - "- Query geographic data with `geo_distance()`\n", - "- Filter and extract date/time data with `YEAR()`, `MONTH()`, and `DATE_FORMAT()`\n", - "\n", - "## Table of Contents\n", - "\n", - "1. [Define the schema](#define-the-schema)\n", - "2. [Create sample dataset](#create-sample-dataset)\n", - "3. [Create a SearchIndex](#create-a-searchindex)\n", - "4. [Load data](#load-data)\n", - "5. [Write SQL queries](#write-sql-queries)\n", - "6. [Query types](#query-types)\n", - " - [Text searches](#text-searches)\n", - " - [Aggregations](#aggregations)\n", - " - [Vector search](#vector-search)\n", - " - [Geographic queries](#geographic-queries)\n", - " - [Date and datetime queries](#date-and-datetime-queries)\n", - "7. [Async support](#async-support)\n", - "8. [Additional query examples](#additional-query-examples)\n", - "9. 
[Cleanup](#cleanup)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Define the schema" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "from redisvl.utils.vectorize import HFTextVectorizer\n", - "\n", - "hf = HFTextVectorizer()\n", - "\n", - "schema = {\n", - " \"index\": {\n", - " \"name\": \"user_simple\",\n", - " \"prefix\": \"user_simple_docs\",\n", - " \"storage_type\": \"json\",\n", - " },\n", - " \"fields\": [\n", - " {\"name\": \"user\", \"type\": \"tag\"},\n", - " {\"name\": \"region\", \"type\": \"tag\"},\n", - " {\"name\": \"job\", \"type\": \"tag\"},\n", - " {\"name\": \"job_description\", \"type\": \"text\"},\n", - " {\"name\": \"age\", \"type\": \"numeric\"},\n", - " {\"name\": \"office_location\", \"type\": \"geo\"},\n", - " {\n", - " \"name\": \"job_embedding\",\n", - " \"type\": \"vector\",\n", - " \"attrs\": {\n", - " \"dims\": len(hf.embed(\"get embed length\")),\n", - " \"distance_metric\": \"cosine\",\n", - " \"algorithm\": \"flat\",\n", - " \"datatype\": \"float32\"\n", - " }\n", - " }\n", - " ]\n", - "}" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Create sample dataset" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "# Office locations use \"longitude,latitude\" format (lon,lat - Redis convention)\n", - "# San Francisco: -122.4194, 37.7749\n", - "# Chicago: -87.6298, 41.8781\n", - "# New York: -73.9857, 40.7580\n", - "data = [\n", - " {\n", - " 'user': 'john',\n", - " 'age': 34,\n", - " 'job': 'software engineer',\n", - " 'region': 'us-west',\n", - " 'job_description': 'Designs, develops, and maintains software applications and systems.',\n", - " 'office_location': '-122.4194,37.7749' # San Francisco\n", - " },\n", - " {\n", - " 'user': 'bill',\n", - " 'age': 54,\n", - " 'job': 'engineer',\n", - " 'region': 'us-central',\n", - " 
'job_description': 'Applies scientific and mathematical principles to solve technical problems.',\n", - " 'office_location': '-87.6298,41.8781' # Chicago\n", - " },\n", - " {\n", - " 'user': 'mary',\n", - " 'age': 24,\n", - " 'job': 'doctor',\n", - " 'region': 'us-central',\n", - " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", - " 'office_location': '-87.6298,41.8781' # Chicago\n", - " },\n", - " {\n", - " 'user': 'joe',\n", - " 'age': 27,\n", - " 'job': 'dentist',\n", - " 'region': 'us-east',\n", - " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", - " 'office_location': '-73.9857,40.7580' # New York\n", - " },\n", - " {\n", - " 'user': 'stacy',\n", - " 'age': 61,\n", - " 'job': 'project manager',\n", - " 'region': 'us-west',\n", - " 'job_description': 'Plans, organizes, and oversees projects from inception to completion.',\n", - " 'office_location': '-122.4194,37.7749' # San Francisco\n", - " }\n", - "]\n", - "\n", - "data = [\n", - " { \n", - " **d,\n", - " \"job_embedding\": hf.embed(f\"{d['job_description']=} {d['job']=}\"),\n", - " } \n", - " for d in data\n", - "]" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Create a SearchIndex\n", - "\n", - "With the schema and sample dataset ready, create a `SearchIndex`." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Bring your own Redis connection instance\n", - "\n", - "This is ideal in scenarios where you have custom settings on the connection instance or if your application will share a connection pool:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "from redisvl.index import SearchIndex\n", - "from redis import Redis\n", - "\n", - "client = Redis.from_url(\"redis://localhost:6379\")\n", - "index = SearchIndex.from_dict(schema, redis_client=client)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Let the index manage the connection instance\n", - "\n", - "This is ideal for simple cases:" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "index = SearchIndex.from_dict(schema, redis_url=\"redis://localhost:6379\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create the index\n", - "\n", - "Now that we are connected to Redis, we need to run the create command." - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "index.create(overwrite=True, drop=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Load data\n", - "\n", - "Load the sample dataset to Redis.\n", - "\n", - "### Validate data entries on load\n", - "RedisVL uses pydantic validation under the hood to ensure loaded data is valid and conforms to your schema. This setting is optional and can be configured via `validate_on_load=True` in the `SearchIndex` class.\n", - "\n", - "> **Note**: This guide omits `validate_on_load` because GEO fields use `longitude,latitude` format (Redis convention), which differs from the validation expectation. A future RedisVL release will align GEO validation with Redis conventions." 
- ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "['user_simple_docs:01KKW5E98T5J7YC5JT3CR0GG9F', 'user_simple_docs:01KKW5E98T5J7YC5JT3CR0GG9G', 'user_simple_docs:01KKW5E98T5J7YC5JT3CR0GG9H', 'user_simple_docs:01KKW5E98T5J7YC5JT3CR0GG9J', 'user_simple_docs:01KKW5E98T5J7YC5JT3CR0GG9K']\n" - ] - } - ], - "source": [ - "keys = index.load(data)\n", - "\n", - "print(keys)" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Write SQL queries\n", - "\n", - "First, let's test a simple select statement such as the one below." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "from redisvl.query import SQLQuery\n", - "\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, age\n", - " FROM user_simple\n", - " WHERE age > 17\n", - " \"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Check the created query string" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'FT.SEARCH user_simple \"@age:[(17 +inf]\" RETURN 4 user region job age'" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Executing the query" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[{'user': 'mary', 'region': 'us-central', 'job': 'doctor', 'age': '24'},\n", - " {'user': 'joe', 'region': 'us-east', 'job': 'dentist', 'age': '27'},\n", - " {'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", - " {'user': 'stacy', 'region': 
'us-west', 'job': 'project manager', 'age': '61'},\n", - " {'user': 'john',\n", - " 'region': 'us-west',\n", - " 'job': 'software engineer',\n", - " 'age': '34'}]" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Query types\n", - "\n", - "### Conditional operators" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@age:[(17 +inf] @region:{us\\-west}\" RETURN 4 user region job age\n" - ] + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Write SQL Queries for Redis\n", + "\n", + "While Redis does not natively support SQL, RedisVL provides a `SQLQuery` class that translates SQL-like queries into Redis queries.\n", + "\n", + "The `SQLQuery` class wraps the [`sql-redis`](https://pypi.org/project/sql-redis/) package. 
This package is not installed by default, so install it with:\n", + "\n", + "```bash\n", + "pip install redisvl[sql-redis]\n", + "```\n", + "\n", + "## Prerequisites\n", + "\n", + "Before you begin, ensure you have:\n", + "- Installed RedisVL with SQL support: `pip install redisvl[sql-redis]`\n", + "- A running Redis instance ([Redis 8+](https://redis.io/downloads/) or [Redis Cloud](https://redis.io/cloud))\n", + "\n", + "## What You'll Learn\n", + "\n", + "By the end of this guide, you will be able to:\n", + "- Write SQL-like queries for Redis using `SQLQuery`\n", + "- Translate SELECT, WHERE, and ORDER BY clauses to Redis queries\n", + "- Combine SQL queries with vector search\n", + "- Use aggregate functions and grouping\n", + "- Query geographic data with `geo_distance()`\n", + "- Filter and extract date/time data with `YEAR()`, `MONTH()`, and `DATE_FORMAT()`\n", + "\n", + "## Table of Contents\n", + "\n", + "1. [Define the schema](#define-the-schema)\n", + "2. [Create sample dataset](#create-sample-dataset)\n", + "3. [Create a SearchIndex](#create-a-searchindex)\n", + "4. [Load data](#load-data)\n", + "5. [Write SQL queries](#write-sql-queries)\n", + "6. [Query types](#query-types)\n", + " - [Text searches](#text-searches)\n", + " - [Aggregations](#aggregations)\n", + " - [Vector search](#vector-search)\n", + " - [Geographic queries](#geographic-queries)\n", + " - [Date and datetime queries](#date-and-datetime-queries)\n", + "7. [Async support](#async-support)\n", + "8. [Additional query examples](#additional-query-examples)\n", + "9. [Cleanup](#cleanup)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Define the schema" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/robert.shelton/Documents/redis-vl-python/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. 
Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", + " from .autonotebook import tqdm as notebook_tqdm\n" + ] + } + ], + "source": [ + "from redisvl.utils.vectorize import HFTextVectorizer\n", + "\n", + "hf = HFTextVectorizer()\n", + "\n", + "schema = {\n", + " \"index\": {\n", + " \"name\": \"user_simple\",\n", + " \"prefix\": \"user_simple_docs\",\n", + " \"storage_type\": \"json\",\n", + " },\n", + " \"fields\": [\n", + " {\"name\": \"user\", \"type\": \"tag\"},\n", + " {\"name\": \"region\", \"type\": \"tag\"},\n", + " {\"name\": \"job\", \"type\": \"tag\"},\n", + " {\"name\": \"job_description\", \"type\": \"text\"},\n", + " {\"name\": \"age\", \"type\": \"numeric\"},\n", + " {\"name\": \"office_location\", \"type\": \"geo\"},\n", + " {\n", + " \"name\": \"job_embedding\",\n", + " \"type\": \"vector\",\n", + " \"attrs\": {\n", + " \"dims\": len(hf.embed(\"get embed length\")),\n", + " \"distance_metric\": \"cosine\",\n", + " \"algorithm\": \"flat\",\n", + " \"datatype\": \"float32\"\n", + " }\n", + " }\n", + " ]\n", + "}" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create sample dataset" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "# Office locations use \"longitude,latitude\" format (lon,lat - Redis convention)\n", + "# San Francisco: -122.4194, 37.7749\n", + "# Chicago: -87.6298, 41.8781\n", + "# New York: -73.9857, 40.7580\n", + "data = [\n", + " {\n", + " 'user': 'john',\n", + " 'age': 34,\n", + " 'job': 'software engineer',\n", + " 'region': 'us-west',\n", + " 'job_description': 'Designs, develops, and maintains software applications and systems.',\n", + " 'office_location': '-122.4194,37.7749' # San Francisco\n", + " },\n", + " {\n", + " 'user': 'bill',\n", + " 'age': 54,\n", + " 'job': 'engineer',\n", + " 'region': 'us-central',\n", + " 'job_description': 'Applies scientific 
and mathematical principles to solve technical problems.',\n", + " 'office_location': '-87.6298,41.8781' # Chicago\n", + " },\n", + " {\n", + " 'user': 'mary',\n", + " 'age': 24,\n", + " 'job': 'doctor',\n", + " 'region': 'us-central',\n", + " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", + " 'office_location': '-87.6298,41.8781' # Chicago\n", + " },\n", + " {\n", + " 'user': 'joe',\n", + " 'age': 27,\n", + " 'job': 'dentist',\n", + " 'region': 'us-east',\n", + " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", + " 'office_location': '-73.9857,40.7580' # New York\n", + " },\n", + " {\n", + " 'user': 'stacy',\n", + " 'age': 61,\n", + " 'job': 'project manager',\n", + " 'region': 'us-west',\n", + " 'job_description': 'Plans, organizes, and oversees projects from inception to completion.',\n", + " 'office_location': '-122.4194,37.7749' # San Francisco\n", + " }\n", + "]\n", + "\n", + "data = [\n", + " { \n", + " **d,\n", + " \"job_embedding\": hf.embed(f\"{d['job_description']=} {d['job']=}\"),\n", + " } \n", + " for d in data\n", + "]" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Create a SearchIndex\n", + "\n", + "With the schema and sample dataset ready, create a `SearchIndex`." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Bring your own Redis connection instance\n", + "\n", + "This is ideal in scenarios where you have custom settings on the connection instance or if your application will share a connection pool:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.index import SearchIndex\n", + "from redis import Redis\n", + "\n", + "client = Redis.from_url(\"redis://localhost:6379\")\n", + "index = SearchIndex.from_dict(schema, redis_client=client)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Let the index manage the connection instance\n", + "\n", + "This is ideal for simple cases:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "index = SearchIndex.from_dict(schema, redis_url=\"redis://localhost:6379\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Create the index\n", + "\n", + "Now that we are connected to Redis, we need to run the create command." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "index.create(overwrite=True, drop=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load data\n", + "\n", + "Load the sample dataset to Redis.\n", + "\n", + "### Validate data entries on load\n", + "RedisVL uses pydantic validation under the hood to ensure loaded data is valid and conforms to your schema. This setting is optional and can be configured via `validate_on_load=True` in the `SearchIndex` class.\n", + "\n", + "> **Note**: This guide omits `validate_on_load` because GEO fields use `longitude,latitude` format (Redis convention), which differs from the validation expectation. A future RedisVL release will align GEO validation with Redis conventions." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "['user_simple_docs:01KN7Y4J630537VY4Y5D9EZMYX', 'user_simple_docs:01KN7Y4J630537VY4Y5D9EZMYY', 'user_simple_docs:01KN7Y4J630537VY4Y5D9EZMYZ', 'user_simple_docs:01KN7Y4J630537VY4Y5D9EZMZ0', 'user_simple_docs:01KN7Y4J630537VY4Y5D9EZMZ1']\n" + ] + } + ], + "source": [ + "keys = index.load(data)\n", + "\n", + "print(keys)" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Write SQL queries\n", + "\n", + "First, let's test a simple select statement such as the one below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.query import SQLQuery\n", + "\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, age\n", + " FROM user_simple\n", + " WHERE age > 17\n", + " \"\"\"\n", + "\n", + "# Optional sql_redis_options are passed through to sql-redis.\n", + "# schema_cache_strategy balances startup cost vs repeated-query speed:\n", + "# use \"lazy\" (default) to load schemas on demand, or \"load_all\"\n", + "# to preload schemas up front for broader repeated-query workloads.\n", + "sql_query = SQLQuery(\n", + " sql_str, sql_redis_options={\"schema_cache_strategy\": \"lazy\"}\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Check the created query string" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'FT.SEARCH user_simple \"@age:[(17 +inf]\" RETURN 4 user region job age DIALECT 2'" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Executing the query" + ] + }, + { 
+ "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[{'user': 'john',\n", + " 'region': 'us-west',\n", + " 'job': 'software engineer',\n", + " 'age': '34'},\n", + " {'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", + " {'user': 'mary', 'region': 'us-central', 'job': 'doctor', 'age': '24'},\n", + " {'user': 'joe', 'region': 'us-east', 'job': 'dentist', 'age': '27'},\n", + " {'user': 'stacy', 'region': 'us-west', 'job': 'project manager', 'age': '61'}]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Query types\n", + "\n", + "### Conditional operators" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@age:[(17 +inf] @region:{us\\-west}\" RETURN 4 user region job age DIALECT 2\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'john',\n", + " 'region': 'us-west',\n", + " 'job': 'software engineer',\n", + " 'age': '34'},\n", + " {'user': 'stacy', 'region': 'us-west', 'job': 'project manager', 'age': '61'}]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_str = \"\"\"\n", + " SELECT user, region, job, age\n", + " FROM user_simple\n", + " WHERE age > 17 and region = 'us-west'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": 
"stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"((@region:{us\\-west})|(@region:{us\\-central}))\" RETURN 4 user region job age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'mary', 'region': 'us-central', 'job': 'doctor', 'age': '24'},\n", + " {'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", + " {'user': 'stacy', 'region': 'us-west', 'job': 'project manager', 'age': '61'},\n", + " {'user': 'john',\n", + " 'region': 'us-west',\n", + " 'job': 'software engineer',\n", + " 'age': '34'}]" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_str = \"\"\"\n", + " SELECT user, region, job, age\n", + " FROM user_simple\n", + " WHERE region = 'us-west' or region = 'us-central'\n", + " \"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job:{software engineer|engineer|pancake tester}\" RETURN 4 user region job age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", + " {'user': 'john',\n", + " 'region': 'us-west',\n", + " 'job': 'software engineer',\n", + " 'age': '34'}]" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# job is a tag field therefore this syntax works\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, age\n", + " FROM user_simple\n", + " WHERE job IN ('software engineer', 'engineer', 'pancake tester')\n", + " \"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = 
sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Text searches\n", + "\n", + "See [the docs](https://redis.io/docs/latest/develop/ai/search-and-query/query/full-text/) for available text queries in Redis.\n", + "\n", + "For more on exact matching see [here](https://redis.io/docs/latest/develop/ai/search-and-query/query/exact-match/).\n", + "\n", + "With `sql-redis >= 0.4.0`, TEXT search operators are explicit:\n", + "\n", + "- `WHERE job_description = 'healthcare including'` for exact phrase matching\n", + "- `WHERE job_description LIKE 'sci%'`, `LIKE '%care'`, or `LIKE '%diagnose%'` for wildcard matching\n", + "- `WHERE fuzzy(job_description, 'diagnose')` for typo-tolerant matching\n", + "- `WHERE fulltext(job_description, 'healthcare OR diagnosing')` for tokenized search\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:sci*\" RETURN 5 user region job job_description age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'bill',\n", + " 'region': 'us-central',\n", + " 'job': 'engineer',\n", + " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.',\n", + " 'age': '54'}]" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Prefix (LIKE)\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, job_description, age\n", + " FROM user_simple\n", + " WHERE job_description LIKE 'sci%'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", 
redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:*care\" RETURN 5 user region job job_description age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'mary',\n", + " 'region': 'us-central',\n", + " 'job': 'doctor',\n", + " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", + " 'age': '24'},\n", + " {'user': 'joe',\n", + " 'region': 'us-east',\n", + " 'job': 'dentist',\n", + " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", + " 'age': '27'}]" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Suffix (LIKE)\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, job_description, age\n", + " FROM user_simple\n", + " WHERE job_description LIKE '%care'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:*diagnose*\" RETURN 5 user region job job_description age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'mary',\n", + " 'region': 'us-central',\n", + " 'job': 'doctor',\n", + " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", + " 'age': '24'}]" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": 
"execute_result" + } + ], + "source": [ + "# Contains (LIKE)\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, job_description, age\n", + " FROM user_simple\n", + " WHERE job_description LIKE '%diagnose%'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:\"healthcare including\"\" RETURN 5 user region job job_description age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'joe',\n", + " 'region': 'us-east',\n", + " 'job': 'dentist',\n", + " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", + " 'age': '27'}]" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Phrase no stop words\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, job_description, age\n", + " FROM user_simple\n", + " WHERE job_description = 'healthcare including'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:\"diagnosing treating\"\" RETURN 5 user region job job_description age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'joe',\n", + " 'region': 'us-east',\n", + " 'job': 'dentist',\n", + " 
'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", + " 'age': '27'}]" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Phrase with stop words (sql-redis strips default stop words and warns)\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, job_description, age\n", + " FROM user_simple\n", + " WHERE job_description = 'diagnosing and treating'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@age:[40 60]\" RETURN 4 user region job age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'}]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Numeric range (BETWEEN is inclusive on both ends)\n", + "sql_str = \"\"\"\n", + " SELECT user, region, job, age\n", + " FROM user_simple\n", + " WHERE age BETWEEN 40 AND 60\n", + " \"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Aggregations\n", + "\n", + "See the Redis documentation for the [supported GROUPBY reducer functions](https://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/aggregations/#supported-groupby-reducers)."
+ ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE user_simple \"*\" LOAD 3 @age @region @user GROUPBY 1 @region REDUCE COUNT 0 AS count_age REDUCE COUNT_DISTINCT 1 @age AS count_distinct_age REDUCE MIN 1 @age AS min_age REDUCE MAX 1 @age AS max_age REDUCE AVG 1 @age AS avg_age REDUCE STDDEV 1 @age AS std_age REDUCE FIRST_VALUE 1 @age AS first_value_age REDUCE TOLIST 1 @age AS to_list_age REDUCE QUANTILE 2 @age 0.99 AS quantile_age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'region': 'us-west',\n", + " 'count_age': '2',\n", + " 'count_distinct_age': '2',\n", + " 'min_age': '34',\n", + " 'max_age': '61',\n", + " 'avg_age': '47.5',\n", + " 'std_age': '19.091883092',\n", + " 'first_value_age': '61',\n", + " 'to_list_age': ['34', '61'],\n", + " 'quantile_age': '61'},\n", + " {'region': 'us-central',\n", + " 'count_age': '2',\n", + " 'count_distinct_age': '2',\n", + " 'min_age': '24',\n", + " 'max_age': '54',\n", + " 'avg_age': '39',\n", + " 'std_age': '21.2132034356',\n", + " 'first_value_age': '24',\n", + " 'to_list_age': ['24', '54'],\n", + " 'quantile_age': '54'},\n", + " {'region': 'us-east',\n", + " 'count_age': '1',\n", + " 'count_distinct_age': '1',\n", + " 'min_age': '27',\n", + " 'max_age': '27',\n", + " 'avg_age': '27',\n", + " 'std_age': '0',\n", + " 'first_value_age': '27',\n", + " 'to_list_age': ['27'],\n", + " 'quantile_age': '27'}]" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_str = \"\"\"\n", + " SELECT\n", + " user,\n", + " COUNT(age) as count_age,\n", + " COUNT_DISTINCT(age) as count_distinct_age,\n", + " MIN(age) as min_age,\n", + " MAX(age) as max_age,\n", + " AVG(age) as avg_age,\n", + " STDEV(age) as std_age,\n", + " FIRST_VALUE(age) as first_value_age,\n", + " ARRAY_AGG(age) as to_list_age,\n", + " QUANTILE(age, 0.99) as 
quantile_age\n", + " FROM user_simple\n", + " GROUP BY region\n", + " \"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Vector search" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"*=>[KNN 10 @job_embedding $vector AS vector_distance]\" PARAMS 2 vector $vector DIALECT 2 RETURN 4 user job job_description vector_distance SORTBY vector_distance ASC\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'vector_distance': '0.823510587215',\n", + " 'user': 'bill',\n", + " 'job': 'engineer',\n", + " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.'},\n", + " {'vector_distance': '0.965160369873',\n", + " 'user': 'john',\n", + " 'job': 'software engineer',\n", + " 'job_description': 'Designs, develops, and maintains software applications and systems.'},\n", + " {'vector_distance': '1.00401353836',\n", + " 'user': 'mary',\n", + " 'job': 'doctor',\n", + " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.'},\n", + " {'vector_distance': '1.0062687397',\n", + " 'user': 'stacy',\n", + " 'job': 'project manager',\n", + " 'job_description': 'Plans, organizes, and oversees projects from inception to completion.'},\n", + " {'vector_distance': '1.01110625267',\n", + " 'user': 'joe',\n", + " 'job': 'dentist',\n", + " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.'}]" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_str = 
\"\"\"\n", + " SELECT user, job, job_description, cosine_distance(job_embedding, :vec) AS vector_distance\n", + " FROM user_simple\n", + " ORDER BY vector_distance ASC\n", + " \"\"\"\n", + "\n", + "vec = hf.embed(\"looking for someone to use base principles to solve problems\", as_buffer=True)\n", + "sql_query = SQLQuery(sql_str, params={\"vec\": vec})\n", + "\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"(@region:{us\\-central})=>[KNN 10 @job_embedding $vector AS vector_distance]\" PARAMS 2 vector $vector DIALECT 2 RETURN 3 user region vector_distance SORTBY vector_distance ASC\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'vector_distance': '0.823510587215', 'user': 'bill', 'region': 'us-central'},\n", + " {'vector_distance': '1.00401353836', 'user': 'mary', 'region': 'us-central'}]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sql_str = \"\"\"\n", + " SELECT user, region, cosine_distance(job_embedding, :vec) AS vector_distance\n", + " FROM user_simple\n", + " WHERE region = 'us-central'\n", + " ORDER BY vector_distance ASC\n", + " \"\"\"\n", + "\n", + "vec = hf.embed(\"looking for someone to use base principles to solve problems\", as_buffer=True)\n", + "sql_query = SQLQuery(sql_str, params={\"vec\": vec})\n", + "\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Geographic queries\n", + "\n", + 
"Use `geo_distance()` to filter by location or calculate distances between points.\n", + "\n", + "**Syntax:**\n", + "- Filter: `WHERE geo_distance(field, POINT(lon, lat), 'unit') < radius`\n", + "- Distance: `SELECT geo_distance(field, POINT(lon, lat)) AS distance`\n", + "\n", + "**Units:** `'km'` (kilometers), `'mi'` (miles), `'m'` (meters), `'ft'` (feet)\n", + "\n", + "**Note:** `POINT()` uses longitude first, then latitude - matching Redis conventions." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"*\" GEOFILTER office_location -122.4194 37.7749 500.0 km RETURN 4 user job region office_location\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'stacy',\n", + " 'job': 'project manager',\n", + " 'region': 'us-west',\n", + " 'office_location': '-122.4194,37.7749'},\n", + " {'user': 'john',\n", + " 'job': 'software engineer',\n", + " 'region': 'us-west',\n", + " 'office_location': '-122.4194,37.7749'}]" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Find users within 500km of San Francisco\n", + "sql_str = \"\"\"\n", + " SELECT user, job, region, office_location\n", + " FROM user_simple\n", + " WHERE geo_distance(office_location, POINT(-122.4194, 37.7749), 'km') < 500\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"*\" GEOFILTER office_location -87.6298 41.8781 50.0 mi RETURN 3 user job region\n" + ] + }, + { + "data": { + 
"text/plain": [ + "[{'user': 'mary', 'job': 'doctor', 'region': 'us-central'},\n", + " {'user': 'bill', 'job': 'engineer', 'region': 'us-central'}]" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Find users within 50 miles of Chicago (using miles)\n", + "sql_str = \"\"\"\n", + " SELECT user, job, region\n", + " FROM user_simple\n", + " WHERE geo_distance(office_location, POINT(-87.6298, 41.8781), 'mi') < 50\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job:{engineer}\" GEOFILTER office_location -87.6298 41.8781 50.0 mi RETURN 3 user job region\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'bill', 'job': 'engineer', 'region': 'us-central'}]" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Combine GEO filter with TAG filter - find engineers near Chicago\n", + "sql_str = \"\"\"\n", + " SELECT user, job, region\n", + " FROM user_simple\n", + " WHERE job = 'engineer' AND geo_distance(office_location, POINT(-87.6298, 41.8781), 'mi') < 50\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@age:[(30 +inf]\" 
GEOFILTER office_location -122.4194 37.7749 100.0 km RETURN 3 user job age\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'stacy', 'job': 'project manager', 'age': '61'},\n", + " {'user': 'john', 'job': 'software engineer', 'age': '34'}]" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Combine GEO with NUMERIC filter - find users over 30 near San Francisco\n", + "sql_str = \"\"\"\n", + " SELECT user, job, age\n", + " FROM user_simple\n", + " WHERE age > 30 AND geo_distance(office_location, POINT(-122.4194, 37.7749), 'km') < 100\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH user_simple \"@job_description:technical*\" GEOFILTER office_location -87.6298 41.8781 100.0 km RETURN 3 user job job_description\n" + ] + }, + { + "data": { + "text/plain": [ + "[{'user': 'bill',\n", + " 'job': 'engineer',\n", + " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.'}]" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Combine GEO with TEXT search - find users with \"technical\" in job description near Chicago\n", + "sql_str = \"\"\"\n", + " SELECT user, job, job_description\n", + " FROM user_simple\n", + " WHERE job_description LIKE 'technical%' AND geo_distance(office_location, POINT(-87.6298, 41.8781), 'km') < 100\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: 
\", redis_query)\n", + "results = index.query(sql_query)\n", + "results" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE user_simple \"*\" LOAD 3 @office_location @region @user APPLY geodistance(@office_location, -73.9857, 40.758) AS distance_meters\n", + "\n", + "Distances from NYC:\n", + " joe | us-east | 0 km\n", + " mary | us-central | 1,145 km\n", + " bill | us-central | 1,145 km\n", + " stacy | us-west | 4,131 km\n", + " john | us-west | 4,131 km\n" + ] + } + ], + "source": [ + "# Calculate distances from New York to all users\n", + "# Note: geo_distance() in SELECT uses FT.AGGREGATE and returns distance in meters\n", + "sql_str = \"\"\"\n", + " SELECT user, region, geo_distance(office_location, POINT(-73.9857, 40.7580)) AS distance_meters\n", + " FROM user_simple\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = index.query(sql_query)\n", + "\n", + "# Convert meters to km for readability and sort by distance\n", + "print(\"\\nDistances from NYC:\")\n", + "for r in sorted(results, key=lambda x: float(x.get('distance_meters', 0))):\n", + " dist_km = float(r.get('distance_meters', 0)) / 1000\n", + " print(f\" {r['user']:10} | {r['region']:12} | {dist_km:,.0f} km\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### GEO Query Summary\n", + "\n", + "| Method | Pattern | Example |\n", + "|--------|---------|---------|\n", + "| **SQL - Basic radius** | `WHERE geo_distance(field, POINT(lon, lat), 'unit') < radius` | `WHERE geo_distance(location, POINT(-122.4, 37.8), 'km') < 50` |\n", + "| **SQL - With miles** | Same with `'mi'` unit | `WHERE geo_distance(location, POINT(-73.9, 40.7), 'mi') < 10` |\n", + "| **SQL - With TAG** | 
Combined with `AND` | `WHERE category = 'retail' AND geo_distance(...) < 100` |\n", + "| **SQL - With NUMERIC** | Combined with `AND` | `WHERE age > 30 AND geo_distance(...) < 100` |\n", + "| **SQL - Distance calc** | `SELECT geo_distance(...)` | `SELECT geo_distance(location, POINT(lon, lat)) AS dist` |\n", + "| **Native - Within** | `Geo(field) == GeoRadius(...)` | `Geo(\"location\") == GeoRadius(-122.4, 37.8, 100, \"km\")` |\n", + "| **Native - Outside** | `Geo(field) != GeoRadius(...)` | `Geo(\"location\") != GeoRadius(-87.6, 41.9, 1000, \"km\")` |\n", + "| **Native - Combined** | Use `&` and `\\|` operators | `geo_filter & tag_filter & num_filter` |\n", + "\n", + "**Key Points:**\n", + "1. **Coordinate Format**: `\"longitude,latitude\"` - longitude first!\n", + "2. **POINT() Syntax**: `POINT(lon, lat)` - longitude first (matches Redis)\n", + "3. **Units**: `'km'`, `'mi'`, `'m'`, `'ft'`\n", + "4. **geo_distance() in SELECT**: Returns meters; divide by 1000 for km" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Date and datetime queries\n", + "\n", + "Use date literals and functions to query timestamp data. 
Redis stores dates as Unix timestamps in NUMERIC fields.\n", + "\n", + "**Key Concepts:**\n", + "- Date literals like `'2024-01-01'` are auto-converted to Unix timestamps\n", + "- Date functions (`YEAR()`, `MONTH()`, `DAY()`) extract date parts\n", + "- `DATE_FORMAT()` formats timestamps as readable strings" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Loaded 7 events:\n", + " - New Year Kickoff | 2024-01-01 | meeting\n", + " - Q1 Planning | 2024-01-15 | meeting\n", + " - Product Launch | 2024-02-20 | release\n", + " - Team Offsite | 2024-03-10 | meeting\n", + " - Summer Summit | 2024-07-15 | conference\n", + " - Holiday Party 2023 | 2023-12-15 | conference\n", + " - Year End Review 2023 | 2023-12-20 | meeting\n" + ] + } + ], + "source": [ + "# Create a separate index for date examples\n", + "from datetime import datetime, timezone\n", + "\n", + "def to_timestamp(date_str):\n", + " \"\"\"Convert ISO date string to Unix timestamp (UTC).\"\"\"\n", + " dt = datetime.strptime(date_str, \"%Y-%m-%d\")\n", + " dt = dt.replace(tzinfo=timezone.utc)\n", + " return int(dt.timestamp())\n", + "\n", + "# Define schema with NUMERIC fields for timestamps\n", + "events_schema = {\n", + " \"index\": {\n", + " \"name\": \"events\",\n", + " \"prefix\": \"event:\",\n", + " \"storage_type\": \"hash\",\n", + " },\n", + " \"fields\": [\n", + " {\"name\": \"name\", \"type\": \"text\", \"attrs\": {\"sortable\": True}},\n", + " {\"name\": \"category\", \"type\": \"tag\", \"attrs\": {\"sortable\": True}},\n", + " {\"name\": \"created_at\", \"type\": \"numeric\", \"attrs\": {\"sortable\": True}},\n", + " ],\n", + "}\n", + "\n", + "events_index = SearchIndex.from_dict(events_schema, redis_url=\"redis://localhost:6379\")\n", + "events_index.create(overwrite=True)\n", + "\n", + "# Sample events spanning 2023-2024\n", + "events = [\n", + " {\"name\": \"New Year Kickoff\", 
\"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-01-01\")},\n", + " {\"name\": \"Q1 Planning\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-01-15\")},\n", + " {\"name\": \"Product Launch\", \"category\": \"release\", \"created_at\": to_timestamp(\"2024-02-20\")},\n", + " {\"name\": \"Team Offsite\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-03-10\")},\n", + " {\"name\": \"Summer Summit\", \"category\": \"conference\", \"created_at\": to_timestamp(\"2024-07-15\")},\n", + " {\"name\": \"Holiday Party 2023\", \"category\": \"conference\", \"created_at\": to_timestamp(\"2023-12-15\")},\n", + " {\"name\": \"Year End Review 2023\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2023-12-20\")},\n", + "]\n", + "\n", + "events_index.load(events)\n", + "\n", + "print(f\"Loaded {len(events)} events:\")\n", + "for e in events:\n", + " date = datetime.fromtimestamp(e[\"created_at\"], tz=timezone.utc).strftime(\"%Y-%m-%d\")\n", + " print(f\" - {e['name']:25} | {date} | {e['category']}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH events \"@created_at:[(1704067200 +inf]\" RETURN 2 name category\n", + "\n", + "Events after 2024-01-01 (4 found):\n", + " - Summer Summit\n", + " - Q1 Planning\n", + " - Team Offsite\n", + " - Product Launch\n" + ] + } + ], + "source": [ + "# Find events after January 1st, 2024 using date literal\n", + "sql_str = \"\"\"\n", + " SELECT name, category\n", + " FROM events\n", + " WHERE created_at > '2024-01-01'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(f\"\\nEvents after 2024-01-01 ({len(results)} found):\")\n", + "for 
r in results:\n", + " print(f\" - {r['name']}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.SEARCH events \"@created_at:[1704067200 1711843200]\" RETURN 2 name category\n", + "\n", + "Events in Q1 2024 (4 found):\n", + " - Q1 Planning (meeting)\n", + " - New Year Kickoff (meeting)\n", + " - Team Offsite (meeting)\n", + " - Product Launch (release)\n" + ] + } + ], + "source": [ + "# Find events in Q1 2024 using BETWEEN\n", + "sql_str = \"\"\"\n", + " SELECT name, category\n", + " FROM events\n", + " WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(f\"\\nEvents in Q1 2024 ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['name']} ({r['category']})\")" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Meetings in H1 2024 (3 found):\n", + " - Q1 Planning\n", + " - New Year Kickoff\n", + " - Team Offsite\n" + ] + } + ], + "source": [ + "# Combine date filter with TAG filter - find meetings in H1 2024\n", + "sql_str = \"\"\"\n", + " SELECT name\n", + " FROM events\n", + " WHERE category = 'meeting' AND created_at BETWEEN '2024-01-01' AND '2024-06-30'\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(f\"Meetings in H1 2024 ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['name']}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Date Query Summary\n", + "\n", + "| Pattern | Example |\n", + 
"|---------|---------|\n", + "| **After date** | `WHERE created_at > '2024-01-01'` |\n", + "| **Before date** | `WHERE created_at < '2024-12-31'` |\n", + "| **Date range** | `WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'` |\n", + "| **Extract year** | `SELECT YEAR(created_at) AS year` |\n", + "| **Extract month** | `SELECT MONTH(created_at) AS month` (returns 0-11) |\n", + "| **Filter by year** | `WHERE YEAR(created_at) = 2024` |\n", + "| **Group by date** | `GROUP BY YEAR(created_at)` |\n", + "| **Format date** | `DATE_FORMAT(created_at, '%Y-%m-%d')` |\n", + "\n", + "**Key Points:**\n", + "1. **Storage**: Dates stored as Unix timestamps in NUMERIC fields\n", + "2. **Date Literals**: ISO 8601 strings auto-converted to timestamps\n", + "3. **Timezone**: Dates without timezone are treated as UTC\n", + "4. **Month Index**: Redis `MONTH()` returns 0-11, not 1-12" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Async support\n", + "\n", + "SQL queries also work with `AsyncSearchIndex` for async applications:" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [], + "source": [ + "from redisvl.index import AsyncSearchIndex\n", + "from redisvl.query import SQLQuery\n", + "\n", + "# Create async index\n", + "async_index = AsyncSearchIndex.from_dict(schema, redis_url=\"redis://localhost:6379\")\n", + "\n", + "# Execute SQL query asynchronously\n", + "sql_query = SQLQuery(f\"SELECT user, age FROM {async_index.name} WHERE age > 30\")\n", + "results = await async_index.query(sql_query)\n", + "\n", + "# Cleanup\n", + "await async_index.disconnect()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Additional Query Examples\n", + "\n", + "The following sections provide more detailed examples for geographic and date queries." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "execution": { + "iopub.execute_input": "2026-02-16T15:20:20.243553Z", + "iopub.status.busy": "2026-02-16T15:20:20.243464Z", + "iopub.status.idle": "2026-02-16T15:20:20.245944Z", + "shell.execute_reply": "2026-02-16T15:20:20.245506Z" + } + }, + "source": [ + "### Native GEO filters\n", + "\n", + "As an alternative to SQL syntax, RedisVL provides native `Geo` and `GeoRadius` filter classes.\n", + "These can be combined with other filters using `&` (AND) and `|` (OR) operators." + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Filter expression: @office_location:[-87.6298 41.8781 100 km]\n", + "\n", + "Users within 100km of Chicago (2 found):\n", + " - mary (doctor) - us-central\n", + " - bill (engineer) - us-central\n" + ] + } + ], + "source": [ + "from redisvl.query import FilterQuery\n", + "from redisvl.query.filter import Geo, GeoRadius, Tag, Num\n", + "\n", + "# Find users within 100km of Chicago using native filters\n", + "geo_filter = Geo(\"office_location\") == GeoRadius(-87.6298, 41.8781, 100, \"km\")\n", + "\n", + "print(f\"Filter expression: {geo_filter}\\n\")\n", + "\n", + "query = FilterQuery(\n", + " filter_expression=geo_filter,\n", + " return_fields=[\"user\", \"job\", \"region\"]\n", + ")\n", + "\n", + "results = index.query(query)\n", + "print(f\"Users within 100km of Chicago ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['user']} ({r['job']}) - {r['region']}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Filter expression: (-@office_location:[-87.6298 41.8781 1000 km])\n", + "\n", + "Users OUTSIDE 1000km of Chicago (3 found):\n", + " - joe (us-east)\n", + " - stacy (us-west)\n", + " - john (us-west)\n" + ] + } + ], + "source": [ + "# Find 
users OUTSIDE 1000km of Chicago (using !=)\n", + "geo_filter_outside = Geo(\"office_location\") != GeoRadius(-87.6298, 41.8781, 1000, \"km\")\n", + "\n", + "print(f\"Filter expression: {geo_filter_outside}\\n\")\n", + "\n", + "query = FilterQuery(\n", + " filter_expression=geo_filter_outside,\n", + " return_fields=[\"user\", \"region\"]\n", + ")\n", + "\n", + "results = index.query(query)\n", + "print(f\"Users OUTSIDE 1000km of Chicago ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['user']} ({r['region']})\")" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Combined filter: ((@office_location:[-87.6298 41.8781 500 km] @job:{engineer}) @age:[(40 +inf])\n", + "\n", + "Engineers over 40 within 500km of Chicago (1 found):\n", + " - bill (age: 54) - us-central\n" + ] + } + ], + "source": [ + "# Combine GEO + TAG + NUMERIC filters\n", + "# Find engineers over 40 within 500km of Chicago\n", + "geo_filter = Geo(\"office_location\") == GeoRadius(-87.6298, 41.8781, 500, \"km\")\n", + "job_filter = Tag(\"job\") == \"engineer\"\n", + "age_filter = Num(\"age\") > 40\n", + "\n", + "combined_filter = geo_filter & job_filter & age_filter\n", + "\n", + "print(f\"Combined filter: {combined_filter}\\n\")\n", + "\n", + "query = FilterQuery(\n", + " filter_expression=combined_filter,\n", + " return_fields=[\"user\", \"job\", \"age\", \"region\"]\n", + ")\n", + "\n", + "results = index.query(query)\n", + "print(f\"Engineers over 40 within 500km of Chicago ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['user']} (age: {r['age']}) - {r['region']}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Additional Date Examples\n", + "\n", + "More advanced date query patterns including date function extraction and formatting." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY year(@created_at) AS year APPLY monthofyear(@created_at) AS month\n", + "\n", + "Events with year/month:\n", + " - Summer Summit | 2024-07\n", + " - Q1 Planning | 2024-01\n", + " - Year End Review 2023 | 2023-12\n", + " - New Year Kickoff | 2024-01\n", + " - Holiday Party 2023 | 2023-12\n", + " - Team Offsite | 2024-03\n", + " - Product Launch | 2024-02\n" + ] + } + ], + "source": [ + "# Extract YEAR and MONTH using date functions in SELECT\n", + "sql_str = \"\"\"\n", + " SELECT name, YEAR(created_at) AS year, MONTH(created_at) AS month\n", + " FROM events\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(f\"\\nEvents with year/month:\")\n", + "for r in results:\n", + " # Note: MONTH returns 0-11 in Redis (0=January)\n", + " month_num = int(r.get('month', 0)) + 1\n", + " print(f\" - {r['name']:25} | {r.get('year')}-{month_num:02d}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY year(@created_at) AS year_created_at FILTER @year_created_at == 2024\n", + "\n", + "Events in 2024 (5 found):\n", + " - Summer Summit\n", + " - Q1 Planning\n", + " - New Year Kickoff\n", + " - Team Offsite\n", + " - Product Launch\n" + ] + } + ], + "source": [ + "# Filter by YEAR using date function in WHERE\n", + "sql_str = \"\"\"\n", + " SELECT name\n", + " FROM events\n", + " WHERE YEAR(created_at) = 2024\n", + "\"\"\"\n", + "\n", + "sql_query = 
SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(f\"\\nEvents in 2024 ({len(results)} found):\")\n", + "for r in results:\n", + " print(f\" - {r['name']}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @year APPLY year(@created_at) AS year GROUPBY 1 @year REDUCE COUNT 0 AS event_count\n", + "\n", + "Events per year:\n", + " 2023: 2 events\n", + " 2024: 5 events\n" + ] + } + ], + "source": [ + "# Count events per year using GROUP BY\n", + "sql_str = \"\"\"\n", + " SELECT YEAR(created_at) AS year, COUNT(*) AS event_count\n", + " FROM events\n", + " GROUP BY year\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(\"\\nEvents per year:\")\n", + "# Results come back as strings, so convert to int to sort numerically\n", + "for r in sorted(results, key=lambda x: int(x.get('year', 0))):\n", + " print(f\" {r['year']}: {r['event_count']} events\")" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY timefmt(@created_at, \"%Y-%m-%d\") AS event_date\n", + "\n", + "Events with formatted dates:\n", + " - Summer Summit | 2024-07-15\n", + " - Q1 Planning | 2024-01-15\n", + " - Year End Review 2023 | 2023-12-20\n", + " - New Year Kickoff | 2024-01-01\n", + " - Holiday Party 2023 | 2023-12-15\n", + " - Team Offsite | 
2024-03-10\n", + " - Product Launch | 2024-02-20\n" + ] + } + ], + "source": [ + "# Format 
dates using DATE_FORMAT\n", + "sql_str = \"\"\"\n", + " SELECT name, DATE_FORMAT(created_at, '%Y-%m-%d') AS event_date\n", + " FROM events\n", + "\"\"\"\n", + "\n", + "sql_query = SQLQuery(sql_str)\n", + "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", + "print(\"Resulting redis query: \", redis_query)\n", + "results = events_index.query(sql_query)\n", + "\n", + "print(\"\\nEvents with formatted dates:\")\n", + "for r in results:\n", + " print(f\" - {r['name']:25} | {r.get('event_date', 'N/A')}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Next Steps\n", + "\n", + "Now that you understand SQL queries for Redis, explore these related guides:\n", + "\n", + "- [Use Advanced Query Types](11_advanced_queries.ipynb) - Learn about TextQuery, HybridQuery, and MultiVectorQuery\n", + "- [Query and Filter Data](02_complex_filtering.ipynb) - Apply filters using native RedisVL query syntax\n", + "- [Getting Started](01_getting_started.ipynb) - Review the basics of RedisVL indexes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cleanup\n", + "\n", + "To remove all data from Redis associated with the index, use the `.clear()` method. This leaves the index in place for future insertions or updates.\n", + "\n", + "To remove everything including the index, use `.delete()` which removes both the index and the underlying data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": {}, + "outputs": [], + "source": [ + "# Delete both indexes and all associated data\n", + "events_index.delete(drop=True)\n", + "index.delete(drop=True)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "redis-vl-python (3.11.9)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } }, - { - "data": { - "text/plain": [ - "[{'user': 'stacy', 'region': 'us-west', 'job': 'project manager', 'age': '61'},\n", - " {'user': 'john',\n", - " 'region': 'us-west',\n", - " 'job': 'software engineer',\n", - " 'age': '34'}]" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT user, region, job, age\n", - " FROM user_simple\n", - " WHERE age > 17 and region = 'us-west'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"((@region:{us\\-west})|(@region:{us\\-central}))\" RETURN 4 user region job age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'mary', 'region': 'us-central', 'job': 'doctor', 'age': '24'},\n", - " {'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", - " {'user': 'stacy', 'region': 'us-west', 'job': 'project manager', 'age': '61'},\n", - " {'user': 'john',\n", - " 'region': 'us-west',\n", - " 
'job': 'software engineer',\n", - " 'age': '34'}]" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT user, region, job, age\n", - " FROM user_simple\n", - " WHERE region = 'us-west' or region = 'us-central'\n", - " \"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job:{software engineer|engineer|pancake tester}\" RETURN 4 user region job age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'},\n", - " {'user': 'john',\n", - " 'region': 'us-west',\n", - " 'job': 'software engineer',\n", - " 'age': '34'}]" - ] - }, - "execution_count": 12, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# job is a tag field therefore this syntax works\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, age\n", - " FROM user_simple\n", - " WHERE job IN ('software engineer', 'engineer', 'pancake tester')\n", - " \"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Text searches\n", - "\n", - "See [the docs](https://redis.io/docs/latest/develop/ai/search-and-query/query/full-text/) for available text queries in Redis.\n", - "\n", - "For more on exact matching see 
[here](https://redis.io/docs/latest/develop/ai/search-and-query/query/exact-match/)" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:sci*\" RETURN 5 user region job job_description age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'bill',\n", - " 'region': 'us-central',\n", - " 'job': 'engineer',\n", - " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.',\n", - " 'age': '54'}]" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Prefix\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, job_description, age\n", - " FROM user_simple\n", - " WHERE job_description = 'sci*'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:*care\" RETURN 5 user region job job_description age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'mary',\n", - " 'region': 'us-central',\n", - " 'job': 'doctor',\n", - " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", - " 'age': '24'},\n", - " {'user': 'joe',\n", - " 'region': 'us-east',\n", - " 'job': 'dentist',\n", - " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", - " 'age': '27'}]" - ] - }, - "execution_count": 14, - "metadata": {}, - "output_type": "execute_result" - } - ], - 
"source": [ - "# Suffix\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, job_description, age\n", - " FROM user_simple\n", - " WHERE job_description = '*care'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:%diagnose%\" RETURN 5 user region job job_description age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'mary',\n", - " 'region': 'us-central',\n", - " 'job': 'doctor',\n", - " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.',\n", - " 'age': '24'}]" - ] - }, - "execution_count": 15, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Fuzzy\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, job_description, age\n", - " FROM user_simple\n", - " WHERE job_description = '%diagnose%'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:\"healthcare including\"\" RETURN 5 user region job job_description age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'joe',\n", - " 'region': 'us-east',\n", - " 'job': 'dentist',\n", - " 'job_description': 'Provides oral healthcare including diagnosing 
and treating teeth and gum issues.',\n", - " 'age': '27'}]" - ] - }, - "execution_count": 16, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Phrase no stop words\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, job_description, age\n", - " FROM user_simple\n", - " WHERE job_description = 'healthcare including'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:\"diagnosing treating\"\" RETURN 5 user region job job_description age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'joe',\n", - " 'region': 'us-east',\n", - " 'job': 'dentist',\n", - " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.',\n", - " 'age': '27'}]" - ] - }, - "execution_count": 17, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Phrase with stop words currently limitation of core Redis\n", - "sql_str = \"\"\"\n", - " SELECT user, region, job, job_description, age\n", - " FROM user_simple\n", - " WHERE job_description = 'diagnosing and treating'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 18, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@age:[40 60]\" RETURN 4 user region 
job age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'bill', 'region': 'us-central', 'job': 'engineer', 'age': '54'}]" - ] - }, - "execution_count": 18, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT user, region, job, age\n", - " FROM user_simple\n", - " WHERE age BETWEEN 40 and 60\n", - " \"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Aggregations\n", - "\n", - "See docs for redis supported reducer functions: [docs](https://redis.io/docs/latest/develop/ai/search-and-query/advanced-concepts/aggregations/#supported-groupby-reducers)." - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE user_simple \"*\" LOAD 3 @age @region @user GROUPBY 1 @region REDUCE COUNT 0 AS count_age REDUCE COUNT_DISTINCT 1 @age AS count_distinct_age REDUCE MIN 1 @age AS min_age REDUCE MAX 1 @age AS max_age REDUCE AVG 1 @age AS avg_age REDUCE STDDEV 1 @age AS std_age REDUCE FIRST_VALUE 1 @age AS fist_value_age REDUCE TOLIST 1 @age AS to_list_age REDUCE QUANTILE 2 @age 0.99 AS quantile_age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'region': 'us-west',\n", - " 'count_age': '2',\n", - " 'count_distinct_age': '2',\n", - " 'min_age': '34',\n", - " 'max_age': '61',\n", - " 'avg_age': '47.5',\n", - " 'std_age': '19.091883092',\n", - " 'fist_value_age': '61',\n", - " 'to_list_age': ['34', '61'],\n", - " 'quantile_age': '61'},\n", - " {'region': 'us-central',\n", - " 'count_age': '2',\n", - " 'count_distinct_age': '2',\n", - " 'min_age': '24',\n", - " 'max_age': '54',\n", - " 'avg_age': '39',\n", 
- " 'std_age': '21.2132034356',\n", - " 'fist_value_age': '24',\n", - " 'to_list_age': ['24', '54'],\n", - " 'quantile_age': '54'},\n", - " {'region': 'us-east',\n", - " 'count_age': '1',\n", - " 'count_distinct_age': '1',\n", - " 'min_age': '27',\n", - " 'max_age': '27',\n", - " 'avg_age': '27',\n", - " 'std_age': '0',\n", - " 'fist_value_age': '27',\n", - " 'to_list_age': ['27'],\n", - " 'quantile_age': '27'}]" - ] - }, - "execution_count": 19, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT\n", - " user,\n", - " COUNT(age) as count_age,\n", - " COUNT_DISTINCT(age) as count_distinct_age,\n", - " MIN(age) as min_age,\n", - " MAX(age) as max_age,\n", - " AVG(age) as avg_age,\n", - " STDEV(age) as std_age,\n", - " FIRST_VALUE(age) as fist_value_age,\n", - " ARRAY_AGG(age) as to_list_age,\n", - " QUANTILE(age, 0.99) as quantile_age\n", - " FROM user_simple\n", - " GROUP BY region\n", - " \"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Vector search" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"*=>[KNN 10 @job_embedding $vector AS vector_distance]\" PARAMS 2 vector $vector DIALECT 2 RETURN 4 user job job_description vector_distance SORTBY vector_distance ASC\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'vector_distance': '0.823510587215',\n", - " 'user': 'bill',\n", - " 'job': 'engineer',\n", - " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.'},\n", - " {'vector_distance': '0.965160369873',\n", - " 'user': 'john',\n", 
- " 'job': 'software engineer',\n", - " 'job_description': 'Designs, develops, and maintains software applications and systems.'},\n", - " {'vector_distance': '1.00401353836',\n", - " 'user': 'mary',\n", - " 'job': 'doctor',\n", - " 'job_description': 'Diagnoses and treats illnesses, injuries, and other medical conditions in the healthcare field.'},\n", - " {'vector_distance': '1.0062687397',\n", - " 'user': 'stacy',\n", - " 'job': 'project manager',\n", - " 'job_description': 'Plans, organizes, and oversees projects from inception to completion.'},\n", - " {'vector_distance': '1.01110625267',\n", - " 'user': 'joe',\n", - " 'job': 'dentist',\n", - " 'job_description': 'Provides oral healthcare including diagnosing and treating teeth and gum issues.'}]" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT user, job, job_description, cosine_distance(job_embedding, :vec) AS vector_distance\n", - " FROM user_simple\n", - " ORDER BY vector_distance ASC\n", - " \"\"\"\n", - "\n", - "vec = hf.embed(\"looking for someone to use base principles to solve problems\", as_buffer=True)\n", - "sql_query = SQLQuery(sql_str, params={\"vec\": vec})\n", - "\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"(@region:{us\\-central})=>[KNN 10 @job_embedding $vector AS vector_distance]\" PARAMS 2 vector $vector DIALECT 2 RETURN 3 user region vector_distance SORTBY vector_distance ASC\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'vector_distance': '0.823510587215', 'user': 'bill', 'region': 'us-central'},\n", - " {'vector_distance': 
'1.00401353836', 'user': 'mary', 'region': 'us-central'}]" - ] - }, - "execution_count": 21, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sql_str = \"\"\"\n", - " SELECT user, region, cosine_distance(job_embedding, :vec) AS vector_distance\n", - " FROM user_simple\n", - " WHERE region = 'us-central'\n", - " ORDER BY vector_distance ASC\n", - " \"\"\"\n", - "\n", - "vec = hf.embed(\"looking for someone to use base principles to solve problems\", as_buffer=True)\n", - "sql_query = SQLQuery(sql_str, params={\"vec\": vec})\n", - "\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "\n", - "results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Geographic queries\n", - "\n", - "Use `geo_distance()` to filter by location or calculate distances between points.\n", - "\n", - "**Syntax:**\n", - "- Filter: `WHERE geo_distance(field, POINT(lon, lat), 'unit') < radius`\n", - "- Distance: `SELECT geo_distance(field, POINT(lon, lat)) AS distance`\n", - "\n", - "**Units:** `'km'` (kilometers), `'mi'` (miles), `'m'` (meters), `'ft'` (feet)\n", - "\n", - "**Note:** `POINT()` uses longitude first, then latitude - matching Redis conventions." 
- ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"*\" GEOFILTER office_location -122.4194 37.7749 500.0 km RETURN 4 user job region office_location\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'stacy',\n", - " 'job': 'project manager',\n", - " 'region': 'us-west',\n", - " 'office_location': '-122.4194,37.7749'},\n", - " {'user': 'john',\n", - " 'job': 'software engineer',\n", - " 'region': 'us-west',\n", - " 'office_location': '-122.4194,37.7749'}]" - ] - }, - "execution_count": 22, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Find users within 500km of San Francisco\n", - "sql_str = \"\"\"\n", - " SELECT user, job, region, office_location\n", - " FROM user_simple\n", - " WHERE geo_distance(office_location, POINT(-122.4194, 37.7749), 'km') < 500\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"*\" GEOFILTER office_location -87.6298 41.8781 50.0 mi RETURN 3 user job region\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'mary', 'job': 'doctor', 'region': 'us-central'},\n", - " {'user': 'bill', 'job': 'engineer', 'region': 'us-central'}]" - ] - }, - "execution_count": 23, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Find users within 50 miles of Chicago (using miles)\n", - "sql_str = \"\"\"\n", - " SELECT user, job, region\n", - " FROM user_simple\n", - " WHERE geo_distance(office_location, POINT(-87.6298, 
41.8781), 'mi') < 50\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job:{engineer}\" GEOFILTER office_location -87.6298 41.8781 50.0 mi RETURN 3 user job region\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'bill', 'job': 'engineer', 'region': 'us-central'}]" - ] - }, - "execution_count": 24, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Combine GEO filter with TAG filter - find engineers near Chicago\n", - "sql_str = \"\"\"\n", - " SELECT user, job, region\n", - " FROM user_simple\n", - " WHERE job = 'engineer' AND geo_distance(office_location, POINT(-87.6298, 41.8781), 'mi') < 50\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@age:[(30 +inf]\" GEOFILTER office_location -122.4194 37.7749 100.0 km RETURN 3 user job age\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'stacy', 'job': 'project manager', 'age': '61'},\n", - " {'user': 'john', 'job': 'software engineer', 'age': '34'}]" - ] - }, - "execution_count": 25, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Combine GEO with NUMERIC filter - find users over 30 near San Francisco\n", - "sql_str = 
\"\"\"\n", - " SELECT user, job, age\n", - " FROM user_simple\n", - " WHERE age > 30 AND geo_distance(office_location, POINT(-122.4194, 37.7749), 'km') < 100\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH user_simple \"@job_description:technical*\" GEOFILTER office_location -87.6298 41.8781 100.0 km RETURN 3 user job job_description\n" - ] - }, - { - "data": { - "text/plain": [ - "[{'user': 'bill',\n", - " 'job': 'engineer',\n", - " 'job_description': 'Applies scientific and mathematical principles to solve technical problems.'}]" - ] - }, - "execution_count": 26, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Combine GEO with TEXT search - find users with \"technical\" in job description near Chicago\n", - "sql_str = \"\"\"\n", - " SELECT user, job, job_description\n", - " FROM user_simple\n", - " WHERE job_description = 'technical*' AND geo_distance(office_location, POINT(-87.6298, 41.8781), 'km') < 100\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "results" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE user_simple \"*\" LOAD 3 @office_location @region @user APPLY geodistance(@office_location, -73.9857, 40.758) AS distance_meters\n", - "\n", - "Distances from NYC:\n", - " joe | us-east | 0 
km\n", - " mary | us-central | 1,145 km\n", - " bill | us-central | 1,145 km\n", - " stacy | us-west | 4,131 km\n", - " john | us-west | 4,131 km\n" - ] - } - ], - "source": [ - "# Calculate distances from New York to all users\n", - "# Note: geo_distance() in SELECT uses FT.AGGREGATE and returns distance in meters\n", - "sql_str = \"\"\"\n", - " SELECT user, region, geo_distance(office_location, POINT(-73.9857, 40.7580)) AS distance_meters\n", - " FROM user_simple\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = index.query(sql_query)\n", - "\n", - "# Convert meters to km for readability and sort by distance\n", - "print(\"\\nDistances from NYC:\")\n", - "for r in sorted(results, key=lambda x: float(x.get('distance_meters', 0))):\n", - " dist_km = float(r.get('distance_meters', 0)) / 1000\n", - " print(f\" {r['user']:10} | {r['region']:12} | {dist_km:,.0f} km\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### GEO Query Summary\n", - "\n", - "| Method | Pattern | Example |\n", - "|--------|---------|---------|\n", - "| **SQL - Basic radius** | `WHERE geo_distance(field, POINT(lon, lat), 'unit') < radius` | `WHERE geo_distance(location, POINT(-122.4, 37.8), 'km') < 50` |\n", - "| **SQL - With miles** | Same with `'mi'` unit | `WHERE geo_distance(location, POINT(-73.9, 40.7), 'mi') < 10` |\n", - "| **SQL - With TAG** | Combined with `AND` | `WHERE category = 'retail' AND geo_distance(...) < 100` |\n", - "| **SQL - With NUMERIC** | Combined with `AND` | `WHERE age > 30 AND geo_distance(...) 
< 100` |\n", - "| **SQL - Distance calc** | `SELECT geo_distance(...)` | `SELECT geo_distance(location, POINT(lon, lat)) AS dist` |\n", - "| **Native - Within** | `Geo(field) == GeoRadius(...)` | `Geo(\"location\") == GeoRadius(-122.4, 37.8, 100, \"km\")` |\n", - "| **Native - Outside** | `Geo(field) != GeoRadius(...)` | `Geo(\"location\") != GeoRadius(-87.6, 41.9, 1000, \"km\")` |\n", - "| **Native - Combined** | Use `&` and `\\|` operators | `geo_filter & tag_filter & num_filter` |\n", - "\n", - "**Key Points:**\n", - "1. **Coordinate Format**: `\"longitude,latitude\"` - longitude first!\n", - "2. **POINT() Syntax**: `POINT(lon, lat)` - longitude first (matches Redis)\n", - "3. **Units**: `'km'`, `'mi'`, `'m'`, `'ft'`\n", - "4. **geo_distance()**: Returns meters, divide by 1000 for km" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Date and datetime queries\n", - "\n", - "Use date literals and functions to query timestamp data. Redis stores dates as Unix timestamps in NUMERIC fields.\n", - "\n", - "**Key Concepts:**\n", - "- Date literals like `'2024-01-01'` are auto-converted to Unix timestamps\n", - "- Date functions (`YEAR()`, `MONTH()`, `DAY()`) extract date parts\n", - "- `DATE_FORMAT()` formats timestamps as readable strings" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Loaded 7 events:\n", - " - New Year Kickoff | 2024-01-01 | meeting\n", - " - Q1 Planning | 2024-01-15 | meeting\n", - " - Product Launch | 2024-02-20 | release\n", - " - Team Offsite | 2024-03-10 | meeting\n", - " - Summer Summit | 2024-07-15 | conference\n", - " - Holiday Party 2023 | 2023-12-15 | conference\n", - " - Year End Review 2023 | 2023-12-20 | meeting\n" - ] - } - ], - "source": [ - "# Create a separate index for date examples\n", - "from datetime import datetime, timezone\n", - "\n", - "def to_timestamp(date_str):\n", - " \"\"\"Convert 
ISO date string to Unix timestamp (UTC).\"\"\"\n", - " dt = datetime.strptime(date_str, \"%Y-%m-%d\")\n", - " dt = dt.replace(tzinfo=timezone.utc)\n", - " return int(dt.timestamp())\n", - "\n", - "# Define schema with NUMERIC fields for timestamps\n", - "events_schema = {\n", - " \"index\": {\n", - " \"name\": \"events\",\n", - " \"prefix\": \"event:\",\n", - " \"storage_type\": \"hash\",\n", - " },\n", - " \"fields\": [\n", - " {\"name\": \"name\", \"type\": \"text\", \"attrs\": {\"sortable\": True}},\n", - " {\"name\": \"category\", \"type\": \"tag\", \"attrs\": {\"sortable\": True}},\n", - " {\"name\": \"created_at\", \"type\": \"numeric\", \"attrs\": {\"sortable\": True}},\n", - " ],\n", - "}\n", - "\n", - "events_index = SearchIndex.from_dict(events_schema, redis_url=\"redis://localhost:6379\")\n", - "events_index.create(overwrite=True)\n", - "\n", - "# Sample events spanning 2023-2024\n", - "events = [\n", - " {\"name\": \"New Year Kickoff\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-01-01\")},\n", - " {\"name\": \"Q1 Planning\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-01-15\")},\n", - " {\"name\": \"Product Launch\", \"category\": \"release\", \"created_at\": to_timestamp(\"2024-02-20\")},\n", - " {\"name\": \"Team Offsite\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2024-03-10\")},\n", - " {\"name\": \"Summer Summit\", \"category\": \"conference\", \"created_at\": to_timestamp(\"2024-07-15\")},\n", - " {\"name\": \"Holiday Party 2023\", \"category\": \"conference\", \"created_at\": to_timestamp(\"2023-12-15\")},\n", - " {\"name\": \"Year End Review 2023\", \"category\": \"meeting\", \"created_at\": to_timestamp(\"2023-12-20\")},\n", - "]\n", - "\n", - "events_index.load(events)\n", - "\n", - "print(f\"Loaded {len(events)} events:\")\n", - "for e in events:\n", - " date = datetime.fromtimestamp(e[\"created_at\"], tz=timezone.utc).strftime(\"%Y-%m-%d\")\n", - " print(f\" - {e['name']:25} | {date} | 
{e['category']}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 29, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH events \"@created_at:[(1704067200 +inf]\" RETURN 2 name category\n", - "\n", - "Events after 2024-01-01 (4 found):\n", - " - Summer Summit\n", - " - Q1 Planning\n", - " - Team Offsite\n", - " - Product Launch\n" - ] - } - ], - "source": [ - "# Find events after January 1st, 2024 using date literal\n", - "sql_str = \"\"\"\n", - " SELECT name, category\n", - " FROM events\n", - " WHERE created_at > '2024-01-01'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(f\"\\nEvents after 2024-01-01 ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['name']}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 30, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.SEARCH events \"@created_at:[1704067200 1711843200]\" RETURN 2 name category\n", - "\n", - "Events in Q1 2024 (4 found):\n", - " - Q1 Planning (meeting)\n", - " - New Year Kickoff (meeting)\n", - " - Team Offsite (meeting)\n", - " - Product Launch (release)\n" - ] - } - ], - "source": [ - "# Find events in Q1 2024 using BETWEEN\n", - "sql_str = \"\"\"\n", - " SELECT name, category\n", - " FROM events\n", - " WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(f\"\\nEvents in Q1 2024 ({len(results)} found):\")\n", - "for 
r in results:\n", - " print(f\" - {r['name']} ({r['category']})\")" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Meetings in H1 2024 (3 found):\n", - " - Q1 Planning\n", - " - New Year Kickoff\n", - " - Team Offsite\n" - ] - } - ], - "source": [ - "# Combine date filter with TAG filter - find meetings in H1 2024\n", - "sql_str = \"\"\"\n", - " SELECT name\n", - " FROM events\n", - " WHERE category = 'meeting' AND created_at BETWEEN '2024-01-01' AND '2024-06-30'\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(f\"Meetings in H1 2024 ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['name']}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Date Query Summary\n", - "\n", - "| Pattern | Example |\n", - "|---------|---------|\n", - "| **After date** | `WHERE created_at > '2024-01-01'` |\n", - "| **Before date** | `WHERE created_at < '2024-12-31'` |\n", - "| **Date range** | `WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'` |\n", - "| **Extract year** | `SELECT YEAR(created_at) AS year` |\n", - "| **Extract month** | `SELECT MONTH(created_at) AS month` (returns 0-11) |\n", - "| **Filter by year** | `WHERE YEAR(created_at) = 2024` |\n", - "| **Group by date** | `GROUP BY YEAR(created_at)` |\n", - "| **Format date** | `DATE_FORMAT(created_at, '%Y-%m-%d')` |\n", - "\n", - "**Key Points:**\n", - "1. **Storage**: Dates stored as Unix timestamps in NUMERIC fields\n", - "2. **Date Literals**: ISO 8601 strings auto-converted to timestamps\n", - "3. **Timezone**: Dates without timezone are treated as UTC\n", - "4. 
**Month Index**: Redis `MONTH()` returns 0-11, not 1-12" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Async support\n", - "\n", - "SQL queries also work with `AsyncSearchIndex` for async applications:" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "metadata": {}, - "outputs": [], - "source": [ - "from redisvl.index import AsyncSearchIndex\n", - "from redisvl.query import SQLQuery\n", - "\n", - "# Create async index\n", - "async_index = AsyncSearchIndex.from_dict(schema, redis_url=\"redis://localhost:6379\")\n", - "\n", - "# Execute SQL query asynchronously\n", - "sql_query = SQLQuery(f\"SELECT user, age FROM {async_index.name} WHERE age > 30\")\n", - "results = await async_index.query(sql_query)\n", - "\n", - "# Cleanup\n", - "await async_index.disconnect()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Additional Query Examples\n", - "\n", - "The following sections provide more detailed examples for geographic and date queries." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "execution": { - "iopub.execute_input": "2026-02-16T15:20:20.243553Z", - "iopub.status.busy": "2026-02-16T15:20:20.243464Z", - "iopub.status.idle": "2026-02-16T15:20:20.245944Z", - "shell.execute_reply": "2026-02-16T15:20:20.245506Z" - } - }, - "source": [ - "### Native GEO filters\n", - "\n", - "As an alternative to SQL syntax, RedisVL provides native `Geo` and `GeoRadius` filter classes.\n", - "These can be combined with other filters using `&` (AND) and `|` (OR) operators." 
- ] - }, - { - "cell_type": "code", - "execution_count": 33, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Filter expression: @office_location:[-87.6298 41.8781 100 km]\n", - "\n", - "Users within 100km of Chicago (2 found):\n", - " - mary (doctor) - us-central\n", - " - bill (engineer) - us-central\n" - ] - } - ], - "source": [ - "from redisvl.query import FilterQuery\n", - "from redisvl.query.filter import Geo, GeoRadius, Tag, Num\n", - "\n", - "# Find users within 100km of Chicago using native filters\n", - "geo_filter = Geo(\"office_location\") == GeoRadius(-87.6298, 41.8781, 100, \"km\")\n", - "\n", - "print(f\"Filter expression: {geo_filter}\\n\")\n", - "\n", - "query = FilterQuery(\n", - " filter_expression=geo_filter,\n", - " return_fields=[\"user\", \"job\", \"region\"]\n", - ")\n", - "\n", - "results = index.query(query)\n", - "print(f\"Users within 100km of Chicago ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['user']} ({r['job']}) - {r['region']}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Filter expression: (-@office_location:[-87.6298 41.8781 1000 km])\n", - "\n", - "Users OUTSIDE 1000km of Chicago (3 found):\n", - " - joe (us-east)\n", - " - stacy (us-west)\n", - " - john (us-west)\n" - ] - } - ], - "source": [ - "# Find users OUTSIDE 1000km of Chicago (using !=)\n", - "geo_filter_outside = Geo(\"office_location\") != GeoRadius(-87.6298, 41.8781, 1000, \"km\")\n", - "\n", - "print(f\"Filter expression: {geo_filter_outside}\\n\")\n", - "\n", - "query = FilterQuery(\n", - " filter_expression=geo_filter_outside,\n", - " return_fields=[\"user\", \"region\"]\n", - ")\n", - "\n", - "results = index.query(query)\n", - "print(f\"Users OUTSIDE 1000km of Chicago ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['user']} 
({r['region']})\")" - ] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Combined filter: ((@office_location:[-87.6298 41.8781 500 km] @job:{engineer}) @age:[(40 +inf])\n", - "\n", - "Engineers over 40 within 500km of Chicago (1 found):\n", - " - bill (age: 54) - us-central\n" - ] - } - ], - "source": [ - "# Combine GEO + TAG + NUMERIC filters\n", - "# Find engineers over 40 within 500km of Chicago\n", - "geo_filter = Geo(\"office_location\") == GeoRadius(-87.6298, 41.8781, 500, \"km\")\n", - "job_filter = Tag(\"job\") == \"engineer\"\n", - "age_filter = Num(\"age\") > 40\n", - "\n", - "combined_filter = geo_filter & job_filter & age_filter\n", - "\n", - "print(f\"Combined filter: {combined_filter}\\n\")\n", - "\n", - "query = FilterQuery(\n", - " filter_expression=combined_filter,\n", - " return_fields=[\"user\", \"job\", \"age\", \"region\"]\n", - ")\n", - "\n", - "results = index.query(query)\n", - "print(f\"Engineers over 40 within 500km of Chicago ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['user']} (age: {r['age']}) - {r['region']}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Additional Date Examples\n", - "\n", - "More advanced date query patterns including date function extraction and formatting." 
- ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY year(@created_at) AS year APPLY monthofyear(@created_at) AS month\n", - "\n", - "Events with year/month:\n", - " - Summer Summit | 2024-07\n", - " - Q1 Planning | 2024-01\n", - " - Year End Review 2023 | 2023-12\n", - " - New Year Kickoff | 2024-01\n", - " - Holiday Party 2023 | 2023-12\n", - " - Team Offsite | 2024-03\n", - " - Product Launch | 2024-02\n" - ] - } - ], - "source": [ - "# Extract YEAR and MONTH using date functions in SELECT\n", - "sql_str = \"\"\"\n", - " SELECT name, YEAR(created_at) AS year, MONTH(created_at) AS month\n", - " FROM events\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(f\"\\nEvents with year/month:\")\n", - "for r in results:\n", - " # Note: MONTH returns 0-11 in Redis (0=January)\n", - " month_num = int(r.get('month', 0)) + 1\n", - " print(f\" - {r['name']:25} | {r.get('year')}-{month_num:02d}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY year(@created_at) AS year_created_at FILTER @year_created_at == 2024\n", - "\n", - "Events in 2024 (5 found):\n", - " - Summer Summit\n", - " - Q1 Planning\n", - " - New Year Kickoff\n", - " - Team Offsite\n", - " - Product Launch\n" - ] - } - ], - "source": [ - "# Filter by YEAR using date function in WHERE\n", - "sql_str = \"\"\"\n", - " SELECT name\n", - " FROM events\n", - " WHERE YEAR(created_at) = 2024\n", - "\"\"\"\n", - "\n", - "sql_query = 
SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(f\"\\nEvents in 2024 ({len(results)} found):\")\n", - "for r in results:\n", - " print(f\" - {r['name']}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 38, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @year APPLY year(@created_at) AS year GROUPBY 1 @year REDUCE COUNT 0 AS event_count\n", - "\n", - "Events per year:\n", - " 2023: 2 events\n", - " 2024: 5 events\n" - ] - } - ], - "source": [ - "# Count events per year using GROUP BY\n", - "sql_str = \"\"\"\n", - " SELECT YEAR(created_at) AS year, COUNT(*) AS event_count\n", - " FROM events\n", - " GROUP BY year\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(\"\\nEvents per year:\")\n", - "for r in sorted(results, key=lambda x: x.get('year', 0)):\n", - " print(f\" {r['year']}: {r['event_count']} events\")" - ] - }, - { - "cell_type": "code", - "execution_count": 39, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Resulting redis query: FT.AGGREGATE events \"*\" LOAD 2 @created_at @name APPLY timefmt(@created_at, \"%Y-%m-%d\") AS event_date\n", - "\n", - "Events with formatted dates:\n", - " - Summer Summit | 2024-07-15\n", - " - Q1 Planning | 2024-01-15\n", - " - Year End Review 2023 | 2023-12-20\n", - " - New Year Kickoff | 2024-01-01\n", - " - Holiday Party 2023 | 2023-12-15\n", - " - Team Offsite | 2024-03-10\n", - " - Product Launch | 2024-02-20\n" - ] - } - ], - "source": [ - "# Format 
dates using DATE_FORMAT\n", - "sql_str = \"\"\"\n", - " SELECT name, DATE_FORMAT(created_at, '%Y-%m-%d') AS event_date\n", - " FROM events\n", - "\"\"\"\n", - "\n", - "sql_query = SQLQuery(sql_str)\n", - "redis_query = sql_query.redis_query_string(redis_url=\"redis://localhost:6379\")\n", - "print(\"Resulting redis query: \", redis_query)\n", - "results = events_index.query(sql_query)\n", - "\n", - "print(\"\\nEvents with formatted dates:\")\n", - "for r in results:\n", - " print(f\" - {r['name']:25} | {r.get('event_date', 'N/A')}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Next Steps\n", - "\n", - "Now that you understand SQL queries for Redis, explore these related guides:\n", - "\n", - "- [Use Advanced Query Types](11_advanced_queries.ipynb) - Learn about TextQuery, HybridQuery, and MultiVectorQuery\n", - "- [Query and Filter Data](02_complex_filtering.ipynb) - Apply filters using native RedisVL query syntax\n", - "- [Getting Started](01_getting_started.ipynb) - Review the basics of RedisVL indexes" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Cleanup\n", - "\n", - "To remove all data from Redis associated with the index, use the `.clear()` method. This leaves the index in place for future insertions or updates.\n", - "\n", - "To remove everything including the index, use `.delete()` which removes both the index and the underlying data." 
- ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [], - "source": [ - "# Delete both indexes and all associated data\n", - "events_index.delete(drop=True)\n", - "index.delete(drop=True)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "nbformat": 4, + "nbformat_minor": 4 } \ No newline at end of file diff --git a/pyproject.toml b/pyproject.toml index 934ef7f0f..687c44658 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -54,7 +54,7 @@ pillow = [ "pillow>=11.3.0", ] sql-redis = [ - "sql-redis>=0.3.0", + "sql-redis>=0.4.0", ] all = [ "mistralai>=1.0.0", @@ -69,7 +69,7 @@ all = [ "boto3>=1.36.0,<2", "urllib3<2.2.0", "pillow>=11.3.0", - "sql-redis>=0.3.0", + "sql-redis>=0.4.0", ] [project.urls] diff --git a/redisvl/index/index.py b/redisvl/index/index.py index 649aa2156..9d4d48400 100644 --- a/redisvl/index/index.py +++ b/redisvl/index/index.py @@ -121,6 +121,19 @@ ] +def _get_sql_redis_options(sql_query: SQLQuery) -> Dict[str, Any]: + """Return normalized sql-redis executor options for a SQLQuery.""" + return { + "schema_cache_strategy": "lazy", + **getattr(sql_query, "sql_redis_options", {}), + } + + +def _sql_executor_cache_key(sql_redis_options: Dict[str, Any]) -> str: + """Build a stable cache key for sql-redis executor reuse.""" + return json.dumps(sql_redis_options, sort_keys=True, default=repr) + + def process_results( results: "Result", query: BaseQuery, schema: IndexSchema ) -> List[Dict[str, Any]]: @@ -480,6 +493,7 @@ def __init__( self._redis_url = redis_url self._connection_kwargs = connection_kwargs or {} self._lock = 
threading.Lock() + self._sql_executors: Dict[str, Any] = {} self._validated_client = kwargs.pop("_client_validated", False) self._owns_redis_client = redis_client is None @@ -488,6 +502,7 @@ def __init__( def disconnect(self): """Disconnect from the Redis database.""" + self.invalidate_sql_schema_cache() if self._owns_redis_client is False: logger.info("Index does not own client, not disconnecting") return @@ -587,6 +602,7 @@ def connect(self, redis_url: Optional[str] = None, **kwargs): through the `REDIS_URL` environment variable. ModuleNotFoundError: If required Redis modules are not installed. """ + self.invalidate_sql_schema_cache() self.__redis_client = RedisConnectionFactory.get_redis_connection( redis_url=redis_url, **kwargs ) @@ -607,6 +623,7 @@ def set_client(self, redis_client: SyncRedisClient, **kwargs): TypeError: If the provided client is not valid. """ RedisConnectionFactory.validate_sync_redis(redis_client) + self.invalidate_sql_schema_cache() self.__redis_client = redis_client return self @@ -685,6 +702,7 @@ def create(self, overwrite: bool = False, drop: bool = False) -> None: definition=definition, stopwords=stopwords, ) + self.invalidate_sql_schema_cache() except redis.exceptions.RedisError as e: raise RedisSearchError( f"Failed to create index '{self.name}' on Redis: {str(e)}" @@ -725,6 +743,7 @@ def delete(self, drop: bool = True): self._redis_client.execute_command(*cmd_args, target_nodes=target_nodes) else: self._redis_client.execute_command(*cmd_args) + self.invalidate_sql_schema_cache() except Exception as e: raise RedisSearchError(f"Error while deleting index: {str(e)}") from e @@ -784,8 +803,13 @@ def clear(self) -> int: else: break + self.invalidate_sql_schema_cache() return total_records_deleted + def invalidate_sql_schema_cache(self) -> None: + """Clear cached sql-redis executors and schema state for this index.""" + self._sql_executors.clear() + def drop_keys(self, keys: Union[str, List[str]]) -> int: """Remove a specific entry or 
entries from the index by it's key ID. @@ -951,18 +975,20 @@ def _sql_query(self, sql_query: SQLQuery) -> List[Dict[str, Any]]: ImportError: If sql-redis package is not installed. """ try: - from sql_redis.executor import Executor - from sql_redis.schema import SchemaRegistry + from sql_redis import create_executor except ImportError: raise ImportError( "sql-redis is required for SQL query support. " "Install it with: pip install redisvl[sql-redis]" ) - registry = SchemaRegistry(self._redis_client) - registry.load_all() # Loads index schemas from Redis - - executor = Executor(self._redis_client, registry) + sql_redis_options = _get_sql_redis_options(sql_query) + cache_key = _sql_executor_cache_key(sql_redis_options) + with self._lock: + executor = self._sql_executors.get(cache_key) + if executor is None: + executor = create_executor(self._redis_client, **sql_redis_options) + self._sql_executors[cache_key] = executor # Execute the query with any params result = executor.execute(sql_query.sql, params=sql_query.params) @@ -1381,6 +1407,7 @@ def __init__( self._redis_url = redis_url self._connection_kwargs = connection_kwargs or {} self._lock = asyncio.Lock() + self._sql_executors: Dict[str, Any] = {} self._validated_client = kwargs.pop("_client_validated", False) self._owns_redis_client = redis_client is None @@ -1445,6 +1472,7 @@ async def connect(self, redis_url: Optional[str] = None, **kwargs): "connect() is deprecated; pass connection parameters in __init__", DeprecationWarning, ) + self.invalidate_sql_schema_cache() client = await RedisConnectionFactory._get_aredis_connection( redis_url=redis_url, **kwargs ) @@ -1457,6 +1485,7 @@ async def set_client(self, redis_client: Union[AsyncRedisClient, SyncRedisClient This method is deprecated; please provide connection parameters in __init__. 
""" redis_client = await self._validate_client(redis_client) + self.invalidate_sql_schema_cache() await self.disconnect() async with self._lock: self._redis_client = redis_client @@ -1605,6 +1634,7 @@ async def create(self, overwrite: bool = False, drop: bool = False) -> None: definition=definition, stopwords=stopwords, ) + self.invalidate_sql_schema_cache() except redis.exceptions.RedisError as e: raise RedisSearchError( f"Failed to create index '{self.name}' on Redis: {str(e)}" @@ -1645,6 +1675,7 @@ async def delete(self, drop: bool = True): await client.execute_command(*cmd_args, target_nodes=target_nodes) else: await client.execute_command(*cmd_args) + self.invalidate_sql_schema_cache() except Exception as e: raise RedisSearchError(f"Error while deleting index: {str(e)}") from e @@ -1706,8 +1737,13 @@ async def clear(self) -> int: else: break + self.invalidate_sql_schema_cache() return total_records_deleted + def invalidate_sql_schema_cache(self) -> None: + """Clear cached sql-redis executors and schema state for this index.""" + self._sql_executors.clear() + async def drop_keys(self, keys: Union[str, List[str]]) -> int: """Remove a specific entry or entries from the index by it's key ID. @@ -2110,7 +2146,7 @@ async def _sql_query(self, sql_query: SQLQuery) -> List[Dict[str, Any]]: ImportError: If sql-redis package is not installed. """ try: - from sql_redis import AsyncExecutor, AsyncSchemaRegistry + from sql_redis import create_async_executor except ImportError: raise ImportError( "sql-redis is required for SQL query support. 
" @@ -2118,11 +2154,23 @@ async def _sql_query(self, sql_query: SQLQuery) -> List[Dict[str, Any]]: ) client = await self._get_client() - registry = AsyncSchemaRegistry(client) - await registry.load_all() # Loads index schemas from Redis asynchronously - - executor = AsyncExecutor(client, registry) + sql_redis_options = _get_sql_redis_options(sql_query) + cache_key = _sql_executor_cache_key(sql_redis_options) + + sql_executors_lock = getattr(self, "_sql_executors_lock", None) + if sql_executors_lock is None: + sql_executors_lock = asyncio.Lock() + existing_sql_executors_lock = getattr(self, "_sql_executors_lock", None) + if existing_sql_executors_lock is None: + self._sql_executors_lock = sql_executors_lock + else: + sql_executors_lock = existing_sql_executors_lock + async with sql_executors_lock: + executor = self._sql_executors.get(cache_key) + if executor is None: + executor = await create_async_executor(client, **sql_redis_options) + self._sql_executors[cache_key] = executor # Execute the query with any params asynchronously result = await executor.execute(sql_query.sql, params=sql_query.params) @@ -2250,6 +2298,7 @@ async def info(self, name: Optional[str] = None) -> Dict[str, Any]: return await self._info(index_name, client) async def disconnect(self): + self.invalidate_sql_schema_cache() if self._owns_redis_client is False: return if self._redis_client is not None: diff --git a/redisvl/query/sql.py b/redisvl/query/sql.py index 8cbba5f81..5d3bfe06d 100644 --- a/redisvl/query/sql.py +++ b/redisvl/query/sql.py @@ -10,6 +10,13 @@ class SQLQuery: This class allows users to write SQL SELECT statements that are automatically translated into Redis FT.SEARCH or FT.AGGREGATE commands. 
+ For TEXT fields with ``sql-redis >= 0.4.0``: + + - ``=`` performs exact phrase or exact-term matching + - ``LIKE`` performs wildcard/pattern matching using SQL ``%`` wildcards + - ``fuzzy(field, 'term')`` performs typo-tolerant matching + - ``fulltext(field, 'query')`` performs tokenized text search + .. code-block:: python from redisvl.query import SQLQuery @@ -30,16 +37,37 @@ class SQLQuery: ``pip install redisvl[sql-redis]`` """ - def __init__(self, sql: str, params: Optional[Dict[str, Any]] = None): + def __init__( + self, + sql: str, + params: Optional[Dict[str, Any]] = None, + *, + sql_redis_options: Optional[Dict[str, Any]] = None, + ): """Initialize a SQLQuery. Args: sql: The SQL SELECT statement to execute. params: Optional dictionary of parameters for parameterized queries. Useful for passing vector data for similarity searches. + sql_redis_options: Optional passthrough options forwarded to + ``sql-redis`` executor creation. Use this to tune how SQL + query translation loads and caches index schema metadata. + For example, ``{"schema_cache_strategy": "lazy"}`` loads + schemas on demand (the RedisVL default), while + ``{"schema_cache_strategy": "load_all"}`` eagerly loads + all schemas up front. These options exist to balance startup + cost vs repeated-query performance across many indexes. + + Note: + ``sql-redis >= 0.4.0`` uses explicit TEXT search operators. + Use ``=`` for exact phrase matching, ``LIKE`` for wildcard + matching, ``fuzzy()`` for typo-tolerant matching, and + ``fulltext()`` for tokenized search. """ self.sql = sql self.params = params or {} + self.sql_redis_options = dict(sql_redis_options or {}) def _substitute_params(self, sql: str, params: Dict[str, Any]) -> str: """Substitute parameter placeholders in SQL with actual values. 
@@ -131,8 +159,7 @@ def redis_query_string( # Output: FT.SEARCH products "@category:{electronics}" """ try: - from sql_redis.schema import SchemaRegistry - from sql_redis.translator import Translator + from sql_redis import create_executor except ImportError: raise ImportError( "sql-redis is required for SQL query support. " @@ -145,15 +172,14 @@ def redis_query_string( redis_client = Redis.from_url(redis_url) - # Load schemas from Redis - registry = SchemaRegistry(redis_client) - registry.load_all() - - # Translate SQL to Redis command - translator = Translator(registry) + sql_redis_options = { + "schema_cache_strategy": "lazy", + **self.sql_redis_options, + } + executor = create_executor(redis_client, **sql_redis_options) # Substitute non-bytes params in SQL before translation sql = self._substitute_params(self.sql, self.params) - translated = translator.translate(sql) + translated = executor._translator.translate(sql) return translated.to_command_string() diff --git a/tests/integration/test_sql_redis_hash.py b/tests/integration/test_sql_redis_hash.py index 245c1bada..8e7a12d1d 100644 --- a/tests/integration/test_sql_redis_hash.py +++ b/tests/integration/test_sql_redis_hash.py @@ -198,6 +198,70 @@ def test_select_specific_fields(self, sql_index): assert "title" in results[0] assert "price" in results[0] + def test_sql_query_defaults_to_lazy_schema_cache(self, sql_index): + """Default SQLQuery execution should only cache the referenced schema.""" + other_index = SearchIndex.from_dict( + { + "index": { + "name": f"sql_aux_{uuid.uuid4().hex[:8]}", + "prefix": f"sql_aux_{uuid.uuid4().hex[:8]}", + "storage_type": "hash", + }, + "fields": [{"name": "name", "type": "text"}], + }, + redis_client=sql_index._redis_client, + ) + other_index.create(overwrite=True) + + try: + sql_index.query(SQLQuery(f"SELECT title FROM {sql_index.name}")) + + assert len(sql_index._sql_executors) == 1 + executor = next(iter(sql_index._sql_executors.values())) + assert sql_index.name in 
executor._schema_registry._schemas + assert other_index.name not in executor._schema_registry._schemas + finally: + other_index.delete(drop=True) + + def test_sql_query_can_request_load_all_schema_cache(self, sql_index): + """SQLQuery should pass through eager schema cache configuration.""" + other_index = SearchIndex.from_dict( + { + "index": { + "name": f"sql_aux_{uuid.uuid4().hex[:8]}", + "prefix": f"sql_aux_{uuid.uuid4().hex[:8]}", + "storage_type": "hash", + }, + "fields": [{"name": "name", "type": "text"}], + }, + redis_client=sql_index._redis_client, + ) + other_index.create(overwrite=True) + + try: + sql_index.query( + SQLQuery( + f"SELECT title FROM {sql_index.name}", + sql_redis_options={"schema_cache_strategy": "load_all"}, + ) + ) + + executor = next(iter(sql_index._sql_executors.values())) + assert sql_index.name in executor._schema_registry._schemas + assert other_index.name in executor._schema_registry._schemas + finally: + other_index.delete(drop=True) + + def test_clear_invalidates_sql_schema_cache(self, sql_index): + """Index lifecycle operations should clear cached sql-redis executors.""" + sql_index.query(SQLQuery(f"SELECT title FROM {sql_index.name}")) + + assert sql_index._sql_executors + + sql_index.clear() + + assert not sql_index._sql_executors + def test_redis_query_string_with_client(self, sql_index): """Test redis_query_string() with redis_client returns the Redis command string.""" sql_query = SQLQuery( @@ -435,7 +499,7 @@ class TestSQLQueryTextOperators: """Tests for SQL text field operators.""" def test_text_equals(self, sql_index): - """Test text = operator (full-text search).""" + """Test text = operator for single-token TEXT matching.""" sql_query = SQLQuery( f""" SELECT title, name @@ -450,7 +514,7 @@ def test_text_equals(self, sql_index): assert "laptop" in result["title"].lower() def test_text_not_equals(self, sql_index): - """Test text != operator (negated full-text search).""" + """Test text != operator for negated 
single-token TEXT matching.""" sql_query = SQLQuery( f""" SELECT title, name @@ -466,12 +530,12 @@ def test_text_not_equals(self, sql_index): assert "laptop" not in result["title"].lower() def test_text_prefix(self, sql_index): - """Test text prefix search with wildcard (term*).""" + """Test text prefix search with LIKE pattern matching.""" sql_query = SQLQuery( f""" SELECT title, name FROM {sql_index.name} - WHERE title = 'lap*' + WHERE title LIKE 'lap%' """ ) results = sql_index.query(sql_query) @@ -482,12 +546,12 @@ def test_text_prefix(self, sql_index): assert "lap" in result["title"].lower() def test_text_suffix(self, sql_index): - """Test text suffix search with wildcard (*term).""" + """Test text suffix search with LIKE pattern matching.""" sql_query = SQLQuery( f""" SELECT title, name FROM {sql_index.name} - WHERE name = '*book' + WHERE name LIKE '%book' """ ) results = sql_index.query(sql_query) @@ -498,12 +562,12 @@ def test_text_suffix(self, sql_index): assert "book" in result["name"].lower() def test_text_fuzzy(self, sql_index): - """Test text fuzzy search with Levenshtein distance (%term%).""" + """Test text fuzzy search with fuzzy(field, value).""" sql_query = SQLQuery( f""" SELECT title, name FROM {sql_index.name} - WHERE title = '%laptap%' + WHERE fuzzy(title, 'laptap') """ ) results = sql_index.query(sql_query) @@ -513,6 +577,23 @@ def test_text_fuzzy(self, sql_index): # Should fuzzy match "laptop" even with typo "laptap" assert "laptop" in result["title"].lower() + def test_text_fulltext(self, sql_index): + """Test text tokenized search with fulltext(field, query).""" + sql_query = SQLQuery( + f""" + SELECT title, name + FROM {sql_index.name} + WHERE fulltext(title, 'laptop keyboard') + """ + ) + results = sql_index.query(sql_query) + + assert len(results) >= 1 + for result in results: + title_lower = result["title"].lower() + assert "laptop" in title_lower + assert "keyboard" in title_lower + def test_text_phrase(self, sql_index): """Test text 
phrase search (multi-word exact phrase)."""
         sql_query = SQLQuery(
@@ -1287,7 +1368,7 @@ def test_geo_distance_combined_with_text(self, geo_index):
             f"""
             SELECT name, location
             FROM {geo_index.name}
-            WHERE name = 'Downtown' AND geo_distance(location, POINT(-94.5786, 39.0997), 'mi') < 2000
+            WHERE name LIKE '%Downtown%' AND geo_distance(location, POINT(-94.5786, 39.0997), 'mi') < 2000
             """
         )
 
diff --git a/tests/integration/test_sql_redis_json.py b/tests/integration/test_sql_redis_json.py
index c9df30141..d2a10393a 100644
--- a/tests/integration/test_sql_redis_json.py
+++ b/tests/integration/test_sql_redis_json.py
@@ -444,7 +444,7 @@ class TestSQLQueryTextOperators:
     """Tests for SQL text field operators."""
 
     def test_text_equals(self, sql_index):
-        """Test text = operator (full-text search)."""
+        """Test text = operator for single-token TEXT matching."""
         sql_query = SQLQuery(
             f"""
             SELECT title, name
@@ -459,7 +459,7 @@ def test_text_equals(self, sql_index):
             assert "laptop" in result["title"].lower()
 
     def test_text_not_equals(self, sql_index):
-        """Test text != operator (negated full-text search)."""
+        """Test text != operator for negated single-token TEXT matching."""
         sql_query = SQLQuery(
             f"""
             SELECT title, name
@@ -475,12 +475,12 @@ def test_text_not_equals(self, sql_index):
             assert "laptop" not in result["title"].lower()
 
     def test_text_prefix(self, sql_index):
-        """Test text prefix search with wildcard (term*)."""
+        """Test text prefix search with LIKE pattern matching."""
         sql_query = SQLQuery(
             f"""
             SELECT title, name
             FROM {sql_index.name}
-            WHERE title = 'lap*'
+            WHERE title LIKE 'lap%'
             """
         )
         results = sql_index.query(sql_query)
@@ -491,12 +491,12 @@ def test_text_prefix(self, sql_index):
             assert "lap" in result["title"].lower()
 
     def test_text_suffix(self, sql_index):
-        """Test text suffix search with wildcard (*term)."""
+        """Test text suffix search with LIKE pattern matching."""
         sql_query = SQLQuery(
             f"""
             SELECT title, name
             FROM {sql_index.name}
-            WHERE name = '*book'
+            WHERE name LIKE '%book'
             """
         )
         results = sql_index.query(sql_query)
@@ -507,12 +507,12 @@ def test_text_suffix(self, sql_index):
             assert "book" in result["name"].lower()
 
     def test_text_fuzzy(self, sql_index):
-        """Test text fuzzy search with Levenshtein distance (%term%)."""
+        """Test text fuzzy search with fuzzy(field, value)."""
         sql_query = SQLQuery(
             f"""
             SELECT title, name
             FROM {sql_index.name}
-            WHERE title = '%laptap%'
+            WHERE fuzzy(title, 'laptap')
             """
         )
         results = sql_index.query(sql_query)
@@ -522,6 +522,23 @@ def test_text_fuzzy(self, sql_index):
             # Should fuzzy match "laptop" even with typo "laptap"
             assert "laptop" in result["title"].lower()
 
+    def test_text_fulltext(self, sql_index):
+        """Test text tokenized search with fulltext(field, query)."""
+        sql_query = SQLQuery(
+            f"""
+            SELECT title, name
+            FROM {sql_index.name}
+            WHERE fulltext(title, 'laptop keyboard')
+            """
+        )
+        results = sql_index.query(sql_query)
+
+        assert len(results) >= 1
+        for result in results:
+            title_lower = result["title"].lower()
+            assert "laptop" in title_lower
+            assert "keyboard" in title_lower
+
     def test_text_phrase(self, sql_index):
         """Test text phrase search (multi-word exact phrase)."""
         sql_query = SQLQuery(
@@ -1202,6 +1219,81 @@ async def test_async_sql_select(self, async_sql_index):
         assert "title" in results[0]
         assert "price" in results[0]
 
+    @pytest.mark.asyncio
+    async def test_async_sql_query_defaults_to_lazy_schema_cache(
+        self, async_sql_index, redis_url
+    ):
+        """Default async SQLQuery execution should cache only the referenced schema."""
+        other_index = AsyncSearchIndex.from_dict(
+            {
+                "index": {
+                    "name": f"async_sql_aux_{uuid.uuid4().hex[:8]}",
+                    "prefix": f"async_sql_aux_{uuid.uuid4().hex[:8]}",
+                    "storage_type": "json",
+                },
+                "fields": [{"name": "name", "type": "text"}],
+            },
+            redis_url=redis_url,
+        )
+        await other_index.create(overwrite=True)
+
+        try:
+            await async_sql_index.query(
+                SQLQuery(f"SELECT title FROM {async_sql_index.name}")
+            )
+
+            assert len(async_sql_index._sql_executors) == 1
+            executor = next(iter(async_sql_index._sql_executors.values()))
+            assert async_sql_index.name in executor._schema_registry._schemas
+            assert other_index.name not in executor._schema_registry._schemas
+        finally:
+            await other_index.delete(drop=True)
+
+    @pytest.mark.asyncio
+    async def test_async_sql_query_can_request_load_all_schema_cache(
+        self, async_sql_index, redis_url
+    ):
+        """Async SQLQuery should pass through eager schema cache configuration."""
+        other_index = AsyncSearchIndex.from_dict(
+            {
+                "index": {
+                    "name": f"async_sql_aux_{uuid.uuid4().hex[:8]}",
+                    "prefix": f"async_sql_aux_{uuid.uuid4().hex[:8]}",
+                    "storage_type": "json",
+                },
+                "fields": [{"name": "name", "type": "text"}],
+            },
+            redis_url=redis_url,
+        )
+        await other_index.create(overwrite=True)
+
+        try:
+            await async_sql_index.query(
+                SQLQuery(
+                    f"SELECT title FROM {async_sql_index.name}",
+                    sql_redis_options={"schema_cache_strategy": "load_all"},
+                )
+            )
+
+            executor = next(iter(async_sql_index._sql_executors.values()))
+            assert async_sql_index.name in executor._schema_registry._schemas
+            assert other_index.name in executor._schema_registry._schemas
+        finally:
+            await other_index.delete(drop=True)
+
+    @pytest.mark.asyncio
+    async def test_async_clear_invalidates_sql_schema_cache(self, async_sql_index):
+        """Async lifecycle operations should clear cached sql-redis executors."""
+        await async_sql_index.query(
+            SQLQuery(f"SELECT title FROM {async_sql_index.name}")
+        )
+
+        assert async_sql_index._sql_executors
+
+        await async_sql_index.clear()
+
+        assert not async_sql_index._sql_executors
+
     @pytest.mark.asyncio
     async def test_async_sql_aggregate(self, async_sql_index):
         """Test async COUNT(*) aggregation (FT.AGGREGATE code path)."""
diff --git a/uv.lock b/uv.lock
index d7ff82934..b52f1d7d0 100644
--- a/uv.lock
+++ b/uv.lock
@@ -1,5 +1,5 @@
 version = 1
-revision = 3
+revision = 2
 requires-python = ">=3.9.2, <3.15"
 resolution-markers = [
     "python_full_version >= '3.14'",
@@ -4901,8 +4901,8 @@ requires-dist = [
     { name = "redis", specifier = ">=5.0,<8.0" },
     { name = "sentence-transformers", marker = "extra == 'all'", specifier = ">=3.4.0,<4" },
     { name = "sentence-transformers", marker = "extra == 'sentence-transformers'", specifier = ">=3.4.0,<4" },
-    { name = "sql-redis", marker = "extra == 'all'", specifier = ">=0.3.0" },
-    { name = "sql-redis", marker = "extra == 'sql-redis'", specifier = ">=0.3.0" },
+    { name = "sql-redis", marker = "extra == 'all'", specifier = ">=0.4.0" },
+    { name = "sql-redis", marker = "extra == 'sql-redis'", specifier = ">=0.4.0" },
     { name = "tenacity", specifier = ">=8.2.2" },
     { name = "urllib3", marker = "extra == 'all'", specifier = "<2.2.0" },
     { name = "urllib3", marker = "extra == 'bedrock'", specifier = "<2.2.0" },
@@ -5930,16 +5930,16 @@ wheels = [
 
 [[package]]
 name = "sql-redis"
-version = "0.3.0"
+version = "0.4.0"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "redis", version = "7.0.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.10'" },
     { name = "redis", version = "7.1.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.10'" },
     { name = "sqlglot" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/75/7c/dc77d8fda301cfd9d1937472fbe6555ddce0322f1b4ca0eb18a5d9952b22/sql_redis-0.3.0.tar.gz", hash = "sha256:54e12e690c8a751d1379039d6d24e5b7697ea2283b4693f99fc0221928ff90d9", size = 127554, upload-time = "2026-03-16T17:00:28.784Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/70/2f/1a2c8e66af30af2d78375715f813ae62af834277ed806576c43f331b546e/sql_redis-0.4.0.tar.gz", hash = "sha256:c4cae163989a435c1b0b931dbd69f9783fb5593202b2ccdfbd5d62d95ecf425e", size = 158753, upload-time = "2026-04-06T16:26:34.852Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/8b/18/fbbe5f134cbb6be1901c0bb497e0491fa91c8b3aa4cada5d5c300e575212/sql_redis-0.3.0-py3-none-any.whl", hash = "sha256:e0569a65d50a4ecd79a46eba0a414f625d1edbaeb2f5a2b039ff5aac697b12c6", size = 29757, upload-time = "2026-03-16T17:00:27.5Z" },
+    { url = "https://files.pythonhosted.org/packages/2d/6d/f0c5f171dae8154afa3e59e5c219a286e18478f8127eb822cc62da94c560/sql_redis-0.4.0-py3-none-any.whl", hash = "sha256:c395c883f5e6e633178a362b0f1a748923c0816e68e10827ae5e44cad4b21654", size = 41689, upload-time = "2026-04-06T16:26:33.943Z" },
 ]
 
 [[package]]