From d6d2119298eff2f46d75e82f2c5c478596465e73 Mon Sep 17 00:00:00 2001
From: jiayi-wu_data <jiayi.wu@databricks.com>
Date: Mon, 29 Jun 2026 23:22:07 -0400
Subject: [PATCH] feat(vector-search): migrate to databricks-ai-search SDK, add
 filtering reference

Replace deprecated databricks-sdk WorkspaceClient vector_search_* APIs and
databricks-vectorsearch VectorSearchClient with AISearchClient from the
databricks-ai-search package across all skill files.

Key changes:
- SKILL.md: AISearchClient flat-param API, unified filters= parameter, add
  Installation section and filtering.md reference
- index-types.md: flat create_delta_sync_index/create_direct_access_index,
  index.upsert(list)/index.delete(primary_keys) instead of inputs_json/schema_json
- search-modes.md: index.similarity_search(), lowercase query_type values
  ("hybrid" not "HYBRID"), drop filters_json, add DatabricksReranker section
- end-to-end-rag.md: AISearchClient in agent example, unified filters= syntax
- troubleshooting-and-operations.md: client.get_endpoint/get_index,
  index.sync/describe, add Performance & Capacity section with SLA
  targets and debug_level
- filtering.md: new file with full operator reference for Standard (dict) and
  Storage-Optimized (SQL string) filter syntax

Co-authored-by: Isaac
---
 .../databricks-vector-search/SKILL.md         | 210 ++++++++----------
 .../end-to-end-rag.md                         |  42 ++--
 .../databricks-vector-search/filtering.md     | 160 +++++++++++++
 .../databricks-vector-search/index-types.md   | 137 +++++-------
 .../databricks-vector-search/search-modes.md  | 110 +++++----
 .../troubleshooting-and-operations.md         | 124 +++++++----
 6 files changed, 476 insertions(+), 307 deletions(-)
 create mode 100644 databricks-skills/databricks-vector-search/filtering.md
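
Reviewer note (not part of the commit): the query-side changes listed in the commit message are mechanical, so existing call sites can be translated with a small helper. The sketch below is illustrative only. `migrate_query_kwargs` is a hypothetical name that exists in neither SDK, and it assumes only what this patch states: `index_name` moves to `client.get_index()`, `filters_json` (a JSON string) becomes `filters` (a dict), and `query_type="HYBRID"` becomes `"hybrid"`.

```python
import json


def migrate_query_kwargs(old_kwargs):
    """Translate kwargs written for the old query_index() call into the
    shape this patch documents for index.similarity_search().

    Hypothetical helper for illustration; not part of any Databricks SDK.
    """
    new_kwargs = dict(old_kwargs)  # shallow copy, the caller's dict is untouched

    # The index is now resolved once via client.get_index(), so the name is
    # returned separately instead of being passed on every query.
    index_name = new_kwargs.pop("index_name", None)

    # filters_json (JSON string) becomes filters (plain dict).
    if "filters_json" in new_kwargs:
        new_kwargs["filters"] = json.loads(new_kwargs.pop("filters_json"))

    # Only the rename named in the commit message is applied; other
    # query_type values are passed through unchanged.
    if new_kwargs.get("query_type") == "HYBRID":
        new_kwargs["query_type"] = "hybrid"

    return index_name, new_kwargs


old_call = {
    "index_name": "catalog.schema.my_index",
    "columns": ["id", "content"],
    "query_text": "machine learning",
    "query_type": "HYBRID",
    "num_results": 10,
    "filters_json": '{"category": "ai", "status": ["active", "pending"]}',
}
index_name, new_call = migrate_query_kwargs(old_call)
print(index_name)
print(new_call)
```

The returned `index_name` would go to `client.get_index(...)` together with the endpoint name, and the remaining kwargs to `index.similarity_search(**new_call)`. SQL-string filters for Storage-Optimized endpoints already used `filters=` and need no translation.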

diff --git a/databricks-skills/databricks-vector-search/SKILL.md b/databricks-skills/databricks-vector-search/SKILL.md
index 72068ec5..445744b1 100644
--- a/databricks-skills/databricks-vector-search/SKILL.md
+++ b/databricks-skills/databricks-vector-search/SKILL.md
@@ -1,11 +1,11 @@
 ---
 name: databricks-vector-search
-description: "Patterns for Databricks Vector Search: create endpoints and indexes, query with filters, manage embeddings. Use when building RAG applications, semantic search, or similarity matching. Covers both storage-optimized and standard endpoints."
+description: "Patterns for Databricks AI Search (formerly Vector Search): create endpoints and indexes, query with filters, manage embeddings. Use when building RAG applications, semantic search, or similarity matching. Covers both storage-optimized and standard endpoints."
 ---
 
-# Databricks Vector Search
+# Databricks AI Search
 
-Patterns for creating, managing, and querying vector search indexes for RAG and semantic search applications.
+Patterns for creating, managing, and querying AI Search indexes for RAG and semantic search applications. Databricks AI Search was formerly known as Databricks Vector Search.
 
 ## When to Use
 
@@ -18,7 +18,7 @@ Use this skill when:
 
 ## Overview
 
-Databricks Vector Search provides managed vector similarity search with automatic embedding generation and Delta Lake integration.
+Databricks AI Search provides managed vector similarity search with automatic embedding generation and Delta Lake integration.
 
 | Component | Description |
 |-----------|-------------|
@@ -44,56 +44,54 @@ Databricks Vector Search provides managed vector similarity search with automati
 
 ## Quick Start
 
-### Create Endpoint
+### Installation
 
 ```python
-from databricks.sdk import WorkspaceClient
+%pip install databricks-ai-search
+dbutils.library.restartPython()
+from databricks.ai_search.client import AISearchClient
+```
+
+### Create Endpoint
 
-w = WorkspaceClient()
+```python
+client = AISearchClient()
 
-# Create a standard endpoint
-endpoint = w.vector_search_endpoints.create_endpoint(
+client.create_endpoint(
     name="my-vs-endpoint",
     endpoint_type="STANDARD"  # or "STORAGE_OPTIMIZED"
 )
-# Note: Endpoint creation is asynchronous; check status with get_endpoint()
+# Note: Endpoint creation is asynchronous; check status with client.get_endpoint()
 ```
 
 ### Create Delta Sync Index (Managed Embeddings)
 
 ```python
 # Source table must have: primary key column + text column
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.my_index",
+index = client.create_delta_sync_index(
     endpoint_name="my-vs-endpoint",
+    source_table_name="catalog.schema.documents",
+    index_name="catalog.schema.my_index",
+    pipeline_type="TRIGGERED",  # or "CONTINUOUS"
     primary_key="id",
-    index_type="DELTA_SYNC",
-    delta_sync_index_spec={
-        "source_table": "catalog.schema.documents",
-        "embedding_source_columns": [
-            {
-                "name": "content",  # Text column to embed
-                "embedding_model_endpoint_name": "databricks-gte-large-en"
-            }
-        ],
-        "pipeline_type": "TRIGGERED"  # or "CONTINUOUS"
-    }
+    embedding_source_column="content",
+    embedding_model_endpoint_name="databricks-gte-large-en"
 )
 ```
 
 ### Query Index
 
 ```python
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content", "metadata"],
+index = client.get_index(
+    endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.my_index"
+)
+
+results = index.similarity_search(
     query_text="What is machine learning?",
+    columns=["id", "content", "metadata"],
     num_results=5
 )
-
-for doc in results.result.data_array:
-    score = doc[-1]  # Similarity score is last column
-    print(f"Score: {score}, Content: {doc[1][:100]}...")
 ```
 
 ## Common Patterns
@@ -102,7 +100,7 @@ for doc in results.result.data_array:
 
 ```python
 # For large-scale, cost-effective deployments
-endpoint = w.vector_search_endpoints.create_endpoint(
+client.create_endpoint(
     name="my-storage-endpoint",
     endpoint_type="STORAGE_OPTIMIZED"
 )
@@ -112,72 +110,57 @@ endpoint = w.vector_search_endpoints.create_endpoint(
 
 ```python
 # Source table must have: primary key + embedding vector column
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.my_index",
+index = client.create_delta_sync_index(
     endpoint_name="my-vs-endpoint",
+    source_table_name="catalog.schema.documents",
+    index_name="catalog.schema.my_index",
+    pipeline_type="TRIGGERED",
     primary_key="id",
-    index_type="DELTA_SYNC",
-    delta_sync_index_spec={
-        "source_table": "catalog.schema.documents",
-        "embedding_vector_columns": [
-            {
-                "name": "embedding",  # Pre-computed embedding column
-                "embedding_dimension": 768
-            }
-        ],
-        "pipeline_type": "TRIGGERED"
-    }
+    embedding_dimension=768,
+    embedding_vector_column="embedding"
 )
 ```
 
 ### Direct Access Index
 
 ```python
-import json
-
 # Create index for manual CRUD
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.direct_index",
+index = client.create_direct_access_index(
     endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.direct_index",
     primary_key="id",
-    index_type="DIRECT_ACCESS",
-    direct_access_index_spec={
-        "embedding_vector_columns": [
-            {"name": "embedding", "embedding_dimension": 768}
-        ],
-        "schema_json": json.dumps({
-            "id": "string",
-            "text": "string",
-            "embedding": "array<float>",
-            "metadata": "string"
-        })
+    embedding_dimension=768,
+    embedding_vector_column="embedding",
+    schema={
+        "id": "string",
+        "text": "string",
+        "embedding": "array<float>",
+        "metadata": "string"
     }
 )
 
 # Upsert data
-w.vector_search_indexes.upsert_data_vector_index(
-    index_name="catalog.schema.direct_index",
-    inputs_json=json.dumps([
-        {"id": "1", "text": "Hello", "embedding": [0.1, 0.2, ...], "metadata": "doc1"},
-        {"id": "2", "text": "World", "embedding": [0.3, 0.4, ...], "metadata": "doc2"},
-    ])
-)
+index.upsert([
+    {"id": "1", "text": "Hello", "embedding": [0.1, 0.2, ...], "metadata": "doc1"},
+    {"id": "2", "text": "World", "embedding": [0.3, 0.4, ...], "metadata": "doc2"},
+])
 
 # Delete data
-w.vector_search_indexes.delete_data_vector_index(
-    index_name="catalog.schema.direct_index",
-    primary_keys=["1", "2"]
-)
+index.delete(primary_keys=["1", "2"])
 ```
 
 ### Query with Embedding Vector
 
 ```python
-# When you have pre-computed query embedding
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "text"],
+index = client.get_index(
+    endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.my_index"
+)
+
+# When you have a pre-computed query embedding
+results = index.similarity_search(
     query_vector=[0.1, 0.2, 0.3, ...],  # Your 768-dim vector
+    columns=["id", "text"],
     num_results=10
 )
 ```
@@ -188,11 +171,10 @@ Hybrid search combines vector similarity (ANN) with BM25 keyword scoring. Use it
 
 ```python
 # Combines vector similarity with keyword matching
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+results = index.similarity_search(
     query_text="SPARK-12345 executor memory error",
-    query_type="HYBRID",
+    query_type="hybrid",
+    columns=["id", "content"],
     num_results=10
 )
 ```
@@ -202,57 +184,53 @@ results = w.vector_search_indexes.query_index(
 ### Standard Endpoint Filters (Dictionary)
 
 ```python
-# filters_json uses dictionary format
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+# filters accepts a dict for standard endpoints
+results = index.similarity_search(
     query_text="machine learning",
+    columns=["id", "content"],
     num_results=10,
-    filters_json='{"category": "ai", "status": ["active", "pending"]}'
+    filters={"category": "ai", "status": ["active", "pending"]}
 )
 ```
 
 ### Storage-Optimized Filters (SQL-like)
 
-Storage-Optimized endpoints use SQL-like filter syntax via the `databricks-vectorsearch` package's `filters` parameter (accepts a string):
+Storage-Optimized endpoints use SQL-like filter syntax passed as a string to the `filters` parameter:
 
 ```python
-from databricks.vector_search.client import VectorSearchClient
-
-vsc = VectorSearchClient()
-index = vsc.get_index(endpoint_name="my-storage-endpoint", index_name="catalog.schema.my_index")
+index = client.get_index(
+    endpoint_name="my-storage-endpoint",
+    index_name="catalog.schema.my_index"
+)
 
-# SQL-like filter syntax for storage-optimized endpoints
 results = index.similarity_search(
     query_text="machine learning",
     columns=["id", "content"],
     num_results=10,
     filters="category = 'ai' AND status IN ('active', 'pending')"
 )
-
-# More filter examples
-# filters="price > 100 AND price < 500"
-# filters="department LIKE 'eng%'"
-# filters="created_at >= '2024-01-01'"
 ```
 
+See [filtering.md](filtering.md) for a full reference of operators, data types, and limitations per endpoint type.
+
 ### Trigger Index Sync
 
 ```python
-# For TRIGGERED pipeline type, manually sync
-w.vector_search_indexes.sync_index(
+index = client.get_index(
+    endpoint_name="my-vs-endpoint",
     index_name="catalog.schema.my_index"
 )
+index.sync()
 ```
 
 ### Scan All Index Entries
 
 ```python
-# Retrieve all vectors (for debugging/export)
-scan_result = w.vector_search_indexes.scan_index(
-    index_name="catalog.schema.my_index",
-    num_results=100
+index = client.get_index(
+    endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.my_index"
 )
+scan_result = index.scan(num_results=100)
 ```
 
 ## Reference Files
@@ -261,7 +239,8 @@ scan_result = w.vector_search_indexes.scan_index(
 |-------|------|-------------|
 | Index Types | [index-types.md](index-types.md) | Detailed comparison of Delta Sync (managed/self-managed) vs Direct Access |
 | End-to-End RAG | [end-to-end-rag.md](end-to-end-rag.md) | Complete walkthrough: source table → endpoint → index → query → agent integration |
-| Search Modes | [search-modes.md](search-modes.md) | When to use semantic (ANN) vs hybrid search, decision guide |
+| Search Modes | [search-modes.md](search-modes.md) | When to use semantic (ANN) vs hybrid search, reranker, decision guide |
+| Filtering | [filtering.md](filtering.md) | Filter operators by data type for Standard and Storage-Optimized endpoints |
 | Operations | [troubleshooting-and-operations.md](troubleshooting-and-operations.md) | Monitoring, cost optimization, capacity planning, migration |
 
 ## CLI Quick Reference
@@ -298,9 +277,9 @@ databricks vector-search indexes delete-index \
 |-------|----------|
 | **Index sync slow** | Use Storage-Optimized endpoints (20x faster indexing) |
 | **Query latency high** | Use Standard endpoint for <100ms latency |
-| **filters_json not working** | Storage-Optimized uses SQL-like string filters via `databricks-vectorsearch` package's `filters` parameter |
+| **Filters not working** | Standard endpoints use a dict: `filters={"col": "val"}`. Storage-Optimized endpoints use a SQL string: `filters="col = 'val'"`. See [filtering.md](filtering.md) |
 | **Embedding dimension mismatch** | Ensure query and index dimensions match |
-| **Index not updating** | Check pipeline_type; use sync_index() for TRIGGERED |
+| **Index not updating** | Check pipeline_type; call `index.sync()` for TRIGGERED |
 | **Out of capacity** | Upgrade to Storage-Optimized (1B+ vectors) |
 | **`query_vector` truncated by MCP tool** | MCP tool calls serialize arrays as JSON and can truncate large vectors (e.g. 1024-dim). Use `query_text` instead (for managed embedding indexes), or use the Databricks SDK/CLI to pass raw vectors |
 
@@ -315,17 +294,16 @@ Databricks provides built-in embedding models:
 
 ```python
 # Use with managed embeddings
-embedding_source_columns=[
-    {
-        "name": "content",
-        "embedding_model_endpoint_name": "databricks-gte-large-en"
-    }
-]
+index = client.create_delta_sync_index(
+    # ... endpoint, source table, index name, and primary key as in Quick Start
+    embedding_source_column="content",
+    embedding_model_endpoint_name="databricks-gte-large-en"
+)
 ```
 
 ## MCP Tools
 
-The following MCP tools are available for managing Vector Search infrastructure. For a full end-to-end walkthrough, see [end-to-end-rag.md](end-to-end-rag.md).
+The following MCP tools are available for managing AI Search infrastructure. For a full end-to-end walkthrough, see [end-to-end-rag.md](end-to-end-rag.md).
 
 ### manage_vs_endpoint - Endpoint Management
 
@@ -384,7 +362,7 @@ all_indexes = manage_vs_index(action="list")
 
 ### query_vs_index - Query (Hot Path)
 
-Query index with `query_text`, `query_vector`, or hybrid (`query_type="HYBRID"`). Prefer `query_text` over `query_vector` — MCP tool calls can truncate large embedding arrays (1024-dim).
+Query index with `query_text`, `query_vector`, or hybrid (`query_type="hybrid"`). Prefer `query_text` over `query_vector` — MCP tool calls can truncate large embedding arrays (1024-dim).
 
 ```python
 # Query an index
@@ -400,7 +378,7 @@ results = query_vs_index(
     index_name="catalog.schema.my_index",
     columns=["id", "content"],
     query_text="SPARK-12345 memory error",
-    query_type="HYBRID",
+    query_type="hybrid",
     num_results=10
 )
 ```
@@ -435,13 +413,13 @@ manage_vs_data(action="scan", index_name="catalog.schema.my_index", num_results=
 - **Delta Sync recommended** — easier than Direct Access for most scenarios
 - **Hybrid search** — available for both Delta Sync and Direct Access indexes
 - **`columns_to_sync` matters** — only synced columns are available in query results; include all columns you need
-- **Filter syntax differs by endpoint** — Standard uses dict-format filters, Storage-Optimized uses SQL-like string filters. Use the `databricks-vectorsearch` package's `filters` parameter which accepts both formats
-- **Management vs runtime** — MCP tools above handle lifecycle management; for agent tool-calling at runtime, use `VectorSearchRetrieverTool` or the Databricks managed Vector Search MCP server
+- **Filter syntax differs by endpoint** — Standard uses dict-format `filters`, Storage-Optimized uses SQL-like string `filters`. See [filtering.md](filtering.md)
+- **Management vs runtime** — MCP tools above handle lifecycle management; for agent tool-calling at runtime, use `VectorSearchRetrieverTool` or the Databricks managed AI Search MCP server
 
 ## Related Skills
 
 - **[databricks-model-serving](../databricks-model-serving/SKILL.md)** - Deploy agents that use VectorSearchRetrieverTool
 - **[databricks-agent-bricks](../databricks-agent-bricks/SKILL.md)** - Knowledge Assistants use RAG over indexed documents
-- **[databricks-unstructured-pdf-generation](../databricks-unstructured-pdf-generation/SKILL.md)** - Generate documents to index in Vector Search
+- **[databricks-unstructured-pdf-generation](../databricks-unstructured-pdf-generation/SKILL.md)** - Generate documents to index in AI Search
 - **[databricks-unity-catalog](../databricks-unity-catalog/SKILL.md)** - Manage the catalogs and tables that back Delta Sync indexes
-- **[databricks-spark-declarative-pipelines](../databricks-spark-declarative-pipelines/SKILL.md)** - Build Delta tables used as Vector Search sources
+- **[databricks-spark-declarative-pipelines](../databricks-spark-declarative-pipelines/SKILL.md)** - Build Delta tables used as AI Search sources
diff --git a/databricks-skills/databricks-vector-search/end-to-end-rag.md b/databricks-skills/databricks-vector-search/end-to-end-rag.md
index a3808d1b..4bd9913f 100644
--- a/databricks-skills/databricks-vector-search/end-to-end-rag.md
+++ b/databricks-skills/databricks-vector-search/end-to-end-rag.md
@@ -1,4 +1,4 @@
-# End-to-End RAG with Vector Search
+# End-to-End RAG with AI Search
 
 Build a complete Retrieval-Augmented Generation pipeline: prepare documents, create a vector index, query it, and wire it into an agent.
 
@@ -48,7 +48,7 @@ execute_sql(sql_query="""
 """)
 ```
 
-## Step 2: Create Vector Search Endpoint
+## Step 2: Create AI Search Endpoint
 
 ```python
 manage_vs_endpoint(
@@ -121,7 +121,7 @@ query_vs_index(
 The filter syntax depends on the endpoint type used when creating the index.
 
 ```python
-# Storage-Optimized endpoint (used in this walkthrough): SQL-like filter syntax
+# Storage-Optimized endpoint (used in this walkthrough): SQL-like string
 query_vs_index(
     index_name="catalog.schema.knowledge_base_index",
     columns=["doc_id", "title", "content"],
@@ -130,13 +130,13 @@ query_vs_index(
     filters="category = 'governance'"
 )
 
-# Standard endpoint (if you created a Standard endpoint instead): JSON filters_json
+# Standard endpoint: dict-format filters
 query_vs_index(
     index_name="catalog.schema.my_standard_index",
     columns=["doc_id", "title", "content"],
     query_text="How do I govern my data?",
     num_results=3,
-    filters_json='{"category": "governance"}'
+    filters={"category": "governance"}
 )
 ```
 
@@ -148,7 +148,7 @@ query_vs_index(
     columns=["doc_id", "title", "content"],
     query_text="Delta Lake ACID transactions",
     num_results=5,
-    query_type="HYBRID"
+    query_type="hybrid"
 )
 ```
 
@@ -158,43 +158,35 @@ query_vs_index(
 
 ### As a Tool in a ChatAgent
 
-Use `VectorSearchRetrieverTool` to wire the index into an agent deployed on Model Serving:
+Use `AISearchClient` to wire the index into an agent deployed on Model Serving:
 
 ```python
 from databricks.agents import ChatAgent
-from databricks.agents.tools import VectorSearchRetrieverTool
+from databricks.ai_search.client import AISearchClient
 from databricks.sdk import WorkspaceClient
 
-# Define the retriever tool
-retriever_tool = VectorSearchRetrieverTool(
-    index_name="catalog.schema.knowledge_base_index",
-    columns=["doc_id", "title", "content"],
-    num_results=3,
-)
-
 class RAGAgent(ChatAgent):
     def __init__(self):
+        self.search_client = AISearchClient()
         self.w = WorkspaceClient()
 
     def predict(self, messages, context=None):
         query = messages[-1].content
 
-        results = self.w.vector_search_indexes.query_index(
-            index_name="catalog.schema.knowledge_base_index",
-            columns=["title", "content"],
+        index = self.search_client.get_index(
+            endpoint_name="my-rag-endpoint",
+            index_name="catalog.schema.knowledge_base_index"
+        )
+        results = index.similarity_search(
             query_text=query,
+            columns=["title", "content"],
             num_results=3,
         )
 
-        context_docs = "\n\n".join(
-            f"**{row[0]}**: {row[1]}"
-            for row in results.result.data_array
-        )
-
         response = self.w.serving_endpoints.query(
             name="databricks-meta-llama-3-3-70b-instruct",
             messages=[
-                {"role": "system", "content": f"Answer using this context:\n{context_docs}"},
+                {"role": "system", "content": f"Answer using this context:\n{results}"},
                 {"role": "user", "content": query},
             ],
         )
@@ -238,4 +230,4 @@ Then sync — the index automatically handles deletions via Delta change data fe
 | **"Column not found in index"** | Column must be in `columns_to_sync`. Recreate index with the column included |
 | **Embeddings not computed** | Ensure `embedding_model_endpoint_name` is a valid serving endpoint |
 | **Stale results after table update** | For TRIGGERED pipelines, you must call `manage_vs_index(action="sync")` manually |
-| **Filter not working** | Standard endpoints use dict-format filters (`filters_json`), Storage-Optimized use SQL-like string filters (`filters`) |
+| **Filters not working** | Standard endpoints use dict-format `filters`; Storage-Optimized endpoints use SQL-like string `filters`. See [filtering.md](filtering.md) |
diff --git a/databricks-skills/databricks-vector-search/filtering.md b/databricks-skills/databricks-vector-search/filtering.md
new file mode 100644
index 00000000..ee698e78
--- /dev/null
+++ b/databricks-skills/databricks-vector-search/filtering.md
@@ -0,0 +1,160 @@
+# AI Search Filtering Reference
+
+Filter syntax for Databricks AI Search. Standard endpoints use a Python dict; Storage-Optimized endpoints use a SQL-like string. Both are passed to the `filters` parameter of `index.similarity_search()`.
+
+## Quick Reference
+
+| Operation | Standard (dict) | Storage-Optimized (SQL string) |
+|-----------|----------------|-------------------------------|
+| Exact match | `{"col": "val"}` | `col = 'val'` |
+| Negation | `{"col NOT": "val"}` | `col != 'val'` |
+| Match any in list | `{"col": ["v1", "v2"]}` | `col IN ('v1', 'v2')` |
+| Greater than | `{"col >": 100}` | `col > 100` |
+| Less than | `{"col <": 100}` | `col < 100` |
+| Range | `{"col >=": 10, "col <=": 50}` | `col >= 10 AND col <= 50` |
+| AND | Multiple dict keys | `col1 = 'v' AND col2 > 10` |
+| Cross-field OR | `{"col1 OR col2 >": ["v", 10]}` | `col1 = 'v' OR col2 > 10` |
+| LIKE | `{"col LIKE": "token"}` | `col LIKE 'val%'` |
+| Boolean | `{"col": True}` | `col IS TRUE` |
+| Timestamp | `{"col >": "2024-01-01T00:00:00Z"}` | `col > TO_TIMESTAMP('2024-01-01T00:00:00')` |
+
+---
+
+## Standard Endpoints (Dictionary Syntax)
+
+### String columns
+
+```python
+# Exact match
+filters={"make": "Toyota"}
+
+# Negation
+filters={"make NOT": "Ford"}
+
+# Match any value (OR logic within one column)
+filters={"make": ["Toyota", "Honda"]}
+
+# Token-based LIKE — matches whole tokens only (not SQL wildcards)
+filters={"description LIKE": "hybrid"}
+
+# JSON field pattern using list of dicts for multiple conditions
+filters=[{"specs LIKE": '%"drivetrain":"AWD"%'}, {"price <": 50000}]
+```
+
+### Numeric columns
+
+```python
+# Comparison operators
+filters={"price >": 40000}
+filters={"price <=": 55000}
+
+# Range (AND logic via multiple keys)
+filters={"price >=": 30000, "price <=": 55000}
+
+# Integer exact match
+filters={"year": 2024}
+```
+
+### Boolean columns
+
+```python
+filters={"in_stock": True}
+filters={"in_stock": False}
+```
+
+### Timestamp columns
+
+```python
+# After a date
+filters={"listed_at >": "2024-01-01T00:00:00Z"}
+
+# Date range
+filters={"listed_at >=": "2024-01-01T00:00:00Z", "listed_at <": "2024-04-01T00:00:00Z"}
+```
+
+### Array columns
+
+Array columns support primitive types (e.g. `ARRAY<STRING>`, `ARRAY<INT>`). `ARRAY<STRUCT>` is not supported.
+
+```python
+# Contains a value
+filters={"body_type": "sedan"}
+
+# Contains any of these values
+filters={"body_type": ["hybrid", "electric"]}
+```
+
+### AND and OR logic
+
+```python
+# AND: use multiple keys in the dict
+filters={"make": "BMW", "price >": 60000}
+
+# Cross-field OR
+filters={"make OR price <=": ["Tesla", 30000]}
+```
+
+### Standard endpoint limitations
+
+- **LIKE is token-based only** — matches whole tokens, not SQL wildcards. `{"col LIKE": "hybrid"}` matches documents containing the token "hybrid"; `%` patterns are not supported.
+- **No BETWEEN** — use two keys instead: `{"price >=": 30000, "price <=": 55000}`
+- **No SQL functions** in filter values
+- **No nested JSON value extraction**
+- **Duplicate dict keys are silently dropped** — use a list of dicts for multiple conditions on the same column
+
+---
+
+## Storage-Optimized Endpoints (SQL String Syntax)
+
+### String columns
+
+```python
+# Exact match
+filters="make = 'Toyota'"
+
+# Negation
+filters="make != 'Ford'"
+
+# IN list
+filters="make IN ('Toyota', 'Honda')"
+
+# LIKE with SQL wildcards
+filters="color LIKE 'bl%'"
+```
+
+### Numeric columns
+
+```python
+filters="price > 40000"
+filters="price >= 30000 AND price <= 55000"
+```
+
+### Boolean columns
+
+```python
+filters="in_stock IS TRUE"
+```
+
+### Timestamp columns
+
+Timestamp values must be wrapped in `TO_TIMESTAMP()`:
+
+```python
+filters="listed_at > TO_TIMESTAMP('2024-03-01T00:00:00')"
+```
+
+### AND and OR logic
+
+```python
+# AND
+filters="make = 'BMW' AND price > 60000"
+
+# OR with grouping
+filters="make = 'Tesla' OR (make = 'BMW' AND price > 60000)"
+```
+
+### Storage-Optimized endpoint limitations
+
+- **Array columns not supported** — `ARRAY_CONTAINS` is not available. Workaround: concatenate array values into a string column and use `LIKE`.
+- **Timestamps require `TO_TIMESTAMP()`** — bare string dates are not accepted.
+- **No JSON function extraction** in filter expressions.
diff --git a/databricks-skills/databricks-vector-search/index-types.md b/databricks-skills/databricks-vector-search/index-types.md
index ebfc1c7e..df603e0f 100644
--- a/databricks-skills/databricks-vector-search/index-types.md
+++ b/databricks-skills/databricks-vector-search/index-types.md
@@ -1,4 +1,4 @@
-# Vector Search Index Types
+# AI Search Index Types
 
 ## Comparison Matrix
 
@@ -24,26 +24,19 @@ Databricks automatically computes embeddings from your text column.
 ### Create Index
 
 ```python
-from databricks.sdk import WorkspaceClient
+from databricks.ai_search.client import AISearchClient
 
-w = WorkspaceClient()
+client = AISearchClient()
 
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.docs_index",
+index = client.create_delta_sync_index(
     endpoint_name="my-vs-endpoint",
+    source_table_name="catalog.schema.documents",
+    index_name="catalog.schema.docs_index",
+    pipeline_type="TRIGGERED",  # or "CONTINUOUS"
     primary_key="doc_id",
-    index_type="DELTA_SYNC",
-    delta_sync_index_spec={
-        "source_table": "catalog.schema.documents",
-        "embedding_source_columns": [
-            {
-                "name": "content",
-                "embedding_model_endpoint_name": "databricks-gte-large-en"
-            }
-        ],
-        "pipeline_type": "TRIGGERED",  # or "CONTINUOUS"
-        "columns_to_sync": ["doc_id", "content", "title", "category"]
-    }
+    embedding_source_column="content",
+    embedding_model_endpoint_name="databricks-gte-large-en",
+    columns_to_sync=["doc_id", "content", "title", "category"]
 )
 ```
 
@@ -51,7 +44,7 @@ index = w.vector_search_indexes.create_index(
 
 | Type | Behavior | Cost | Use Case |
 |------|----------|------|----------|
-| `TRIGGERED` | Manual sync via API | Lower | Batch updates |
+| `TRIGGERED` | Manual sync via `index.sync()` | Lower | Batch updates |
 | `CONTINUOUS` | Auto-sync on changes | Higher | Real-time sync |
 
 ### Source Table Example
@@ -79,21 +72,14 @@ You pre-compute embeddings and store them in the source table.
 ### Create Index
 
 ```python
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.custom_index",
+index = client.create_delta_sync_index(
     endpoint_name="my-vs-endpoint",
+    source_table_name="catalog.schema.embedded_docs",
+    index_name="catalog.schema.custom_index",
+    pipeline_type="TRIGGERED",
     primary_key="id",
-    index_type="DELTA_SYNC",
-    delta_sync_index_spec={
-        "source_table": "catalog.schema.embedded_docs",
-        "embedding_vector_columns": [
-            {
-                "name": "embedding",
-                "embedding_dimension": 768
-            }
-        ],
-        "pipeline_type": "TRIGGERED"
-    }
+    embedding_dimension=768,
+    embedding_vector_column="embedding"
 )
 ```
 
@@ -146,24 +132,22 @@ Full control over vector data via CRUD API. No Delta table sync.
 ### Create Index
 
 ```python
-import json
+from databricks.ai_search.client import AISearchClient
 
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.realtime_index",
+client = AISearchClient()
+
+index = client.create_direct_access_index(
     endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.realtime_index",
     primary_key="id",
-    index_type="DIRECT_ACCESS",
-    direct_access_index_spec={
-        "embedding_vector_columns": [
-            {"name": "embedding", "embedding_dimension": 768}
-        ],
-        "schema_json": json.dumps({
-            "id": "string",
-            "text": "string",
-            "embedding": "array<float>",
-            "category": "string",
-            "score": "float"
-        })
+    embedding_dimension=768,
+    embedding_vector_column="embedding",
+    schema={
+        "id": "string",
+        "text": "string",
+        "embedding": "array<float>",
+        "category": "string",
+        "score": "float"
     }
 )
 ```
@@ -171,57 +155,44 @@ index = w.vector_search_indexes.create_index(
 ### Upsert Data
 
 ```python
-import json
-
 # Insert or update vectors
-w.vector_search_indexes.upsert_data_vector_index(
-    index_name="catalog.schema.realtime_index",
-    inputs_json=json.dumps([
-        {
-            "id": "doc-001",
-            "text": "Machine learning basics",
-            "embedding": [0.1, 0.2, 0.3, ...],  # 768 floats
-            "category": "ml",
-            "score": 0.95
-        },
-        {
-            "id": "doc-002",
-            "text": "Deep learning overview",
-            "embedding": [0.4, 0.5, 0.6, ...],
-            "category": "dl",
-            "score": 0.88
-        }
-    ])
-)
+index.upsert([
+    {
+        "id": "doc-001",
+        "text": "Machine learning basics",
+        "embedding": [0.1, 0.2, 0.3, ...],  # 768 floats
+        "category": "ml",
+        "score": 0.95
+    },
+    {
+        "id": "doc-002",
+        "text": "Deep learning overview",
+        "embedding": [0.4, 0.5, 0.6, ...],
+        "category": "dl",
+        "score": 0.88
+    }
+])
 ```
 
 ### Delete Data
 
 ```python
-w.vector_search_indexes.delete_data_vector_index(
-    index_name="catalog.schema.realtime_index",
-    primary_keys=["doc-001", "doc-002"]
-)
+index.delete(primary_keys=["doc-001", "doc-002"])
 ```
 
 ### Attach Embedding Model (Optional)
 
-For Direct Access with text queries:
+For Direct Access indexes that need to support `query_text` (rather than `query_vector`), specify an embedding model at creation time:
 
 ```python
-# Create index with embedding model for query-time embedding
-index = w.vector_search_indexes.create_index(
-    name="catalog.schema.hybrid_index",
+index = client.create_direct_access_index(
     endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.hybrid_index",
     primary_key="id",
-    index_type="DIRECT_ACCESS",
-    direct_access_index_spec={
-        "embedding_vector_columns": [
-            {"name": "embedding", "embedding_dimension": 768}
-        ],
-        "embedding_model_endpoint_name": "databricks-gte-large-en",  # For query_text
-        "schema_json": json.dumps({...})
-    }
+    embedding_dimension=768,
+    embedding_vector_column="embedding",
+    embedding_model_endpoint_name="databricks-gte-large-en",  # Enables query_text
+    schema={...}
 )
 ```
 
diff --git a/databricks-skills/databricks-vector-search/search-modes.md b/databricks-skills/databricks-vector-search/search-modes.md
index 58092afa..2121d86b 100644
--- a/databricks-skills/databricks-vector-search/search-modes.md
+++ b/databricks-skills/databricks-vector-search/search-modes.md
@@ -1,6 +1,6 @@
-# Vector Search Modes
+# AI Search Modes
 
-Databricks Vector Search supports three search modes: **ANN** (semantic, default), **HYBRID** (semantic + keyword), and **FULL_TEXT** (keyword only, beta). ANN and HYBRID work with Delta Sync and Direct Access indexes.
+Databricks AI Search supports three search modes: **ANN** (semantic, default), **hybrid** (semantic + keyword), and **FULL_TEXT** (keyword only, beta). ANN and hybrid work with Delta Sync and Direct Access indexes.
 
 ## Semantic Search (ANN)
 
@@ -16,11 +16,15 @@ ANN (Approximate Nearest Neighbor) is the default search mode. It finds document
 ### Example
 
 ```python
+from databricks.ai_search.client import AISearchClient
+
+client = AISearchClient()
+index = client.get_index(endpoint_name="my-endpoint", index_name="catalog.schema.my_index")
+
 # ANN is the default — no query_type parameter needed
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+results = index.similarity_search(
     query_text="How do I handle errors in my pipeline?",
+    columns=["id", "content"],
     num_results=5
 )
 ```
@@ -39,11 +43,10 @@ Hybrid search combines vector similarity (ANN) with BM25 keyword scoring. It ret
 ### Example
 
 ```python
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+results = index.similarity_search(
     query_text="SPARK-12345 executor memory error",
-    query_type="HYBRID",
+    query_type="hybrid",
+    columns=["id", "content"],
     num_results=10
 )
 ```
@@ -53,43 +56,47 @@ results = w.vector_search_indexes.query_index(
 | Mode | Best for | Trade-off | Choose when |
 |------|----------|-----------|-------------|
 | **ANN** (default) | Conceptual queries, paraphrases, meaning-based search | Fastest; may miss exact keyword matches | You want documents *about* a topic regardless of exact wording |
-| **HYBRID** | Exact terms, codes, proper nouns, mixed-intent queries | ~2x resource usage vs ANN; max 200 results | Your queries contain specific identifiers or technical terms that must appear in results |
+| **hybrid** | Exact terms, codes, proper nouns, mixed-intent queries | ~2x resource usage vs ANN; max 200 results | Your queries contain specific identifiers or technical terms that must appear in results |
 | **FULL_TEXT** (beta) | Pure keyword search without vector embeddings | No semantic understanding; max 200 results | You need keyword matching only, without vector similarity |
 
-**Start with ANN.** Switch to HYBRID if you notice relevant documents being missed because they don't share vocabulary with the query.
+**Start with ANN.** Switch to hybrid if you notice relevant documents being missed because they don't share vocabulary with the query.
 
 ## Combining Search Modes with Filters
 
 Both search modes support filters. The filter syntax depends on your endpoint type:
 
-- **Standard endpoints** → `filters` as dict (or `filters_json` as JSON string via `databricks-sdk`)
-- **Storage-Optimized endpoints** → `filters` as SQL-like string (via `databricks-vectorsearch` package)
+- **Standard endpoints** → `filters` as a dict
+- **Storage-Optimized endpoints** → `filters` as a SQL-like string
+
+See [filtering.md](filtering.md) for the full operator reference.
 
 ### Standard endpoint with hybrid search
 
 ```python
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content", "category"],
+results = index.similarity_search(
     query_text="SPARK-12345 executor memory error",
-    query_type="HYBRID",
+    query_type="hybrid",
+    columns=["id", "content", "category"],
     num_results=10,
-    filters_json='{"category": "troubleshooting", "status": ["open", "in_progress"]}'
+    filters={"category": "troubleshooting", "status": ["open", "in_progress"]}
 )
 ```
 
 ### Storage-Optimized endpoint with hybrid search
 
 ```python
-from databricks.vector_search.client import VectorSearchClient
+from databricks.ai_search.client import AISearchClient
 
-vsc = VectorSearchClient()
-index = vsc.get_index(endpoint_name="my-storage-endpoint", index_name="catalog.schema.my_index")
+client = AISearchClient()
+index = client.get_index(
+    endpoint_name="my-storage-endpoint",
+    index_name="catalog.schema.my_index"
+)
 
 results = index.similarity_search(
     query_text="SPARK-12345 executor memory error",
-    columns=["id", "content", "category"],
     query_type="hybrid",
+    columns=["id", "content", "category"],
     num_results=10,
     filters="category = 'troubleshooting' AND status IN ('open', 'in_progress')"
 )
@@ -101,10 +108,9 @@ If you compute embeddings yourself, use `query_vector` instead of `query_text` f
 
 ```python
 # ANN with pre-computed embedding (default)
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+results = index.similarity_search(
     query_vector=[0.1, 0.2, 0.3, ...],  # Your embedding vector
+    columns=["id", "content"],
     num_results=10
 )
 ```
@@ -112,31 +118,53 @@ results = w.vector_search_indexes.query_index(
 For **hybrid search with self-managed embeddings** (indexes without an associated model endpoint), you must provide **both** `query_vector` and `query_text`. The vector is used for the ANN component and the text for the BM25 keyword component:
 
 ```python
-# HYBRID with self-managed embeddings — requires both vector AND text
-results = w.vector_search_indexes.query_index(
-    index_name="catalog.schema.my_index",
-    columns=["id", "content"],
+# hybrid with self-managed embeddings — requires both vector AND text
+results = index.similarity_search(
     query_vector=[0.1, 0.2, 0.3, ...],  # For ANN similarity
     query_text="executor memory error",   # For BM25 keyword matching
-    query_type="HYBRID",
+    query_type="hybrid",
+    columns=["id", "content"],
     num_results=10
 )
 ```
 
 **Notes:**
 - For **ANN** queries: provide either `query_text` or `query_vector`, not both.
-- For **HYBRID** queries on **managed embedding indexes**: provide only `query_text` (the system handles both components).
-- For **HYBRID** queries on **self-managed indexes without a model endpoint**: provide both `query_vector` and `query_text`.
+- For **hybrid** queries on **managed embedding indexes**: provide only `query_text` (the system handles both components).
+- For **hybrid** queries on **self-managed indexes without a model endpoint**: provide both `query_vector` and `query_text`.
 - When using `query_text` alone, the index must have an associated embedding model (managed embeddings or `embedding_model_endpoint_name` on a Direct Access index).
 
+## Reranker
+
+`DatabricksReranker` improves retrieval quality by re-scoring results after the initial similarity search. Databricks recommends it for RAG use cases where quality matters more than latency.
+
+- **Quality improvement**: ~10%
+- **Latency overhead**: ~1.5 seconds per query
+- **Not recommended** for high-throughput, low-latency applications
+
+```python
+from databricks.ai_search.reranker import DatabricksReranker
+
+results = index.similarity_search(
+    query_text="How to create an AI Search index",
+    columns=["id", "text", "parent_doc_summary", "date"],
+    num_results=10,
+    query_type="hybrid",
+    reranker=DatabricksReranker(columns_to_rerank=["text", "parent_doc_summary"])
+)
+```
+
+`columns_to_rerank` lists the columns used for relevance scoring; only the first 2,000 characters of each column are considered. Set `debug_level=1` on the query to see a per-component latency breakdown (`ann_time`, `reranker_time`, `response_time`).
+
 ## Parameter Reference
 
-| Parameter | Type | Package | Description |
-|-----------|------|---------|-------------|
-| `query_text` | `str` | Both | Text query — requires embedding model on the index |
-| `query_vector` | `list[float]` | Both | Pre-computed embedding vector |
-| `query_type` | `str` | Both | `"ANN"` (default) or `"HYBRID"` or `"FULL_TEXT"` (beta) |
-| `columns` | `list[str]` | Both | Column names to return in results |
-| `num_results` | `int` | Both | Number of results (default: 10 in `databricks-sdk`, 5 in `databricks-vectorsearch`) |
-| `filters_json` | `str` | `databricks-sdk` | JSON dict filter string (Standard endpoints) |
-| `filters` | `str` or `dict` | `databricks-vectorsearch` | Dict for Standard, SQL-like string for Storage-Optimized |
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `query_text` | `str` | Text query — requires an embedding model on the index |
+| `query_vector` | `list[float]` | Pre-computed embedding vector |
+| `query_type` | `str` | `"ann"` (default), `"hybrid"`, or `"FULL_TEXT"` (beta) |
+| `columns` | `list[str]` | Column names to return in results |
+| `num_results` | `int` | Number of results (default: 5) |
+| `filters` | `dict` or `str` | Dict for Standard endpoints; SQL-like string for Storage-Optimized. See [filtering.md](filtering.md) |
+| `reranker` | `DatabricksReranker` | Optional reranker for improved quality (~10% gain, ~1.5s overhead) |
+| `debug_level` | `int` | Set to `1` to return per-component latency in the response |
diff --git a/databricks-skills/databricks-vector-search/troubleshooting-and-operations.md b/databricks-skills/databricks-vector-search/troubleshooting-and-operations.md
index 7dc4b8c9..ff289b67 100644
--- a/databricks-skills/databricks-vector-search/troubleshooting-and-operations.md
+++ b/databricks-skills/databricks-vector-search/troubleshooting-and-operations.md
@@ -1,10 +1,10 @@
-# Vector Search Troubleshooting & Operations
+# AI Search Troubleshooting & Operations
 
-Operational guidance for monitoring, cost optimization, capacity planning, and migration of Databricks Vector Search resources.
+Operational guidance for monitoring, cost optimization, capacity planning, and migration of Databricks AI Search resources.
 
 ## Monitoring Endpoint Status
 
-Use `manage_vs_endpoint(action="get")` (MCP tool) or `w.vector_search_endpoints.get_endpoint()` (SDK) to check endpoint health.
+Use `manage_vs_endpoint(action="get")` (MCP tool) or `client.get_endpoint()` (SDK) to check endpoint health.
 
 ### Endpoint fields
 
@@ -20,9 +20,10 @@ Use `manage_vs_endpoint(action="get")` (MCP tool) or `w.vector_search_endpoints.
 ### Example
 
 ```python
-endpoint = w.vector_search_endpoints.get_endpoint(endpoint_name="my-endpoint")
-print(f"State: {endpoint.endpoint_status.state.value}")
-print(f"Indexes: {endpoint.num_indexes}")
+from databricks.ai_search.client import AISearchClient
+
+client = AISearchClient()
+endpoint = client.get_endpoint(name="my-endpoint")
 ```
 
 **What to do per state:**
@@ -34,7 +35,7 @@ print(f"Indexes: {endpoint.num_indexes}")
 
 ## Monitoring Index Status
 
-Use `manage_vs_index(action="get")` (MCP tool) or `w.vector_search_indexes.get_index()` (SDK) to check index health.
+Use `manage_vs_index(action="get")` (MCP tool) or `client.get_index()` (SDK) to check index health.
 
 ### Index fields
 
@@ -50,11 +51,11 @@ Use `manage_vs_index(action="get")` (MCP tool) or `w.vector_search_indexes.get_i
 ### Example
 
 ```python
-index = w.vector_search_indexes.get_index(index_name="catalog.schema.my_index")
-if index.status.ready:
-    print("Index is ONLINE")
-else:
-    print(f"Index is NOT_READY: {index.status.message}")
+index = client.get_index(
+    endpoint_name="my-endpoint",
+    index_name="catalog.schema.my_index"
+)
+print(index.describe())
 ```
 
 ## Pipeline Type Trade-offs
@@ -63,18 +64,20 @@ Delta Sync indexes use a DLT pipeline to sync data from the source Delta table.
 
 | Pipeline Type | Behavior | Cost | Best for |
 |---------------|----------|------|----------|
-| **TRIGGERED** | Manual sync via `manage_vs_index(action="sync")` | Lower — runs only when triggered | Batch updates, periodic refreshes, cost-sensitive workloads |
+| **TRIGGERED** | Manual sync via `index.sync()` | Lower — runs only when triggered | Batch updates, periodic refreshes, cost-sensitive workloads |
 | **CONTINUOUS** | Auto-syncs on source table changes | Higher — always running | Real-time freshness, applications needing up-to-date results |
 
 ### Triggering a sync
 
 ```python
-# For TRIGGERED pipelines only
-w.vector_search_indexes.sync_index(index_name="catalog.schema.my_index")
-# Check sync progress with get_index()
+index = client.get_index(
+    endpoint_name="my-vs-endpoint",
+    index_name="catalog.schema.my_index"
+)
+index.sync()
 ```
 
-**Tip:** CONTINUOUS pipelines cannot be synced manually — they sync automatically. Calling `sync_index()` on a CONTINUOUS index will raise an error.
+**Tip:** CONTINUOUS pipelines cannot be synced manually — they sync automatically. Calling `index.sync()` on a CONTINUOUS index will raise an error.
 
 ## Cost Optimization
 
@@ -95,15 +98,16 @@ w.vector_search_indexes.sync_index(index_name="catalog.schema.my_index")
 - Choose TRIGGERED pipelines for batch workloads to avoid continuous compute costs.
 
 ```python
-# Only sync the columns you actually need in query results
-delta_sync_index_spec={
-    "source_table": "catalog.schema.documents",
-    "embedding_source_columns": [
-        {"name": "content", "embedding_model_endpoint_name": "databricks-gte-large-en"}
-    ],
-    "pipeline_type": "TRIGGERED",
-    "columns_to_sync": ["id", "content", "title"]  # Exclude large unused columns
-}
+index = client.create_delta_sync_index(
+    endpoint_name="my-vs-endpoint",
+    source_table_name="catalog.schema.documents",
+    index_name="catalog.schema.my_index",
+    pipeline_type="TRIGGERED",
+    primary_key="id",
+    embedding_source_column="content",
+    embedding_model_endpoint_name="databricks-gte-large-en",
+    columns_to_sync=["id", "content", "title"]  # Exclude large unused columns
+)
 ```
 
 ## Capacity Planning
@@ -131,36 +135,71 @@ Endpoints are **immutable after creation** — you cannot change the type (Stand
 5. **Delete old indexes**, then delete the old endpoint
 
 ```python
+from databricks.ai_search.client import AISearchClient
+
+client = AISearchClient()
+
 # Step 1: Create new endpoint
-w.vector_search_endpoints.create_endpoint(
+client.create_endpoint(
     name="my-endpoint-storage-optimized",
     endpoint_type="STORAGE_OPTIMIZED"
 )
 
 # Step 2: Recreate index on new endpoint (same source table)
-w.vector_search_indexes.create_index(
-    name="catalog.schema.my_index_v2",
+client.create_delta_sync_index(
     endpoint_name="my-endpoint-storage-optimized",
+    source_table_name="catalog.schema.documents",
+    index_name="catalog.schema.my_index_v2",
+    pipeline_type="TRIGGERED",
     primary_key="id",
-    index_type="DELTA_SYNC",
-    delta_sync_index_spec={
-        "source_table": "catalog.schema.documents",
-        "embedding_source_columns": [
-            {"name": "content", "embedding_model_endpoint_name": "databricks-gte-large-en"}
-        ],
-        "pipeline_type": "TRIGGERED"
-    }
+    embedding_source_column="content",
+    embedding_model_endpoint_name="databricks-gte-large-en"
 )
 
 # Step 3: Trigger sync and wait for ONLINE state
-w.vector_search_indexes.sync_index(index_name="catalog.schema.my_index_v2")
+index_v2 = client.get_index(
+    endpoint_name="my-endpoint-storage-optimized",
+    index_name="catalog.schema.my_index_v2"
+)
+index_v2.sync()
 
 # Step 4: Update your application to use "catalog.schema.my_index_v2"
 # Step 5: Clean up old resources
-w.vector_search_indexes.delete_index(index_name="catalog.schema.my_index")
-w.vector_search_endpoints.delete_endpoint(endpoint_name="my-endpoint")
+client.delete_index(index_name="catalog.schema.my_index")
+client.delete_endpoint(name="my-endpoint")
 ```
 
+## Performance & Capacity
+
+### Production performance targets
+
+| Metric | Target |
+|--------|--------|
+| P95 latency | < 500ms |
+| P99 latency | < 1 second |
+| Success rate | > 99.5% |
+
+### Endpoint sizing
+
+Operate at ~65% of maximum capacity to preserve headroom for traffic spikes. For example, to sustain 310 RPS, size your endpoint for ~480 RPS maximum capacity (310 / 0.65 ≈ 477, rounded up).
+
+### Authentication performance
+
+Use OAuth service principals instead of Personal Access Tokens: responses can be up to 100ms faster, and request rate limits are higher.
+
+### Debugging latency with component timing
+
+Set `debug_level=1` in `similarity_search()` to return per-component latency:
+
+- `ann_time` — approximate nearest neighbor search duration
+- `embedding_gen_time` — query embedding generation on the model endpoint
+- `reranker_time` — reranking duration (if using `DatabricksReranker`)
+- `response_time` — total end-to-end latency
+
+If `embedding_gen_time` dominates, consider disabling scale-to-zero on your embedding endpoint or increasing its provisioned concurrency.
+
+For full load testing guidance, see the [AI Search endpoint load test documentation](https://docs.databricks.com/aws/en/ai-search/endpoint-load-test).
+
 ## Expanded Troubleshooting
 
 | Issue | Likely Cause | Solution |
@@ -169,9 +208,10 @@ w.vector_search_endpoints.delete_endpoint(endpoint_name="my-endpoint")
 | **Embedding dimension mismatch** | Query vector dimensions ≠ index dimensions | Ensure your embedding model output matches the `embedding_dimension` in the index spec. |
 | **Permission errors on create** | Missing Unity Catalog privileges | User needs `CREATE TABLE` on the schema and `USE CATALOG`/`USE SCHEMA` privileges. |
 | **Index returns NOT_FOUND** | Wrong name format or index deleted | Index names must be fully qualified: `catalog.schema.index_name`. |
-| **Sync not running (TRIGGERED)** | Sync not triggered after source update | Call `manage_vs_index(action="sync")` or `w.vector_search_indexes.sync_index()` after updating source data. |
+| **Sync not running (TRIGGERED)** | Sync not triggered after source update | Call `manage_vs_index(action="sync")` or `index.sync()` after updating source data. |
 | **Endpoint NOT_FOUND** | Endpoint name typo or deleted | List all endpoints with `manage_vs_endpoint(action="list")` to verify available endpoints. |
 | **Query returns empty results** | Index not yet synced, or filters too restrictive | Check index state is ONLINE. Verify `columns_to_sync` includes queried columns. Test without filters first. |
-| **filters_json has no effect** | Using wrong filter syntax for endpoint type | Standard endpoints use dict-format filters (`filters_json` in SDK, `filters` as dict in `databricks-vectorsearch`). Storage-Optimized endpoints use SQL-like string filters (`filters` as str in `databricks-vectorsearch`). |
+| **Filters not working** | Wrong filter syntax for endpoint type | Standard endpoints use a dict: `filters={"col": "val"}`. Storage-Optimized endpoints use a SQL-like string: `filters="col = 'val'"`. See [filtering.md](filtering.md). |
 | **Quota or capacity errors** | Too many indexes or vectors | Check `num_indexes` on endpoint. Consider Storage-Optimized for higher capacity. |
 | **Upsert fails on Delta Sync** | Cannot upsert to Delta Sync indexes | Upsert/delete operations only work on Direct Access indexes. Delta Sync indexes update via their source table. |
+| **High latency or 429 errors** | Endpoint over capacity | Increase endpoint capacity. Add client-side rate limiting, and retry throttled requests with exponential backoff. |