
Commit e37b320

Add battle-tested improvements to app-python, app-apx, and model-serving skills
Real-world fixes from deploying Databricks Apps in production.

databricks-app-python:
- Add 5 critical MUST/MUST NOT rules: /metrics endpoint, CDN blocking, don't delete-recreate apps, don't upload dev files, React hooks ordering
- Fix valueFrom YAML syntax (valueFrom.resource instead of flat valueFrom)
- Add CLI API PATCH workflow for attaching resources programmatically
- Add valueFrom vs value fallback troubleshooting
- Add port binding guidance and DATABRICKS_APP_PORT details per framework
- Add "never delete-recreate" warning and clean stale files workflow
- Add Lakebase OAuth security label error fix and PGPASSWORD workaround
- Add Lakebase Sync (reverse ETL) quick-start instructions
- Add 4 new troubleshooting entries: App Not Available, blank frontend, React Error #310, gateway health check

databricks-app-apx:
- Add React Error #310 (hooks ordering) troubleshooting entry

databricks-model-serving:
- Add system.ai.similarity_search to built-in UC functions table
1 parent c27d3f2 commit e37b320

7 files changed

Lines changed: 173 additions & 23 deletions


databricks-skills/databricks-app-apx/SKILL.md

Lines changed: 1 addition & 0 deletions
@@ -236,6 +236,7 @@ Create two markdown files:
 **TypeScript errors**: Wait for OpenAPI regen, verify hook names match operation_ids
 **OpenAPI not updating**: Check watcher status with `apx dev status`, restart if needed
 **Components not added**: Run shadcn from project root with `--yes` flag
+**React page crashes to blank after data loads (Error #310)**: `useMemo`/`useCallback` hooks placed after early returns (`if (loading) return <Spinner />`) violate React's Rules of Hooks. Hooks must be called in the same order on every render. Move ALL hooks before any conditional returns and guard their internals instead: `useMemo(() => { if (!data.length) return []; ... }, [data])`
 
 ## Reference Materials

databricks-skills/databricks-app-python/2-app-resources.md

Lines changed: 39 additions & 4 deletions
@@ -31,21 +31,39 @@ Use `valueFrom` to reference resources — never hardcode IDs:
 ```yaml
 env:
   - name: DATABRICKS_WAREHOUSE_ID
-    valueFrom: sql-warehouse
+    valueFrom:
+      resource: sql-warehouse
 
   - name: SERVING_ENDPOINT_NAME
-    valueFrom: serving-endpoint
+    valueFrom:
+      resource: serving-endpoint
 
   - name: DB_CONNECTION_STRING
-    valueFrom: database
+    valueFrom:
+      resource: database
 ```
 
-Add resources via the Databricks Apps UI when creating or editing an app:
+Add resources via the Databricks Apps UI or CLI:
+
+**Option 1: UI**
 1. Navigate to Configure step
 2. Click **+ Add resource**
 3. Select resource type and set permissions
 4. Assign a key (referenced in `valueFrom`)
 
+**Option 2: CLI (API PATCH)** — required when deploying programmatically. Without resources attached, the gateway shows "App Not Available" even if the process is running:
+
+```bash
+databricks api patch /api/2.0/apps/<app-name> --json '{
+  "resources": [
+    {"name": "sql-warehouse", "sql_warehouse": {"id": "<warehouse-id>", "permission": "CAN_USE"}},
+    {"name": "serving-endpoint", "serving_endpoint": {"name": "<endpoint-name>", "permission": "CAN_QUERY"}}
+  ]
+}' --profile <profile>
+```
+
+**CRITICAL**: Resources must be attached BEFORE deploying. Without them, the gateway will refuse to serve the app even though the process is running and healthy.
+
 ---
 
 ## Communication Strategies
@@ -112,9 +130,26 @@ For Lakebase patterns, see [5-lakebase.md](5-lakebase.md).
 
 ---
 
+## Troubleshooting: `valueFrom` vs `value`
+
+If `valueFrom: resource:` fails with "Error reading app.yaml", use hardcoded `value:` as a fallback:
+
+```yaml
+env:
+  - name: DATABRICKS_WAREHOUSE_ID
+    value: "<actual-warehouse-id>"
+  - name: SERVING_ENDPOINT_NAME
+    value: "<actual-endpoint-name>"
+```
+
+This can happen when resources aren't yet attached to the app or the resource key doesn't match. Prefer `valueFrom` when resources are properly configured, but use `value` to unblock deployment.
+
+---
+
 ## Best Practices
 
 - Always use `valueFrom` — keeps apps portable between environments
+- If `valueFrom` fails with "Error reading app.yaml", fall back to `value:` with hardcoded IDs (see above)
 - Grant service principal minimum required permissions (e.g., `CAN USE` not `CAN MANAGE` for SQL warehouse)
 - Use Lakebase for transactional workloads; SQL warehouse for analytical workloads
 - For external services, use UC connections or secrets (never hardcode API keys)
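Whichever of `valueFrom` or hardcoded `value` is used, the setting surfaces in app code as a plain environment variable. As a sanity check on that wiring, the app can fail fast when an expected variable never arrived. A minimal Python sketch; the helper name and error wording are illustrative, not part of the skill docs:

```python
import os


def require_env(name: str) -> str:
    """Return a resource-injected env var, failing fast with a pointer to the
    likely misconfiguration instead of letting the app start half-configured."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set: check that the resource is attached to the app "
            f"and that the key under valueFrom matches the resource name"
        )
    return value


# At startup (variable names from the app.yaml examples above):
# warehouse_id = require_env("DATABRICKS_WAREHOUSE_ID")
# endpoint_name = require_env("SERVING_ENDPOINT_NAME")
```

A startup failure with a clear message is much easier to diagnose from `databricks apps logs` than a half-configured app.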

databricks-skills/databricks-app-python/3-frameworks.md

Lines changed: 5 additions & 3 deletions
@@ -64,7 +64,7 @@ def get_connection():
 | Detail | Value |
 |--------|-------|
 | Pre-installed version | 1.38.0 |
-| app.yaml command | `["streamlit", "run", "app.py"]` |
+| app.yaml command | `["streamlit", "run", "app.py"]` — port, address, and headless are auto-configured by the runtime via `DATABRICKS_APP_PORT` |
 | Auth header | `st.context.headers.get('x-forwarded-access-token')` |
 
 **Databricks tips**:
@@ -152,7 +152,7 @@ def get_data():
 | Detail | Value |
 |--------|-------|
 | Pre-installed version | 3.0.3 |
-| app.yaml command | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` |
+| app.yaml command | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` — uses `DATABRICKS_APP_PORT` default (8000) |
 | Auth header | `request.headers.get('x-forwarded-access-token')` |
 
 **Databricks tips**:
@@ -192,7 +192,7 @@ async def get_data(request: Request):
 | Detail | Value |
 |--------|-------|
 | Pre-installed version | 0.115.0 |
-| app.yaml command | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` |
+| app.yaml command | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` — uses `DATABRICKS_APP_PORT` default (8000) |
 | Auth header | `request.headers.get('x-forwarded-access-token')` via `Request` |
 
 **Databricks tips**:
@@ -244,5 +244,7 @@ class State(rx.State):
 - Add only additional packages your app needs to `requirements.txt`
 - SDK `Config()` auto-detects credentials from injected environment variables
 - Apps must bind to `DATABRICKS_APP_PORT` env var (defaults to 8000). Streamlit is auto-configured by the runtime; for other frameworks, read the env var in code or hardcode 8000 in `app.yaml` command. **Never use 8080**
+- **No external CDN dependencies** in frontend HTML — the app runtime blocks outbound CDN requests (React, Recharts, Google Fonts, Babel). Build self-contained HTML with inline JS/CSS only
+- **Never delete apps to fix issues** — just redeploy. Deleting disrupts OAuth integration
 - For framework-specific deployment commands, see [4-deployment.md](4-deployment.md)
 - For authorization integration, see [1-authorization.md](1-authorization.md)
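The port-binding rule in this file can be condensed into a small helper for the frameworks that must read the variable in code (Dash, Gradio). A sketch; the launch calls are shown as comments because they depend on your app object, and raising on 8080 is an illustrative guard, not platform behavior:

```python
import os


def app_port(default: int = 8000) -> int:
    """Port the app must bind to. Databricks injects DATABRICKS_APP_PORT,
    and 8000 is the documented default. 8080 must never be used."""
    port = int(os.environ.get("DATABRICKS_APP_PORT", default))
    if port == 8080:
        raise ValueError("never bind to 8080 on Databricks Apps")
    return port


# Dash:   app.run(host="0.0.0.0", port=app_port())
# Gradio: demo.launch(server_name="0.0.0.0", server_port=app_port())
```

Streamlit apps skip this entirely, since the runtime configures the port for them.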

databricks-skills/databricks-app-python/4-deployment.md

Lines changed: 48 additions & 12 deletions
@@ -19,7 +19,8 @@ command:
 
 env:
   - name: DATABRICKS_WAREHOUSE_ID
-    valueFrom: sql-warehouse
+    valueFrom:
+      resource: sql-warehouse
   - name: USE_MOCK_BACKEND
     value: "false"
 ```
@@ -28,13 +29,15 @@ env:
 
 | Framework | Command |
 |-----------|---------|
-| Dash | `["python", "app.py"]` |
-| Streamlit | `["streamlit", "run", "app.py"]` |
-| Gradio | `["python", "app.py"]` |
+| Dash | `["python", "app.py"]` — bind to `DATABRICKS_APP_PORT` in code |
+| Streamlit | `["streamlit", "run", "app.py"]` — port/address/headless auto-configured by runtime |
+| Gradio | `["python", "app.py"]` — bind to `DATABRICKS_APP_PORT` in code |
 | Flask | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` |
 | FastAPI | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` |
 | Reflex | `["reflex", "run", "--env", "prod"]` |
 
+**Port binding**: Apps must listen on `DATABRICKS_APP_PORT` (defaults to 8000). Streamlit is auto-configured. For Flask/FastAPI, 8000 in the command matches the default. For Dash/Gradio, read the env var in code: `int(os.environ.get("DATABRICKS_APP_PORT", 8000))`. **Never use 8080.**
+
 ### Step 2: Create and Deploy
 
 ```bash
@@ -57,9 +60,19 @@ databricks apps get <app-name>
 
 ### Redeployment
 
+**NEVER delete and recreate an app to fix deployment issues** — just redeploy. Deleting disrupts OAuth integration and doesn't fix underlying problems.
+
+**Clean stale files** before redeploying — leftover files (e.g., old `main.py`) in the workspace source path can cause conflicts:
+
 ```bash
-databricks workspace delete /Workspace/Users/<user>/apps/<app-name> --recursive
-databricks workspace import-dir . /Workspace/Users/<user>/apps/<app-name>
+# Check for stale files
+databricks workspace list /Workspace/Users/<user>/apps/<app-name>
+
+# Remove stale files if needed
+databricks workspace delete /Workspace/Users/<user>/apps/<app-name>/<stale-file>
+
+# Sync and redeploy
+databricks sync . /Workspace/Users/<user>/apps/<app-name> --full
 databricks apps deploy <app-name> \
   --source-code-path /Workspace/Users/<user>/apps/<app-name>
 ```
@@ -115,6 +128,35 @@ For programmatic app lifecycle management, see [6-mcp-approac
 
 ## Post-Deployment
 
+### Attach Resources (CRITICAL)
+
+Without resources attached, the gateway shows "App Not Available" even if the process is running. Attach resources via API PATCH **before** deploying:
+
+```bash
+databricks api patch /api/2.0/apps/<app-name> --json '{
+  "resources": [
+    {"name": "sql-warehouse", "sql_warehouse": {"id": "<warehouse-id>", "permission": "CAN_USE"}},
+    {"name": "serving-endpoint", "serving_endpoint": {"name": "<endpoint-name>", "permission": "CAN_QUERY"}}
+  ]
+}' --profile <profile>
+```
+
+**Find the correct warehouse ID** for the target workspace:
+
+```bash
+databricks warehouses list --profile <profile>
+```
+
+### Configure Permissions
+
+```bash
+databricks api put /api/2.0/permissions/apps/<app-name> --json '{
+  "access_control_list": [
+    {"user_name": "<your-email>", "permission_level": "CAN_MANAGE"},
+    {"group_name": "users", "permission_level": "CAN_USE"}
+  ]
+}' --profile <profile>
+```
+
 ### Check Logs
 
 ```bash
@@ -134,9 +176,3 @@ databricks apps logs <app-name>
 2. Check all pages load correctly
 3. Verify data connectivity (look for backend initialization messages in logs)
 4. Test user authorization flow if enabled
-
-### Configure Permissions
-
-- Set `CAN USE` for approved users/groups
-- Set `CAN MANAGE` only for trusted developers
-- Verify service principal has required resource permissions
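The resource-attach JSON above is easy to get subtly wrong by hand, so building the payload programmatically helps. A sketch mirroring the bash example (resource keys and permission strings copied from it; this is not an official SDK helper):

```python
import json


def attach_resources_payload(warehouse_id: str, endpoint_name: str) -> str:
    """Build the JSON body for `databricks api patch /api/2.0/apps/<app-name>`,
    matching the resource shapes in the bash example above."""
    return json.dumps({
        "resources": [
            {"name": "sql-warehouse",
             "sql_warehouse": {"id": warehouse_id, "permission": "CAN_USE"}},
            {"name": "serving-endpoint",
             "serving_endpoint": {"name": endpoint_name, "permission": "CAN_QUERY"}},
        ]
    })
```

The returned string can be passed directly to `--json` when shelling out to the CLI.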

databricks-skills/databricks-app-python/5-lakebase.md

Lines changed: 69 additions & 0 deletions
@@ -134,8 +134,77 @@ asyncpg
 
 **This is the most common cause of Lakebase app failures.**
 
+## Troubleshooting: OAuth / Security Label Errors
+
+When the app's service principal connects to Lakebase via OAuth (i.e. `PGPASSWORD` is not auto-injected), you may see:
+
+```
+FATAL: An oauth token was supplied but no role security label was configured in postgres for role "<SP_CLIENT_ID>"
+```
+
+**Root cause**: The SP's PostgreSQL role exists but lacks the `databricks_auth` security label that maps it to a Databricks identity.
+
+**Fix**: Connect as the instance owner and set the security label:
+
+```sql
+-- 1. Find the SP's numeric ID (from Databricks workspace)
+--    databricks service-principals list -o json | grep <SP_CLIENT_ID>
+--    Look for the "id" field (numeric)
+
+-- 2. Set the security label in Lakebase
+SECURITY LABEL FOR databricks_auth ON ROLE "<SP_CLIENT_ID>"
+  IS 'id=<SP_NUMERIC_ID>,type=SERVICE_PRINCIPAL';
+
+-- 3. Grant schema/table access
+GRANT USAGE ON SCHEMA my_schema TO "<SP_CLIENT_ID>";
+GRANT SELECT ON ALL TABLES IN SCHEMA my_schema TO "<SP_CLIENT_ID>";
+```
+
+You can verify with: `SELECT * FROM pg_seclabels WHERE objtype = 'role';`
+
+**When PGPASSWORD is empty**: If Databricks auto-injects `PGHOST` and `PGUSER` but NOT `PGPASSWORD`, use the SDK to generate an OAuth token:
+
+```python
+from databricks.sdk import WorkspaceClient
+import uuid
+
+w = WorkspaceClient()
+cred = w.database.generate_database_credential(
+    request_id=str(uuid.uuid4()),
+    instance_names=["my-lakebase-instance"],
+)
+# Use cred.token as the password
+```
+
+---
+
+## Lakebase Sync (Reverse ETL from Delta)
+
+To sync Unity Catalog Delta tables to Lakebase for low-latency serving:
+
+1. Add primary keys to source Delta tables (required):
+
+```sql
+ALTER TABLE catalog.schema.my_table ALTER COLUMN id SET NOT NULL;
+ALTER TABLE catalog.schema.my_table ADD CONSTRAINT pk PRIMARY KEY (id);
+```
+
+2. Create synced tables via CLI:
+
+```bash
+databricks database create-synced-database-table --json '{
+  "name": "catalog.schema.lb_my_table",
+  "source_table_full_name": "catalog.schema.my_table",
+  "scheduling_policy": {"snapshot": {}},
+  "primary_key_columns": ["id"]
+}'
+```
+
+The `name` field is the **destination** UC table pointer (use a prefix like `lb_` to avoid conflicts with the source table).
+
+---
+
 ## Notes
 
 - Lakebase is in **Public Preview**
 - Each app gets its own PostgreSQL role with `Can connect and create` permission
 - Lakebase is ideal alongside SQL warehouse: use Lakebase for app state, SQL warehouse for analytics
+- When using Lakebase Sync, synced tables appear in the Lakebase schema with the prefix you chose
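Once `generate_database_credential` returns a token (as in the Python snippet above), the remaining step is wiring it into a PostgreSQL connection string. A hedged sketch that only assembles the libpq DSN: the `PGHOST`/`PGUSER` names follow the auto-injection described above, while the default database name and port are assumptions to verify against your instance:

```python
import os


def lakebase_dsn(token: str, dbname: str = "databricks_postgres") -> str:
    """Build a libpq-style DSN using the auto-injected PGHOST/PGUSER and an
    OAuth token as the password. Lakebase connections require SSL."""
    host = os.environ["PGHOST"]  # auto-injected by Databricks Apps
    user = os.environ["PGUSER"]  # the app's service principal client ID
    return (f"host={host} port=5432 dbname={dbname} "
            f"user={user} password={token} sslmode=require")


# e.g. with psycopg2: conn = psycopg2.connect(lakebase_dsn(cred.token))
```

Remember that the token is short-lived, so regenerate it rather than caching it for the app's lifetime.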

databricks-skills/databricks-app-python/SKILL.md

Lines changed: 10 additions & 4 deletions
@@ -17,6 +17,10 @@ Build Python-based Databricks applications. For full examples and recipes, see t
 - **MUST** use `dash-bootstrap-components` for Dash app layout and styling
 - **MUST** use `@st.cache_resource` for Streamlit database connections
 - **MUST** deploy Flask with Gunicorn, FastAPI with uvicorn (not dev servers)
+- **MUST NOT** use external CDN links in frontend HTML (React, Recharts, Google Fonts, Babel, etc.) — the app runtime blocks outbound CDN requests. Use self-contained inline JS/CSS only
+- **MUST NOT** delete and recreate apps to fix deployment issues — just redeploy. Deleting disrupts OAuth integration
+- **MUST NOT** upload `node_modules/`, `frontend/src/`, `__pycache__/`, or other dev-only files to the workspace when deploying. For React/Vite apps with a FastAPI backend, only upload: `app.py`, `backend.py`, `requirements.txt`, `app.yaml`, and the `static/` build output folder. Use targeted `databricks workspace import` for individual files and `databricks workspace import-dir static <ws-path>/static` for the build — never `import-dir .` from the app root
+- **MUST** (React) place ALL hooks (`useState`, `useEffect`, `useMemo`, `useCallback`, `useRef`) BEFORE any early return statements in React components. React requires hooks to be called in the exact same order on every render. Placing `useMemo`/`useCallback` after `if (loading) return <Spinner />` causes "Rendered fewer hooks than expected" (React Error #310) — the component calls fewer hooks on the loading render than on the data-loaded render, crashing the entire page to blank. Move all hooks to the top of the function body and guard their internals with `if (!data.length) return []` instead
 
 ## Required Steps
 
@@ -35,9 +39,9 @@ Copy this checklist and verify each item:
 
 | Framework | Best For | app.yaml Command |
 |-----------|----------|------------------|
-| **Dash** | Production dashboards, BI tools, complex interactivity | `["python", "app.py"]` |
-| **Streamlit** | Rapid prototyping, data science apps, internal tools | `["streamlit", "run", "app.py"]` |
-| **Gradio** | ML demos, model interfaces, chat UIs | `["python", "app.py"]` |
+| **Dash** | Production dashboards, BI tools, complex interactivity | `["python", "app.py"]` — bind to `DATABRICKS_APP_PORT` in code |
+| **Streamlit** | Rapid prototyping, data science apps, internal tools | `["streamlit", "run", "app.py"]` — port/address/headless auto-configured by runtime |
+| **Gradio** | ML demos, model interfaces, chat UIs | `["python", "app.py"]` — bind to `DATABRICKS_APP_PORT` in code |
 | **Flask** | Custom REST APIs, lightweight apps, webhooks | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` |
 | **FastAPI** | Async APIs, auto-generated OpenAPI docs | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` |
 | **Reflex** | Full-stack Python apps without JavaScript | `["reflex", "run", "--env", "prod"]` |
 
@@ -174,7 +178,9 @@ class EntityIn(BaseModel):
 | **Streamlit: set_page_config error** | `st.set_page_config()` must be the first Streamlit command |
 | **Dash: unstyled layout** | Add `dash-bootstrap-components`; use `dbc.themes.BOOTSTRAP` |
 | **Slow queries** | Use Lakebase for transactional/low-latency; SQL warehouse for analytical queries |
-
+| **"App Not Available" after deploy** | Ensure resources are attached via API PATCH before deploying; verify app binds to `DATABRICKS_APP_PORT` |
+| **Frontend loads blank/black** | External CDN requests (React, Recharts, Google Fonts, Babel) are blocked by the app runtime. Use self-contained inline JS/CSS only — no external `<script>` or `<link>` tags |
+| **React page crashes to blank after data loads** | `useMemo`/`useCallback` hooks placed after early returns (`if (loading) return ...`) violate React's Rules of Hooks. Move ALL hooks before any conditional returns. Guard hook internals instead: `useMemo(() => { if (!data.length) return []; ... }, [data])` |
 ---
 
 ## Platform Constraints
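The MUST NOT upload rule in this file's diff amounts to an allowlist, and a small filter makes it mechanical. A sketch with the file names taken directly from that rule; the helper itself is illustrative:

```python
# Root files the rule permits for a React/Vite + FastAPI app.
ALLOWED_ROOT_FILES = {"app.py", "backend.py", "requirements.txt", "app.yaml"}


def files_to_upload(local_paths):
    """Keep only what the deployment rule permits: the four root files plus
    the static/ build output. Everything else (node_modules/, frontend/src/,
    __pycache__/, ...) stays local."""
    return [p for p in local_paths
            if p in ALLOWED_ROOT_FILES or p.startswith("static/")]
```

Running the local file listing through this filter before any `databricks workspace import` call keeps dev-only files out of the workspace.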

databricks-skills/databricks-model-serving/4-tools-integration.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ uc_toolkit = UCFunctionToolkit(
 
 | Function | Purpose |
 |----------|---------|
 | `system.ai.python_exec` | Execute Python code |
+| `system.ai.similarity_search` | Vector similarity search |
 
 ### Creating a UC Function
