Add Feature Engineering reference files to databricks-ml-training-serving #570

Closed
Hehehe421 wants to merge 1 commit into databricks-solutions:experimental from Hehehe421:feature/feature-store-skill

Conversation

@Hehehe421

Summary

Extends the databricks-ml-training-serving skill with two new reference files covering Feature Engineering in Unity Catalog, and updates the SKILL.md description so the skill activates on Feature Engineering prompts.

  • 3-feature-store.md (287 lines): Standard API (GA). Covers FeatureEngineeringClient, create_table/write_table, FeatureLookup with point-in-time lookups, fe.log_model with lineage, score_batch, and online store publishing via Lakebase.
  • 4-declarative-features.md (296 lines): Declarative API (Public Preview, >=0.15.0). Covers DeltaTableSource, create_feature with SlidingWindow/TumblingWindow/RollingWindow, materialize_features, and Feature Serving Endpoints. Streaming Declarative Features (Private Preview) are referenced as a callout only and are not covered in code.
  • SKILL.md: description updated with 14 feature store trigger terms so the skill activates on Feature Engineering prompts (feature store, FeatureLookup, score_batch, publish_table, create_feature, etc.).
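
For reviewers unfamiliar with the point-in-time behavior mentioned above, here is a minimal pure-Python sketch of the idea: for each lookup, return the latest feature value whose timestamp is at or before the requested time, so no future data leaks into training. This is an illustration only, not the FeatureLookup implementation; the function name, tuple layout, and sample data are hypothetical.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_rows, key, as_of):
    """Return the most recent value for `key` with timestamp <= as_of, else None.

    feature_rows is an iterable of (key, timestamp, value) tuples (hypothetical layout).
    """
    # Keep only this key's history, ordered by timestamp.
    history = sorted((ts, value) for k, ts, value in feature_rows if k == key)
    timestamps = [ts for ts, _ in history]
    # Index of the first timestamp strictly greater than as_of.
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx else None


# Hypothetical feature history: (primary key, event time, feature value).
features = [
    ("user_1", 1, 10.0),
    ("user_1", 5, 12.5),
    ("user_1", 9, 20.0),
    ("user_2", 3, 7.0),
]

value_at_6 = point_in_time_lookup(features, "user_1", 6)
```

In the Feature Engineering client the equivalent join is requested declaratively, by giving FeatureLookup a timestamp key, rather than written by hand.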

Why

The existing skill covered classical ML and MLflow training but did not address Feature Engineering — a core ML pattern for Databricks customers needing training-serving consistency, feature reuse, point-in-time correctness, and real-time lookups.

The existing SKILL.md also referenced `mlflow.pyfunc.spark_udf` as the batch scoring path. For feature-store-backed models this is the wrong API: `fe.score_batch` is required to auto-join feature tables via lineage. The new files document the correct path, and the trigger terms in the SKILL.md description ensure the agent uses the right one.
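
A rough sketch of the difference between the two scoring paths, assuming a model that was logged with `fe.log_model`. The wrapper function names and arguments are hypothetical; imports are deferred into the functions so the snippet loads outside a Databricks runtime, and nothing here is copied from the new reference files.

```python
def score_with_feature_lineage(model_uri, batch_df):
    """Score a feature-store-backed model.

    batch_df only needs the lookup keys (and timestamp keys, if any);
    the features recorded at log time are joined in automatically.
    """
    # Requires the databricks-feature-engineering package.
    from databricks.feature_engineering import FeatureEngineeringClient

    fe = FeatureEngineeringClient()
    return fe.score_batch(model_uri=model_uri, df=batch_df)


def score_with_spark_udf(spark, model_uri, full_df, feature_cols):
    """Score via a plain pyfunc UDF.

    full_df must already contain every feature column, because no
    feature-table join happens on this path.
    """
    import mlflow

    predict = mlflow.pyfunc.spark_udf(spark, model_uri=model_uri)
    return full_df.withColumn("prediction", predict(*feature_cols))
```

The practical consequence is that the caller of the first function passes a DataFrame of keys, while the caller of the second has to reproduce the feature joins by hand.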

Source material

Evaluation

Evaluated with SkillForge L1 (unit) + L3 (static document quality):

  • Composite score: 0.96
  • L1: 1.00 (50/50 link and syntax checks pass)
  • L3: 0.92 (10-dimension document quality rubric)
  • A/B comparison vs baseline: B (with feature store files) wins at 0.95 confidence, no regression flags

Test plan

  • Live tested against summer_catalog.dbdemos_fs_travel in a vm-summer workspace (T1-T4 manual prompts via Databricks Assistant)
  • All code blocks validated as syntactically correct by SkillForge L1
  • All cross-skill references resolve correctly (49/49 link checks pass)
  • SKILL.md description change verified to be the only modification to the existing file

…ving

Extends the skill with two new reference files covering Feature Engineering
in Unity Catalog:

- 3-feature-store.md: Standard API (GA). Covers FeatureEngineeringClient,
  create_table/write_table, FeatureLookup with point-in-time, fe.log_model
  with lineage, score_batch, and online store publishing.

- 4-declarative-features.md: Declarative API (Public Preview, >=0.15.0).
  Covers DeltaTableSource, create_feature with SlidingWindow/TumblingWindow/
  RollingWindow, materialize_features, and Feature Serving Endpoints.

Also updates SKILL.md description with feature store trigger terms so the
skill activates on Feature Engineering prompts.

Evaluated with SkillForge L1+L3: composite 0.96, no regressions.
@Hehehe421

Author

Migrating contribution to databricks-eng/universe per the latest agent-skills pipeline guidance. Branch: sixuan-he_data/feature/feature-store-skill on databricks-eng/universe-dev. PR will open against universe:master once Tian's declarative-features.md lands.

@Hehehe421 Hehehe421 closed this Jun 30, 2026