feat: OSI (Open Semantic Interchange) v0.1.1 compatibility by hachej · Pull Request #229 · boringdata/boring-semantic-layer

hachej · 2026-04-02T07:23:09Z

Summary

Adds bidirectional conversion between BSL's YAML format and the OSI (Open Semantic Interchange) v0.1.1 spec, addressing #226.

New module `osi.py` with:

- `to_osi()` / `to_osi_yaml()`: export BSL models to OSI-compliant YAML
- `from_osi()` / `from_osi_yaml()`: import OSI YAML into BSL `SemanticModel` instances

Key design decisions:

- BSL Ibis Deferred expressions (`_.col.sum()`) are translated to SQL strings (`SUM(col)`) for OSI
- BSL-specific metadata (`is_entity`, `is_event_timestamp`, `smallest_time_grain`, `derived_dimensions`) is preserved via OSI `custom_extensions` for round-trip fidelity
- `ai_context` field added to the `Dimension` and `Measure` classes; it accepts either a string or a structured object (instructions/synonyms/examples), as per the OSI spec
- Entity dimensions (`is_entity=True`) automatically map to OSI `primary_key`
- Time dimensions map to OSI `dimension.is_time`
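To illustrate the first design decision, here is a minimal sketch of the Deferred-to-SQL translation idea. It works on the textual form of an expression rather than on real Ibis objects, and the names `deferred_text_to_sql` and `_AGGREGATES` are hypothetical; the actual converter in `osi.py` may be structured quite differently.

```python
import re
from typing import Optional

# Assumed mapping from Ibis-style method names to SQL aggregate functions.
# Only a handful of common aggregates are covered in this sketch.
_AGGREGATES = {"sum": "SUM", "mean": "AVG", "count": "COUNT", "min": "MIN", "max": "MAX"}

# Matches exactly "_.<column>.<method>()" and nothing more complex.
_SIMPLE_AGG = re.compile(r"^_\.(?P<col>[A-Za-z_]\w*)\.(?P<method>[a-z_]+)\(\)$")


def deferred_text_to_sql(expr_text: str) -> Optional[str]:
    """Translate a simple Deferred-style string into a SQL aggregate.

    Returns None when the expression is not one of the recognized
    simple patterns, so callers can decide how to handle it.
    """
    text = expr_text.strip()
    if text == "_.count()":
        # A row count has no column argument.
        return "COUNT(*)"
    match = _SIMPLE_AGG.match(text)
    if match is None:
        return None
    func = _AGGREGATES.get(match.group("method"))
    if func is None:
        return None
    return f"{func}({match.group('col')})"


print(deferred_text_to_sql("_.col.sum()"))
print(deferred_text_to_sql("_.a.sum() / _.b.sum()"))
```

Returning `None` for anything outside the simple patterns is a design choice of this sketch: it avoids emitting partially translated text as if it were valid SQL.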

What's included

| File | Description |
| --- | --- |
| `src/boring_semantic_layer/osi.py` | Core converter module (export + import) |
| `src/boring_semantic_layer/ops.py` | Added `ai_context` to `Dimension`/`Measure`; threaded through `_extract_measure_metadata`/`_make_base_measure` |
| `src/boring_semantic_layer/yaml.py` | Parse `ai_context` from BSL YAML configs |
| `src/boring_semantic_layer/__init__.py` | Export new functions |
| `src/boring_semantic_layer/tests/test_osi.py` | 48 tests: expression conversion, export, import, round-trips |
| `examples/flights_osi.yaml` | Flights example in OSI format |
| `docs/osi-compatibility.md` | Gap analysis document |

Gaps remaining for full OSI parity

| OSI Feature | Status | Notes |
| --- | --- | --- |
| `ai_context` at all levels | Done | On dimensions, measures, and top-level model |
| `primary_key` / `unique_keys` | Partial | `is_entity` maps to PK; `unique_keys` not yet modeled |
| Multi-dialect expressions | Partial | Exports as ANSI_SQL; multi-dialect input supported on import |
| `custom_extensions` | Done | Used for BSL-specific metadata round-trip |
| `label` on fields | Not yet | Low priority |
| Relationship join column extraction | Partial | Lambda predicates are hard to introspect |
| Complex expression translation | Partial | Handles common patterns (SUM, AVG, COUNT, etc.) |

Test plan

- 48 unit tests covering expression conversion, export, import, and round-trips
- Existing test suite passes (no regressions from the `ai_context` or `_extract_measure_metadata` changes)
- Manual test with real flights data loading the OSI example

🤖 Generated with Claude Code

Add bidirectional converter between BSL and OSI YAML format:

- to_osi() / to_osi_yaml(): Export BSL models to OSI-compliant YAML
- from_osi() / from_osi_yaml(): Import OSI YAML into BSL models
- ai_context field on Dimension and Measure for LLM metadata
- Expression translation between Ibis Deferred and SQL strings
- BSL-specific metadata preserved via OSI custom_extensions
- Round-trip tested: BSL->OSI->BSL and OSI->BSL->OSI

Closes #226

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Instead of a separate from_osi() conversion layer, from_config() now auto-detects OSI format (version + semantic_model keys) and parses it directly. This means from_yaml("model.osi.yaml") just works: BSL natively speaks OSI.

- OSI parsing logic moved from osi.py into yaml.py
- osi.py slimmed to export-only (to_osi/to_osi_yaml) + expression helpers
- from_osi/from_osi_yaml kept as thin aliases to from_config/from_yaml
- Removed from_osi/from_osi_yaml from top-level __init__.py exports
- Tests updated to use from_config for OSI import (the native path)
- Added format detection tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
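The format auto-detection described in this commit could look roughly like the sketch below. The function name `looks_like_osi` is hypothetical, and the check is reduced to the two keys the commit message names (`version` and `semantic_model`); the real logic in `yaml.py` may validate more than this.

```python
from typing import Any, Mapping


def looks_like_osi(config: Mapping[str, Any]) -> bool:
    """Return True when a parsed YAML mapping has the OSI top-level keys.

    Assumption: the presence of both "version" and "semantic_model" is
    enough to tell an OSI document apart from a BSL-native config.
    """
    return "version" in config and "semantic_model" in config


# An OSI-shaped document versus a BSL-shaped one (contents are made up).
osi_doc = {"version": "0.1.1", "semantic_model": []}
bsl_doc = {"flights": {"table": "flights_tbl", "dimensions": {}}}

print(looks_like_osi(osi_doc))
print(looks_like_osi(bsl_doc))
```

Keying detection on top-level structure rather than on the file name means a file such as `model.osi.yaml` and a plain `model.yaml` can go through the same loader.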

…olumn joins

Four fixes to make OSI import near-lossless:

1. primary_key -> is_entity: fields matching dataset.primary_key are automatically marked is_entity=True on import (no custom_extensions needed for standard OSI files)
2. Dataset-level ai_context: added to SemanticTableOp, threaded through to_semantic_table/SemanticModel/with_dimensions/with_measures. Stored as JSON string internally for ibis hashability, deserialized via get_ai_context(). Round-trips through to_osi export.
3. label on Dimension: new optional field, parsed on OSI import, emitted on export. Supports the OSI field.label categorization concept.
4. Multi-column relationship joins: all from_columns/to_columns pairs are now used to build compound join predicates, not just the first.

All 56 OSI tests pass. Export validates against official OSI JSON schema.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
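Two of these fixes are easy to picture in isolation. The sketch below shows why a structured `ai_context` might be stored as a JSON string (dicts are unhashable, strings are not) and how `from_columns`/`to_columns` could be paired into a compound join condition. The helper names `encode_ai_context`, `decode_ai_context`, and `column_pairs` are hypothetical and are not taken from the PR.

```python
import json
from typing import Any, Dict, List, Optional, Tuple, Union

AiContext = Union[str, Dict[str, Any]]


def encode_ai_context(ctx: Optional[AiContext]) -> Optional[str]:
    """Serialize ai_context to a JSON string so the stored value is hashable."""
    if ctx is None:
        return None
    # sort_keys makes equal dicts serialize to equal strings.
    return json.dumps(ctx, sort_keys=True)


def decode_ai_context(raw: Optional[str]) -> Optional[AiContext]:
    """Inverse of encode_ai_context: recover the original string or dict."""
    return None if raw is None else json.loads(raw)


def column_pairs(from_columns: List[str], to_columns: List[str]) -> List[Tuple[str, str]]:
    """Pair up join columns positionally, using every pair rather than the first."""
    if len(from_columns) != len(to_columns):
        raise ValueError("from_columns and to_columns must have the same length")
    return list(zip(from_columns, to_columns))


ctx = {"instructions": "Use for delay analysis", "synonyms": ["trips"]}
stored = encode_ai_context(ctx)
hash(stored)  # works, whereas hash(ctx) would raise TypeError
print(decode_ai_context(stored) == ctx)
print(column_pairs(["carrier", "year"], ["code", "year"]))
```

Each pair returned by `column_pairs` would correspond to one equality term, with all terms combined by AND to form the compound predicate.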

P1 fixes:

- Calculated measures: extract formula from closure/original_expr instead of emitting the metric name as a self-reference
- Join key export: introspect lambda predicates by evaluating against mock tables and walking the Equals expression tree to extract column names, instead of hardcoding ["unknown"]
- Unqualified metrics (COUNT(*)): only assign to the first dataset instead of duplicating across all datasets in multi-dataset imports

P2 fixes:

- Relationship cardinality: read from custom_extensions and use join_many() when cardinality is "many" instead of always join_one()
- Expression fallback: return None for non-trivial Ibis expressions instead of stripping "_." prefix which leaks method syntax as invalid SQL

P3 fixes:

- BSL YAML measure ai_context: pass extra_kwargs["ai_context"] through to Measure() constructor instead of silently dropping it

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
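The unqualified-metric fix can be sketched as follows. How the PR actually decides whether a metric references a dataset is not stated, so the `dataset.column` text match used here is an assumption, and `assign_metric` is a hypothetical helper.

```python
import re
from typing import List


def assign_metric(expression: str, dataset_names: List[str]) -> List[str]:
    """Decide which datasets a metric expression should be attached to.

    Assumption: a metric is "qualified" when its SQL text contains a
    "<dataset>." prefix. Unqualified metrics such as COUNT(*) go to the
    first dataset only, instead of being copied onto every dataset.
    """
    referenced = [
        name
        for name in dataset_names
        if re.search(rf"\b{re.escape(name)}\.", expression)
    ]
    if referenced:
        return referenced
    # dataset_names[:1] is the first dataset, or [] when there are none.
    return dataset_names[:1]


datasets = ["flights", "carriers"]
print(assign_metric("COUNT(*)", datasets))
print(assign_metric("SUM(carriers.fleet_size)", datasets))
```

Attaching an unqualified metric to a single dataset avoids the duplication the commit describes, at the cost of an arbitrary choice of dataset.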

boringdata and others added 4 commits April 2, 2026 07:22

hachej marked this pull request as draft April 7, 2026 11:40

hachej wants to merge 4 commits into main from feat/osi-yaml-compat

2 participants
