fix: classifier-off single-type fallback + request-scope transform rows #31

Merged
ancongui merged 2 commits into main from fix/classifier-fallback-and-transform-rows on Jun 12, 2026

@ancongui

Two robustness fixes surfaced while wiring structured array fields + cross-document consolidation through the IDP layer.

1. Classifier off + no expected_type silently produced zero documents

When stages.classifier was off and a file carried no expected_type, the segment stayed unmatched and the file yielded no document — with no error. Callers that disable the classifier for a known single type (the common IDP path) silently got nothing back.

A single-row file now defaults to the sole declared document_type in that case, mirroring the single-candidate shortcut the classifier step itself takes (_step_classifier, len(document_types) == 1).
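
The fallback described above can be sketched as follows. This is a minimal illustration, not the project's code: `Segment`, `resolve_type`, and the `row_count` parameter are hypothetical names, and the exact meaning of "single-row file" in the pipeline is assumed here to be a simple count of 1.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Segment:
    # Hypothetical stand-in for the pipeline's segment record.
    expected_type: Optional[str] = None  # type pinned by the caller, if any
    document_type: Optional[str] = None  # None means the segment is unmatched


def resolve_type(
    segment: Segment,
    document_types: Sequence[str],
    classifier_enabled: bool,
    row_count: int,
) -> Optional[str]:
    """Return the document type for a segment, or None if it stays unmatched."""
    # A type pinned by the caller always wins.
    if segment.expected_type is not None:
        return segment.expected_type
    # With the classifier on, keep whatever it decided (possibly nothing).
    if classifier_enabled:
        return segment.document_type
    # Classifier off and no pinned type: fall back only when the choice is
    # unambiguous, i.e. one row and exactly one declared document type.
    if row_count == 1 and len(document_types) == 1:
        return document_types[0]
    return None
```

With one declared type, `resolve_type(Segment(), ["deed"], classifier_enabled=False, row_count=1)` returns `"deed"`; with two declared types the segment stays unmatched, since there is no single candidate to default to.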

2. Request-scope LLM transformation returned empty rows

The transformer's output model wrapped each row under a values key (_TransformRow.values), but transform.yaml instructs the model to emit flat {field: value} rows. So the structured output never matched and every consolidated row came back empty — result.request_transformations was unusable for scope: request consolidation (e.g. a cap table merged across deeds).

The output row is now a flat dict[str, Any], matching the prompt 1:1; _rebuild_rows reads it directly.
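
To illustrate the shape mismatch, here is a small sketch using plain dicts rather than the transformer's actual structured-output model. The helper names `rebuild_rows_wrapped` and `rebuild_rows_flat` and the sample field names are hypothetical, and the way the real library surfaced the mismatch may differ from this simplification.

```python
from typing import Any


def rebuild_rows_wrapped(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Old assumption: each row is wrapped as {"values": {field: value}}.
    # A flat row has no "values" key, so every row collapses to {}.
    return [dict(row.get("values", {})) for row in rows]


def rebuild_rows_flat(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # New assumption: each row already is a flat {field: value} mapping.
    return [dict(row) for row in rows]


# Rows in the flat shape that the prompt asks the model to emit
# (field names are made up for this example).
llm_rows = [
    {"shareholder": "A", "shares": 100},
    {"shareholder": "B", "shares": 50},
]

wrapped_result = rebuild_rows_wrapped(llm_rows)  # rows come back empty
flat_result = rebuild_rows_flat(llm_rows)        # rows are preserved
```

Reading flat rows through the wrapped accessor yields one empty dict per row, which matches the symptom described above; reading them as flat dicts keeps every field.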

Verification

Reproduced live: a scope: request LLM transformation over several escrituras (deeds) with a cap_table array group now returns a populated, consolidated cap table under request_transformations (one row per current shareholder with NIF, participaciones, %, clase). Before, the rows were empty.

Release 26.6.5.

Andrés Contreras Guillén added 2 commits June 12, 2026 21:50
Two robustness fixes surfaced while wiring structured array fields + cross-document
consolidation through the IDP:

1. classifier off + no expected_type silently extracted nothing. When the classifier
   stage is disabled and the caller pins no type, a single-row file now defaults to
   the sole declared document_type instead of leaving the segment `unmatched` (which
   produced zero documents with no error). Mirrors the classifier's own single-
   candidate shortcut.

2. request-scope LLM transformations returned empty rows. The transformer output
   wrapped each row under a `values` key while the prompt emits flat {field: value}
   rows, so the structured output never matched and consolidated rows came back
   empty. The row is now a flat dict, matching the prompt — result.request_transformations
   carries populated rows (e.g. a cap table consolidated across deeds).

Release 26.6.5.
@ancongui ancongui merged commit bbe2951 into main Jun 12, 2026
8 checks passed