Skip to content

fix: enrich missing engine connection field information from plugin definitions#53

Closed
wForget wants to merge 1 commit into
datahub-project:mainfrom
wForget:analytics-agent-52
Closed

fix: enrich missing engine connection field information from plugin definitions#53
wForget wants to merge 1 commit into
datahub-project:mainfrom
wForget:analytics-agent-52

Conversation

@wForget
Copy link
Copy Markdown
Contributor

@wForget wForget commented May 15, 2026

Motivation

Currently, Hive data source fields are not displayed on the Data Sources settings page.

closes #52

Changes

  • Updated analytics_agent.api.settings.list_connections to pass through all conn_cfg.items
  • Enrich connection fields using plugin fields definitions

Testing

Before this:
image

After this:
image

@wForget wForget marked this pull request as draft May 15, 2026 08:34
@wForget wForget marked this pull request as ready for review May 15, 2026 08:42
@wForget wForget force-pushed the analytics-agent-52 branch from 058e024 to 7f6327a Compare May 15, 2026 09:11
Copy link
Copy Markdown
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the careful diagnosis and the patch — the root cause analysis in #52 is exactly right, and the screenshots make the user impact clear. After tracing through what the new frontend .map() does for every engine plugin, though, I think we need to take a different approach before merging. Putting this in REQUEST_CHANGES not as a hard reject — your work is the foundation here — but because as written it introduces three silent regressions and an inconsistent end-state.

Regressions in this change

1. BigQuery credentials_json loses its sensitivity flag and secret_key

The backend currently sends credentials_json as sensitive=True, secret_key="credentials_json" (settings.py:684-691). The new frontend rebuild at SettingsModal.tsx:983-998 re-derives sensitive as pf.type === "password" — but in bigquery.tsx credentials_json has type: "json". So sensitive flips to false and secret_key becomes "". Service-account JSON could render unmasked, and form submission may stop routing it as a secret.

2. URL-form MySQL / Postgres / SQLite connections break

When a SQL connection is configured via url=..., the backend returns a single [url] ConnectionField (settings.py:700-709). But the plugin FIELDS arrays for mysql/postgresql/sqlite don't include a url key — they're [host, port, database, user, password]. The rebuild discards the backend's url field and replaces it with five empty fields. A working URL-configured database appears unconfigured in the UI.

3. Snowflake is on a different code path than the rest

snowflakePlugin exports fields: [] because the form is custom, so the rebuild is skipped for snowflake but applied for every other engine. We end up with two display models in the same listing — backend-driven for snowflake, frontend-rebuilt for everyone else. Same goes for custom-mcp, which now receives backend-side conn_cfg.items() with label=None and renders them with no label at all.

A backend-driven alternative

The backend already has the right abstraction. engines/factory.py::ConnectorSpec (the dataclass at line 12) and _CONNECTOR_MAP (line 78) describe every connector's env_map, secret_env_vars, required_keys, and credential_keys. Hive is already fully registered there (factory.py:100-116). The reason list_connections falls into the empty else for hive is just that the registry lacks display metadata — there's no asymmetry of capability, only of schema.

If we extend ConnectorSpec with a display_fields: list[DisplayField] (each with key, label, placeholder, sensitive, secret_key) and populate it for snowflake / bigquery / hive, then list_connections collapses to a single generic block driven by the spec. The frontend renders verbatim, no client-side rebuild needed. Concretely:

  • Hive renders correctly (closing #52) because its spec now has display metadata.
  • BigQuery keeps its sensitivity and secret_key.
  • URL-form SQL connections are unaffected (the SQLAlchemy branch in list_connections stays as-is because its url vs host+port+db+user split doesn't fit the flat field model — we can model that later if we want).
  • Adding a new connector becomes a one-file change: append to _CONNECTOR_MAP.
  • ConnectionPlugin.fields and label: string | null aren't needed.

Suggested path

I'm happy to push a follow-up PR with that approach — your repro and field schema for Hive in plugins/hive.tsx are exactly what I'd port into the backend spec, so the work isn't redundant. Alternatively, if you'd rather drive it, the changes are localized to engines/factory.py (extend ConnectorSpec, add display_fields) and api/settings.py (collapse the per-type branches). Let me know which you'd prefer.

Thanks again for catching this and writing it up so clearly.

@wForget
Copy link
Copy Markdown
Contributor Author

wForget commented May 19, 2026

@shirshanka Thank you for your detailed reply. The backend-driven approach sounds good to me. It would be even better if you could send a PR to improve it, and feel free to abandon this PR.

@shirshanka
Copy link
Copy Markdown
Contributor

Thanks again @wForget — really appreciate the openness to switching approaches. I've opened #54 with the backend-driven version we discussed, which closes #52.

Closing this one in favor of #54. Could you give it a try against your Hive setup once it merges (or pull the branch fix/connector-display-fields locally) and let us know if the Data Sources page now renders correctly for you? Your repro is the clearest signal we have that #52 is fully fixed.

@shirshanka shirshanka closed this May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(frontend): The hive data source fields are not displayed on the Data Sources settings page

2 participants