Decompose CollectionDAO & EntityRepository; fix lineage/usage/isBot bugs#28778
Decompose CollectionDAO & EntityRepository; fix lineage/usage/isBot bugs#28778harshach wants to merge 15 commits into
Conversation
…e/usage/isBot bugs Decompose the two backend god-classes into cohesive, single-responsibility collaborators (behavior-preserving) and fix several DAO-layer perf/correctness bugs. CollectionDAO (15,040 -> 433 lines): split into 17 same-package domain aggregator interfaces it `extends` (composite-interface-inheritance, zero call-site change): RdfInfraDAOs, SearchReindexDAOs, AiGovernanceDAOs, FeedDAOs, ClassificationTagDAOs, TimeSeriesDAOs, ActivityAuditDAOs, GovernanceDAOs, EventSubscriptionDAOs, KnowledgeAssetDAOs, SystemTokenDAOs, DataAssetServiceDAOs, EntityDataDAOs, AccessControlDAOs, WorkflowDocStoreDAOs, OAuthDAOs, CoreRelationshipDAOs. Added CollectionDAOCompositionTest guarding the JDBI wiring of inherited accessors. EntityRepository (12,797 -> ~9,950 lines): extracted 7 collaborators with delegators preserving the public/protected API for 81 subclasses: EntityTaskWorkflows, EntityCacheLoaders, EntityCaches, EntityCacheInvalidator, InheritanceParentCache, BulkFieldFetcher, BulkImportService. Perf/correctness fixes: - Lineage findTo/FromPipeline operator-precedence bug (restores the relation filter) - entity_usage(entityType,usageDate) index for UsageDAO.computePercentile (2.0.0 migration) - Postgres isBot generated column read $.deleted instead of $.isBot (2.0.0 migration); bots were counted as daily-active users on Postgres - getMaxLastActivityTime / user-list isBot filters use the indexed columns - deleted dead, malformed EntityExtensionDAO.update Full openmetadata-service test suite passes (5,800 tests, 0 failures). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
✅ PR checks passedThe linked issue has a description and all required Shipping project fields set. Thanks! |
Resolve CollectionDAO.java conflict from the CollectionDAO split refactor: - Port #28752 (chunk IN-list batch queries under the DB parameter ceiling) onto the moved DAOs: EntityExtensionDAO/EntityRelationshipDAO now live in CoreRelationshipDAOs; TagUsageDAO in ClassificationTagDAOs; UsageDAO in AccessControlDAOs; DataQualityDataTimeSeriesDAO/TestCaseResultTimeSeriesDAO in TimeSeriesDAOs. EntityDAO.queryInChunks/updateInChunks (auto-merged) drive the chunking; F1 dead-code delete and P1 lineage parenthesization preserved in EntityExtensionDAO/EntityRelationshipDAO. - Add IntakeFormDAO from #27600 (ODPS/Custom Forms) to the composite CollectionDAO (accessor + interface + IntakeForm import). - Redirect InListChunkingTest imports to the new declaring interfaces. Service module compiles (main + test); InListChunkingTest (16) and CollectionDAOCompositionTest (2) pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nup (E1) A3 — getContractByEntityId full-scanned data_contract_entity filtering nested $.entity.id / $.entity.type via JSON_EXTRACT on the per-entity contract enforcement write path. Add VIRTUAL (MySQL) / STORED (Postgres) generated columns contractEntityId / contractEntityType + a composite index, and rewrite the query to seek on those columns plus the existing indexed `deleted` column. Validated on MySQL 8.0 + Postgres 16: generated columns populate, query returns only non-deleted matches, EXPLAIN shows an index seek (ref / Index Scan), and the guarded migration is idempotent. E1 — TagUsageCleanup.performCleanup walked the full tag_usage table with offset += batchSize (O(n^2) deep-offset). Switch to keyset pagination on the unique (source, tagFQNHash, targetFQNHash) key. Per-engine SQL: MySQL needs the expanded-OR form (its optimizer makes it an index range seek; a row-constructor comparison scans the whole index from the start — verified Handler_read_next), Postgres seeks optimally with the row-constructor (Index Only Scan). Validated on both engines: a batchSize=7 drive over 300 rows covers all 300 exactly once with zero duplicates. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (C1) FieldRelationshipDAO was the one high-volume relationship DAO with no batch insert, so FeedRepository.storeMentions and TaskRepository.storeMentions fired one round trip per distinct mention. Add insertMany(List<FieldRelationship>) mirroring EntityRelationshipDAO.bulkInsertTo: a @BindBeanList multi-row VALUES with INSERT IGNORE (MySQL) / ON CONFLICT (fromFQNHash, toFQNHash, relation) DO NOTHING (Postgres), omitting the nullable json column so no per-row ::jsonb cast is needed. Both mention callers now build the bean list (pre-hashing FQNs via FullyQualifiedName.buildHash — the same hash @BindFQN applies) and issue one insertMany; empty lists are guarded so the VALUES clause is never empty. Pipeline task-owner inserts are intentionally left on the single-row path: they pass buildHash(fqn) into a @BindFQN param (a separate double-hash concern), so a bean-bound batch would change the stored hash — out of scope here. Validated on MySQL 8.0 + Postgres 16: a multi-row batch inserts all distinct rows, a follow-up batch with duplicate keys is ignored with no error, and json stays NULL — matching the single-row insert semantics. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… 2 finish) The CollectionDAO split left ~12 cross-domain row/mapper carriers (system entity/service counts + pipeline reporting trends/metrics) inside the composite, referenced as CollectionDAO.X from the aggregators and PipelineRepository — the last thing keeping CollectionDAO from being a thin accessor list. Move them into a new leaf interface SharedRowMappers and redirect all 21 references to their canonical SharedRowMappers.X name (no CollectionDAO dependency, breaking the aggregator->CollectionDAO type cycle). CollectionDAO drops 470 -> 150 lines (now just the extends clause, 4 accessors, IntakeFormDAO, AssetDAO). Pure type relocation, no behavior/SQL change; compiles main+test, CollectionDAOCompositionTest passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…CollectionDAO/EntityRepository split The CollectionDAO/EntityRepository decomposition moved member types into aggregator interfaces and static caches into EntityCaches, but the redirect never covered the openmetadata-integration-tests module, breaking its compile: - SearchIndexRetryQueueIT imported CollectionDAO.SearchIndexRetryQueueDAO; that type now lives in SearchReindexDAOs, and a single-type import requires the canonical declaring interface (inherited member types can't be imported via a subtype). Repointed both imports to SearchReindexDAOs. - EntityCacheInvalidationIT referenced EntityRepository.CACHE_WITH_ID; that field moved to EntityCaches (static fields aren't inherited via the subclass name). The remaining "cannot find symbol: log" errors CI reported were Lombok cascade fallout from these two — they clear once the real errors are fixed. Integration -tests module now compiles (BUILD SUCCESS). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…llow-up) An adversarial audit (proven on a live MySQL container) found the E1 keyset cleanup could silently skip rows: tagFQNHash/targetFQNHash are nullable, and once the cursor lands on a NULL-hash row (MySQL sorts NULLs first, so this is reached early), every subsequent `tagFQNHash > NULL` / `= NULL` / row-constructor compare is UNKNOWN, the batch returns empty, and the scan terminates — skipping all trailing rows. The two engine forms also diverge on NULL ordering. Fix: exclude NULL hashes from the keyset scan (AND tagFQNHash IS NOT NULL AND targetFQNHash IS NOT NULL) so the cursor only ever holds real keys. This keeps the index seek (verified: MySQL type=range, Postgres Index Only Scan) and makes both engines scan an identical row set regardless of their NULL ordering. The rare malformed NULL-hash rows (no write path produces them) are swept once via a new bounded getTagUsagesWithNullHash query so the cleanup tool still sees them. Validated on MySQL 8.0 + Postgres 16: with NULL rows present, a batchSize=2 walk now covers all non-NULL rows (0 dropped, 0 dupes) and the NULL sweep catches the malformed rows; the pre-fix walk terminated early. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mPropertyValidator An adversarial architecture review of the remaining EntityRepository clusters found the big ones (EntityUpdater, inheritance/lifecycle template hooks, CSV stubs) must stay — they are polymorphic seams 20-63 subclasses override, and extracting them would force pervasive r.hook() callbacks. The one genuinely clean, callback-free win all three reviewers named independently: the static extension/custom-property validators (~210 lines), which carry zero entity-instance state and whose siblings (validateCustomPropertyEntityReference[List]) already live in EntityUtil. Move validateExtension(Object,String), validateAndTransformExtension, validateHyperlinkUrl, getFormattedDateTimeField, validateTableType, validateEnumKeys to a new org.openmetadata.service.util.CustomPropertyValidator. EntityRepository keeps the instance validateExtension(T,boolean) create/update hook, now delegating to it. Redirected the two external callers (ColumnRepository, EntityCsv) and dropped the two now-unused static imports. EntityRepository -213 lines; pure move, no behavior change. Service (main+test) and integration-tests modules both compile. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… (A9) deleteReportDataTypeAtDate's Postgres form recomputed DATE(TO_TIMESTAMP((json->>'timestamp')::bigint/1000)) per row — non-sargable, a full seq scan across all report types. Replace both engine forms with a single half-open range over the indexed `timestamp` column: WHERE entityFQNHash = :reportDataType AND timestamp >= :startTs AND timestamp < :endTs The caller derives [startTs, endTs) from the yyyy-MM-dd date in UTC via the existing TimestampUtils helpers — matching how the date string and the Elasticsearch cleanup (doc['timestamp'].toLocalDate()) are already built, so semantics are preserved while the query becomes identical and index-backed on both engines. Container-validated on MySQL 8.0 (EXPLAIN type=range) and Postgres 16 (Index Scan using idx_report_data_ts_keyset): exactly the target UTC day is deleted; the preceding-day-23:59, next-day-00:00, and other-entity boundary rows are preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ers (B4) The bulk field fetchers batched their relationship records but then resolved each EntityReference one row at a time via Entity.getEntityReferenceById — an N+1 over a page of users/teams (teams, roles, personas, default/inherited personas, domains; team users, defaultRoles, defaultPersona, parents, policies). Add a shared EntityRepository.batchResolveRefs(entityType, ids) that resolves a homogeneous id set in one getEntityReferencesByIds call and returns an id->ref map, and route all 13 single-type fetchers through it. The mixed-type owns/follows fetchers keep their per-row resolution: they resolve heterogeneous entity types with defensive per-row try/catch for dangling owned/followed entities, which a batched find() (throws on the first missing id) would regress. Full service suite green (5,892 tests, 0 failures). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve 2 conflicts from the CollectionDAO/EntityRepository split meeting two upstream fixes: - #28771 (server audit entries to audit.log) changed the audit-log DAO insert to return int; that DAO moved to ActivityAuditDAOs in this branch, so ported the void->int signature there (AuditLogRepository now reads the affected-row count). Took the restructured CollectionDAO (--ours). - #28720 (lost-update race fix in PolicyConditionUpdater) rewrote the inline rewrite loop into rewriteMatchingPolicies/rewriteSinglePolicy/casWritePolicy with a FOR UPDATE + CAS retry. Took the upstream rewrite and re-applied this branch's cache-invalidation redirect (EntityRepository.invalidateCacheForEntity -> EntityCacheInvalidator.invalidateCacheForEntity) inside casWritePolicy; kept both the EntityDAO (upstream) and EntityCacheInvalidator (ours) imports. Service (main+test) and integration-tests compile; CollectionDAOCompositionTest, AuditLogRepositoryTest (3), and PolicyConditionUpdaterTest (41) pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…broke fresh installs
CI integration tests (MySQL: 220 errors, Python ometa: 8 failures) failed with
"Unknown column 'contractEntityId'" from DataContractDAO.getContractByEntityId.
The A3 optimization rewrote that query to read VIRTUAL/STORED generated columns
added by the 2.0.0 migration, but the columns are absent in the fresh-install /
test-provisioned schema (the MySQL guarded ADD COLUMN did not materialize them),
so every getContractByEntityId call — including BaseEntityIT.get_entityDataContract_200
run for every entity type — threw a SQL syntax error.
Restore the original JSON-path query (JSON_EXTRACT / json#>>'{...}') that works on
any schema, and drop the generated-column + index migration blocks from both
2.0.0 schemaChanges.sql files. data_contract_entity is a small write-path table,
so the lost index seek is negligible; correctness on fresh installs matters more.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ions
deleteLineage and deleteLineageByFQN emit an ENTITY_LINEAGE_DELETED change event,
but deleteLineageBySource (the DELETE .../lineage/{entityType}/{entityId}/{lineageSource}
endpoint, used to drop all pipeline/OpenLineage edges into an entity) deleted the
relations without emitting any event, so consumers/alerts subscribed to lineage-deleted
events silently missed source-based bulk deletions.
Emit ENTITY_LINEAGE_DELETED for every removed relation (reusing the already-loaded
relations and resolveRefForCacheInvalidation), threading the authenticated principal
through as deletedBy from the resource endpoint.
Addresses PR #28778 review finding #2. Service main+test compile; LineageRepositoryTest
(18) passes.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code Review
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
|



Describe your changes:
Fixes #28785
I decomposed the two backend "god-classes" —
CollectionDAO(15,040 → 433 lines) andEntityRepository(12,797 → ~9,950 lines) — into cohesive, single-responsibility collaborators with no behavior change, and fixed several DAO-layer perf/correctness bugs found along the way.CollectionDAOis now a thin composite thatextends17 same-package domain aggregator interfaces (composite-interface-inheritance — zero call-site change), andEntityRepositorydelegates to 7 extracted collaborators while keeping its public/protected API intact for all 81 subclasses. The wholeopenmetadata-servicetest suite passes (5,800 tests, 0 failures), and a newCollectionDAOCompositionTestguards the JDBI wiring of the inherited@CreateSqlObjectaccessors. Each extraction was validated against the full suite, one collaborator at a time.Type of change:
High-level design:
CollectionDAO split (17 interfaces):
RdfInfraDAOs,SearchReindexDAOs,AiGovernanceDAOs,FeedDAOs,ClassificationTagDAOs,TimeSeriesDAOs,ActivityAuditDAOs,GovernanceDAOs,EventSubscriptionDAOs,KnowledgeAssetDAOs,SystemTokenDAOs,DataAssetServiceDAOs,EntityDataDAOs,AccessControlDAOs,WorkflowDocStoreDAOs,OAuthDAOs,CoreRelationshipDAOs. JDBI'sSqlObjectFactorydiscovers inherited@CreateSqlObjectaccessors viagetMethods(), sodaoCollection.xDAO()andCollectionDAO.Xreferences resolve unchanged; only member-type import statements were repointed to the new declaring interface (compiler-enforced).EntityRepository collaborators (7):
EntityTaskWorkflows,EntityCacheLoaders,EntityCaches,EntityCacheInvalidator,InheritanceParentCache,BulkFieldFetcher,BulkImportService. Each holds a back-reference to the repository where needed and is reached via thin delegators, preserving overridable hooks and the protected API for the 62EntityUpdatersubclasses / 81 repositories.EntityUpdateritself was intentionally left in place (its 62-subclass inheritance coupling warrants a separate, dedicated refactor).Bug/perf fixes: lineage
findTo/FromPipelineoperator-precedence bug (therelationfilter was being dropped);entity_usage(entityType, usageDate)index forUsageDAO.computePercentile; PostgresisBotgenerated column was reading$.deletedinstead of$.isBot(bots were counted as daily-active users);getMaxLastActivityTime/user-list isBot filters now use indexed columns; removed dead, malformedEntityExtensionDAO.update. Migrations are idempotent and live underbootstrap/sql/migrations/native/2.0.0/{mysql,postgres}/schemaChanges.sql.Backward-compat: no public API or call-site changes; cache fields/inheritance behavior preserved via delegation; DB migrations are
IF NOT EXISTS/guarded and validated on MySQL 8.0 + Postgres 16.Tests:
Unit tests
CollectionDAOCompositionTest(verifies every moved@CreateSqlObjectaccessor stays visible + annotated — the mechanism JDBI relies on at runtime).Backend integration tests
openmetadata-servicesuite (5,800 tests) passes green and exercises the refactored DAO/repository layer end-to-end.Ingestion / Playwright tests
Manual testing performed
EXPLAINconfirms the percentile query now does an index-only scan; the PostgresisBotfix flips a bot from counted→excluded incountDailyActiveUsers.UI screen recording / screenshots:
Not applicable.
Checklist:
Fixes #28785above.entity_usageindex and PostgresisBotcolumn fix.🤖 Generated with Claude Code