
refactor: replace pipeline with layered app runtime #1

Merged
ArietidsZ merged 4 commits into main from codebase-architecture-redesign on Mar 10, 2026

Conversation

@ArietidsZ (Owner)

Summary

  • replace the monolithic pipeline with package-level domain, application, infrastructure, and runtime composition layers
  • move FastAPI and core CLI commands onto shared use-case/runtime wiring while keeping thin root-level compatibility adapters
  • add layered regression tests, remove service_pipeline.py, and update docs/package metadata to match the new structure

Test Plan

  • pytest
  • ruff check .

Document the approved ports-and-use-cases end state so the codebase can refactor toward a stable, testable architecture.
Move API and CLI onto a shared composition root so runtime, media, RSS, and SQLite concerns flow through explicit use cases and adapters instead of a monolithic pipeline.
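The composition-root idea described above can be sketched as follows. This is a minimal illustration of the pattern, not the repository's actual code: all class and function names here (`AppRuntime`, `build_runtime`, and the stub adapters) are assumptions drawn from the PR description, which only confirms that use cases like `IngestFeed` are wired from infrastructure adapters in one place and shared by the API and CLI.

```python
from dataclasses import dataclass


class SqliteFeedVideoRepository:
    """Stands in for the SQLite persistence adapter (infrastructure layer)."""

    def save(self, item) -> None: ...


class HttpFeedSource:
    """Stands in for the HTTP/RSS fetching adapter (infrastructure layer)."""

    def fetch(self, url: str): ...


@dataclass
class IngestFeed:
    """Application-layer use case: depends on adapters, not on frameworks."""

    feed_source: HttpFeedSource
    repository: SqliteFeedVideoRepository


@dataclass
class AppRuntime:
    """Composition root: the single place adapters are wired to use cases."""

    ingest_feed: IngestFeed


def build_runtime() -> AppRuntime:
    # Both the FastAPI app and the CLI would call this once at startup,
    # so every entry point shares identical wiring.
    repository = SqliteFeedVideoRepository()
    feed_source = HttpFeedSource()
    return AppRuntime(ingest_feed=IngestFeed(feed_source, repository))
```

The payoff of this shape is that tests can build an `AppRuntime` with fake adapters instead of patching module globals in a monolithic pipeline.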

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c73cce470

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +67 to +70

    await self.process_source.execute(
        cast(str, entry.source_url), entry.title
    )
    processed_count += 1


P1: Count processed items only after successful processing

This increments processed_count unconditionally whenever process=True, but ProcessSource.execute returns a Failure object on exceptions instead of raising, so failed jobs are still counted as processed. In practice, /ingest can report successful processing for entries that never produced a summary, which corrupts ingest metrics and can cause downstream automation to treat failed runs as healthy.


@ArietidsZ (Owner, Author)


Fixed in ee3b02a. IngestFeed.execute() now increments processed_count only when process_source.execute(...) does not return Failure, and tests/application/test_ingest_feed.py now covers the failure path.
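The fix described in this reply can be sketched as below. It is a hedged reconstruction from the thread, not the repository's code: the `Failure` shape, the loop body, and the entry attributes are assumptions; the thread only confirms that `ProcessSource.execute` returns a `Failure` object on errors instead of raising, and that the counter must skip those results.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    """Error result returned (not raised) by ProcessSource.execute."""

    reason: str


class ProcessSource:
    async def execute(self, source_url: str, title: str):
        """Process one source; returns a summary or a Failure."""
        ...


async def ingest(entries, process_source: ProcessSource) -> int:
    processed_count = 0
    for entry in entries:
        result = await process_source.execute(entry.source_url, entry.title)
        # Only successful runs count toward ingest metrics; a Failure
        # result no longer inflates processed_count.
        if not isinstance(result, Failure):
            processed_count += 1
    return processed_count
```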

Comment on lines +33 to +35

        source_url=entry.source_url,
        published_at=None,
    )


P2: Preserve feed publication timestamps when saving feed items

Feed items are always persisted with published_at=None, which drops publication timestamps during ingest. Because RSS rendering only emits <pubDate> when published_at is set, ingested videos lose their item dates in generated feeds, degrading freshness/ordering signals for RSS clients that rely on publication time.


@ArietidsZ (Owner, Author)


Fixed in ee3b02a. FetchedFeedEntry now carries published_at, HttpFeedSource parses item publication dates, and SQLiteFeedVideoRepository persists them so downstream RSS publication reads retain pubDate. Added regression coverage in the feed-source and SQLite repository tests.

Keep ingest metrics and publication metadata accurate across review feedback, and make the affected regression tests platform-safe so CI reflects real behavior.

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee3b02aa00


Comment on lines +45 to +46

    published_parsed = entry.get("published_parsed")
    if published_parsed is None:


P2: Preserve updated timestamps when published is absent

_pick_published_at only checks published/published_parsed, so feeds that expose timestamps via updated/updated_parsed (common in Atom) now lose item dates during ingest. That causes generated RSS items to omit <pubDate> for those sources, which can break chronological ordering and freshness signals in RSS readers.


@ArietidsZ (Owner, Author)


Fixed in 689c760. feed_source._pick_published_at(...) now falls back to updated / updated_parsed when published fields are absent, and tests/infrastructure/test_feed_source.py now covers the Atom updated path.

    if published_parsed is None:
        return None

    return datetime(*published_parsed[:6], tzinfo=timezone.utc)


P2: Guard published_parsed conversion against malformed dates

The direct call to datetime(*published_parsed[:6], ...) can raise TypeError/ValueError for malformed or partial date tuples, and that exception currently bubbles out of feed ingestion. In practice, a single bad timestamp in one entry can fail the whole /ingest request instead of just dropping that entry’s publication date.


@ArietidsZ (Owner, Author)


Fixed in 689c760. Parsed tuple conversion is now guarded in _parse_datetime_tuple(...), so malformed published_parsed / updated_parsed values are ignored instead of failing the whole ingest. Added regression coverage in tests/infrastructure/test_feed_source.py.

Preserve item dates from updated timestamps and ignore malformed parsed tuples so feed ingestion keeps working across mixed RSS and Atom metadata.
@ArietidsZ merged commit bc857c1 into main on Mar 10, 2026
2 checks passed