This document describes the Python backend architecture and the contracts we rely on for:
- Distributing business logic and state management consistently.
- Keeping the codebase readable and easy to navigate.
- Preserving high-confidence testability through integration-style tests.
At runtime, the backend is a local FastAPI server. Endpoints are intentionally thin and delegate all work to a
single, shared AppHandler instance which owns:
- RuntimeConfig: constant environment configuration (treat as immutable).
- AppState: a centralized, typed, mutable state object.
- A shared lock used for safe state access/mutation under concurrency.
- A collection of domain-specific sub-handlers (the “business logic modules”).
High-level request flow:

    HTTP request
      -> _routes/* (FastAPI endpoint function; minimal logic)
        -> AppHandler (injected via FastAPI Depends)
          -> handlers/* (domain logic + state transitions)
            -> services/* (side effects boundary: GPU, IO, network, etc.)
            -> state/* (AppState read/mutation under a shared lock)

- ltx2_server.py is the runtime composition root:
  - Builds RuntimeConfig.
  - Builds the AppHandler via build_initial_state(...).
  - Creates the FastAPI app via create_app(handler=...).
  - Starts uvicorn.
- app_factory.py is the FastAPI app factory (importable from tests):
  - Registers exception handlers and CORS.
  - Includes routers from _routes/.
  - Calls state.init_state_service(handler) so routes can depend on the shared AppHandler.
- state/deps.py holds the FastAPI dependency hook:
  - get_state_service() returns the shared AppHandler.
  - Tests override it via set_state_service_for_tests(...).
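
The dependency hook can be pictured as a small module-level holder. The sketch below is an assumption about the general shape of state/deps.py, not its actual contents: only the names init_state_service, get_state_service, and set_state_service_for_tests come from this document, while the _handler variable, the error message, and the stand-in AppHandler class are hypothetical.

```python
from typing import Optional


class AppHandler:
    """Stand-in for the real AppHandler (hypothetical, for illustration only)."""


# Module-level slot holding the single shared handler (hypothetical name).
_handler: Optional[AppHandler] = None


def init_state_service(handler: AppHandler) -> None:
    """Called once by the app factory to register the shared handler."""
    global _handler
    _handler = handler


def get_state_service() -> AppHandler:
    """Dependency hook: routes would receive this via Depends(get_state_service)."""
    if _handler is None:
        raise RuntimeError("AppHandler has not been initialized")
    return _handler


def set_state_service_for_tests(handler: AppHandler) -> None:
    """Test-only override so each test can install its own handler."""
    global _handler
    _handler = handler
```

Because routes only ever see get_state_service, a test can swap the handler without touching any route code.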
Routes live in _routes/ and define the HTTP API (paths, request parsing, response models).
Contract
- Routes should be “plumbing”: parse typed inputs, call one handler method, return a typed output.
- Business logic and state mutation belong in handlers/, not in _routes/.
- Requests/responses should be strictly typed via Pydantic models from api_types.py.
Example pattern (using the stable health endpoint):

    @router.get("/health", response_model=HealthResponse)
    def route_health(handler: AppHandler = Depends(get_state_service)) -> HealthResponse:
        return handler.health.get_health()

AppHandler (in app_handler.py) is the single object routes depend on. It is responsible for:
- Owning the single AppState instance (centralized mutable state).
- Owning a single shared lock used to protect state access/mutation.
- Wiring sub-handlers and injecting them with:
  - state + lock
  - RuntimeConfig where relevant
  - Services required for side effects
AppHandler composes its sub-handlers as attributes (e.g. handler.health, handler.models, handler.downloads, ...).
Each sub-handler is responsible for one cohesive domain of operations.
To keep tests lightweight and imports safe:
- build_default_service_bundle(...) performs lazy imports of heavy runtime implementations.
- Tests pass a ServiceBundle containing fakes instead of importing GPU/network implementations.
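
A minimal sketch of how a bundle with lazy imports might look. The single http field, the get_text method, and the use of urllib.request as the "heavy" dependency are assumptions made for illustration; the real ServiceBundle and build_default_service_bundle(...) signature are not shown in this document.

```python
from dataclasses import dataclass
from typing import Protocol


class HTTPClient(Protocol):
    """Service interface (the method name is a hypothetical example)."""

    def get_text(self, url: str) -> str: ...


@dataclass(frozen=True)
class ServiceBundle:
    """Container for side-effect services (fields are illustrative)."""

    http: HTTPClient


def build_default_service_bundle() -> ServiceBundle:
    """Build the real bundle; the runtime dependency is imported only here."""
    # Lazy import: importing this module alone never loads urllib.request,
    # which stands in for a heavy GPU or network library.
    import urllib.request

    class HTTPClientImpl:
        def get_text(self, url: str) -> str:
            with urllib.request.urlopen(url) as resp:
                return resp.read().decode("utf-8")

    return ServiceBundle(http=HTTPClientImpl())


class FakeHTTPClient:
    """Test fake: serves canned pages from memory, no network access."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def get_text(self, url: str) -> str:
        return self._pages[url]
```

A test would construct ServiceBundle(http=FakeHTTPClient({...})) directly and never call build_default_service_bundle().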
AppState lives in state/app_state_types.py and is the canonical model of mutable runtime state.
Design goals
- Keep the state “normalized”: represent important state machines explicitly via unions rather than loose dicts.
- Maximize static integrity: prefer Literal, Enum, Protocol, and union types over dynamic runtime checks.
- Minimize runtime dependencies from state types (use TYPE_CHECKING + lightweight structural types where needed).
In practice, a large portion of the state is about limited resource management (e.g. downloads, which pipeline owns the GPU, which generation is running, etc.).
Example: a state machine represented as a union of small dataclasses:

    GenerationState = GenerationRunning | GenerationComplete | GenerationError | GenerationCancelled

This enables exhaustive matching and makes illegal states harder to represent.
This backend is optimized for a single local client with heavy requests (GPU/CPU work), rather than a high-QPS, multi-tenant server.
Most endpoints are defined as synchronous def route handlers, which FastAPI/Starlette executes via a thread pool.
That means multiple requests can overlap in time even in a “single client” setting (e.g. progress polling, cancels,
downloads, settings updates, etc.).
Because requests can overlap, every state access or mutation must be treated as concurrent. The shared lock exists to prevent race conditions and torn reads/writes.
Rules of thumb
- Any AppState read/write that influences decisions should happen under the shared lock (especially read-modify-write).
- Prefer handler methods decorated with handlers.base.with_state_lock for consistent locking.
- Do not hold the lock while doing heavy compute or slow IO.
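
A locking decorator of this kind can be sketched as follows. This is not the actual handlers.base.with_state_lock implementation: the _lock attribute name and the CounterHandler class are hypothetical, and the sketch uses a reentrant lock so that one locked method can call another, which is an assumption about the real lock.

```python
import functools
import threading
from typing import Any, Callable, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def with_state_lock(method: F) -> F:
    """Run the wrapped method while holding the handler's shared lock."""

    @functools.wraps(method)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        with self._lock:  # `_lock` is a hypothetical attribute name
            return method(self, *args, **kwargs)

    return wrapper  # type: ignore[return-value]


class CounterHandler:
    """Toy handler; real handlers receive state and lock from AppHandler."""

    def __init__(self, lock: Any) -> None:
        self._lock = lock
        self._count = 0

    @with_state_lock
    def increment(self) -> int:
        # The whole read-modify-write runs under the lock.
        self._count += 1
        return self._count
```

A decorator suits short, pure state operations. Methods that mix state access with heavy work need explicit lock scopes instead, since decorating them would hold the lock for the entire call.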
The most fragile point when implementing handlers is choosing the locking scope correctly:
- Locking too little risks inconsistent state transitions.
- Locking too much can serialize the whole server and block other endpoints for long periods.
When heavy work is involved, prefer this pattern:
- Lock → read/validate state + compute a small “plan” / snapshot.
- Unlock → perform heavy compute / IO using the snapshot.
- Lock → re-check that the state is still compatible, then apply mutations.
Never assume the state stayed the same across lock → heavy work → lock.
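
The three phases above can be sketched as follows. AppState here is a tiny stand-in with hypothetical fields, and GenerationHandler, generate, cancel, and _render are illustrative names rather than the project's real handler API.

```python
import threading
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AppState:
    """Tiny stand-in for the real AppState (fields are hypothetical)."""

    active_job_id: Optional[str] = None
    result: Optional[str] = None


class GenerationHandler:
    def __init__(self, state: AppState, lock: Any) -> None:
        self._state = state
        self._lock = lock

    def cancel(self) -> None:
        with self._lock:
            self._state.active_job_id = None

    def generate(self, job_id: str, prompt: str) -> bool:
        # Phase 1 (locked): validate state and claim the resource.
        with self._lock:
            if self._state.active_job_id is not None:
                return False  # another job already owns the resource
            self._state.active_job_id = job_id

        # Phase 2 (unlocked): heavy work touches only local data.
        output = self._render(prompt)

        # Phase 3 (locked): re-check before applying the result.
        with self._lock:
            if self._state.active_job_id != job_id:
                return False  # cancelled or superseded during the heavy work
            self._state.result = output
            self._state.active_job_id = None
            return True

    def _render(self, prompt: str) -> str:
        # Placeholder for GPU-heavy work.
        return prompt.upper()
```

The re-check in phase 3 is what lets a cancel request, served on another thread during phase 2, take effect: the stale result is dropped rather than written over the newer state.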
Handlers live in handlers/. They are the primary home for:
- High-level business logic.
- State transitions and resource management.
- Coordinating services (side effects) in a testable way.
Contract
- Handlers may mutate AppState (with locking).
- Routes should not mutate AppState directly.
- Handlers should not “fake” side effects; they should call services.
Some side effects are not suitable for integration tests (GPU-heavy compute, network IO, etc.). We isolate those behind
services in services/.
Contract
- Services are the boundary between:
  - The runtime app (real implementations), and
  - The tested app (fake implementations).
- Services should be narrowly scoped to their side effect and avoid business/state logic.
- If a heavy side effect must be avoided in tests, it should be avoided only by introducing (or using) a service.
Each service should have:
- A Protocol interface (e.g. HTTPClient, GpuInfo, FastVideoPipeline)
- A real runtime implementation:
  - Use a concrete name when coupled to a specific implementation (e.g. LTXFastVideoPipeline)
  - Otherwise use an *Impl suffix (e.g. HTTPClientImpl)
- A fake implementation for tests:
  - Fake<ServiceName> (e.g. FakeHTTPClient)
- Report/DTO-like shapes commonly use a *Payload suffix (often TypedDict) to make “this is a data payload” obvious (e.g. GpuTelemetryPayload, VideoInfoPayload).
- To avoid heavy import dependencies, structural wrappers use a *Like suffix (e.g. HttpResponseLike, VideoCaptureLike).
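
The naming conventions can be illustrated with a small sketch built around the GpuInfo example. The type names GpuInfo and GpuTelemetryPayload appear in this document; the field names, the get_telemetry method, and the vram_free_mb helper are hypothetical. The real implementation is omitted because it would need GPU libraries.

```python
from typing import Protocol, TypedDict


class GpuTelemetryPayload(TypedDict):
    """Plain data shape; the field names here are hypothetical."""

    name: str
    vram_total_mb: int
    vram_used_mb: int


class GpuInfo(Protocol):
    """Interface the handlers depend on (method name is hypothetical)."""

    def get_telemetry(self) -> GpuTelemetryPayload: ...


class FakeGpuInfo:
    """Test fake: returns canned telemetry without touching a GPU."""

    def __init__(self, vram_total_mb: int = 24_000, vram_used_mb: int = 0) -> None:
        self._total = vram_total_mb
        self._used = vram_used_mb

    def get_telemetry(self) -> GpuTelemetryPayload:
        return {
            "name": "fake-gpu",
            "vram_total_mb": self._total,
            "vram_used_mb": self._used,
        }


def vram_free_mb(gpu: GpuInfo) -> int:
    """Example consumer typed against the Protocol, so real and fake both fit."""
    telemetry = gpu.get_telemetry()
    return telemetry["vram_total_mb"] - telemetry["vram_used_mb"]
```

Since Protocol matching is structural, FakeGpuInfo does not need to inherit from GpuInfo; having a compatible get_telemetry method is enough for a type checker.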
The backend uses a boundary-owned traceback policy to avoid duplicate stack traces and fragile per-handler decisions.
- app_factory.py owns request exception logging via centralized helpers in logging_policy.py.
  - HTTPError with status 4xx is logged as message-only (no traceback).
  - HTTPError with status 5xx is logged with full traceback.
  - Unhandled Exception is logged with full traceback.
- Handlers should not call logger.exception(...) and then rethrow to the request boundary.
- ThreadingRunner owns uncaught background exception logging via logging_policy.log_background_exception(...).
  - Background callers should pass on_error callbacks for state transitions and avoid duplicate traceback logging.
- When converting one exception into another for propagation, use exception chaining:
raise HTTPError(500, "...") from exc
- This preserves causal stacks while keeping traceback emission centralized.
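
A boundary helper implementing the 4xx/5xx rule might look like the sketch below. The function name log_request_exception and the HTTPError attributes status_code and detail are assumptions; the actual helpers in logging_policy.py are not shown in this document.

```python
import logging


class HTTPError(Exception):
    """Stand-in for the project's HTTPError (constructor shape assumed)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def log_request_exception(logger: logging.Logger, exc: BaseException) -> None:
    """Hypothetical boundary helper: decide once whether to emit a traceback."""
    if isinstance(exc, HTTPError) and 400 <= exc.status_code < 500:
        # Client errors: message only, no traceback.
        logger.warning("HTTP %d: %s", exc.status_code, exc.detail)
    else:
        # 5xx HTTPError and unhandled exceptions: full traceback, including
        # any chained cause attached with `raise ... from exc`.
        logger.error("Request failed: %s", exc, exc_info=exc)
```

With the decision made in one place, handlers only raise (with chaining where relevant) and never need to choose a log level or traceback style themselves.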
Request handler pattern:

    try:
        do_work()
    except Exception as exc:
        raise HTTPError(500, "Operation failed") from exc

Background task pattern:

    task_runner.run_background(
        target=worker,
        task_name="model-download",
        on_error=lambda exc: set_error_state(str(exc)),
    )

The backend aims for maximal coverage via integration-style tests in tests/:
- Tests create a real FastAPI app via create_app(handler=...).
- Tests inject fakes via ServiceBundle when building the AppHandler.
- Tests call real routes using TestClient and assert on final outcomes.
Contract
- Do not mock/patch routes, AppHandler, or handlers.
- Fake only through services (by swapping service implementations).
- Prefer “behavioral” fakes that mimic the real contract in a lightweight way over MagicMock-style call assertions.
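
As an illustration of a behavioral fake, the sketch below keeps a downloader's contract in memory so that a test can assert on outcomes rather than on recorded calls. ModelDownloader, FakeModelDownloader, and ensure_model are all hypothetical names invented for this example; they do not refer to real services in this codebase.

```python
from typing import Protocol


class ModelDownloader(Protocol):
    """Hypothetical service interface used only for this illustration."""

    def download(self, model_id: str) -> None: ...

    def is_downloaded(self, model_id: str) -> bool: ...


class FakeModelDownloader:
    """Behavioral fake: honors the downloader contract using in-memory state."""

    def __init__(self, unavailable: frozenset[str] = frozenset()) -> None:
        self._unavailable = unavailable
        self._downloaded: set[str] = set()

    def download(self, model_id: str) -> None:
        if model_id in self._unavailable:
            raise ConnectionError(f"cannot fetch {model_id}")
        self._downloaded.add(model_id)

    def is_downloaded(self, model_id: str) -> bool:
        return model_id in self._downloaded


def ensure_model(downloader: ModelDownloader, model_id: str) -> str:
    """Toy stand-in for the handler logic under test."""
    if downloader.is_downloaded(model_id):
        return "already-present"
    try:
        downloader.download(model_id)
    except ConnectionError:
        return "error"
    return "downloaded"
```

A test using this fake checks what the system ends up reporting ("downloaded", then "already-present" on a repeat call), which stays valid even if the handler is refactored to call the service differently.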
Primary entities:
- ltx2_server.py: runtime bootstrap (logging, RuntimeConfig, AppHandler, uvicorn)
- app_factory.py: FastAPI app factory (routers, DI init, exception handling)
- _routes/*: endpoint definitions (API surface)
- api_types.py: Pydantic request/response models (API typing contract)
- runtime_config/: immutable-ish runtime configuration models and constants
- state/: AppState and DI helpers (get_state_service, test overrides)
- app_handler.py: AppHandler composition root + service wiring (ServiceBundle)
- handlers/*: domain business logic + state transitions (sub-handlers)
- services/*: side-effect services (protocols + real implementations)
- tests/*: integration-style tests + service fakes
Modularity convention: prefer many small files over umbrella modules (one route per _routes/*.py, one handler per
handlers/*_handler.py, one service per services/<service>/...).
- Define request/response models in api_types.py (avoid Any/dynamic dicts when possible).
- Add/extend an endpoint in _routes/<domain>.py that delegates to handler.<domain>.
- Implement the domain logic in handlers/<domain>_handler.py.
  - Use the shared lock for state interactions.
  - Keep lock scope small around heavy work.
- If you need a new heavy side effect, add a new service in services/ and inject it via ServiceBundle.
- Add an integration-style test in tests/ using a fake service implementation (no mocking/patching).
- Confirm that the change follows the boundary-owned traceback logging policy.