Chen (@clean6378-max-it)
Calendar Day
Friday, July 4, 2026 (PR 1 of 2)
Planned Effort
5 story points (Medium–High)
Companion PR: Mon PR 2 (FTS index + search UX + API errors) — independent files; PR 1 should land first.
Problem
Three related performance and correctness gaps remain after issue #82 (in-memory LRU cache):
Gap 1: Session list cold on every restart (cppa-cursor-browser #95 analog).
GET /api/projects/<name>/sessions calls get_cached_session() on every .jsonl file (a full
JSONL parse) to build the title, models, tokens, tool_calls, and timestamps shown on the
session list page. The in-memory LRU (200 entries, mtime-keyed) eliminates re-parses within a
server run, but it is volatile: after any restart all entries are cold. On a large install
(300+ sessions, 20+ projects), the first request to each project page after startup triggers
N full parses, one per session file in that project. There is no disk-backed summary that
survives restart.
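To make the volatility concrete, here is a minimal sketch of an mtime-keyed in-memory LRU. It is an illustration only: MtimeLRU, its parse callback, and the parse_calls counter are hypothetical names, not the actual get_cached_session() implementation. Because the entries live in a process-local dict, a restarted process begins empty however many files were parsed before.

```python
from collections import OrderedDict
from typing import Callable


class MtimeLRU:
    """Hypothetical in-memory LRU keyed by (path, mtime)."""

    def __init__(self, parse: Callable[[str], dict], capacity: int = 200) -> None:
        self._parse = parse
        self._capacity = capacity
        self._entries: OrderedDict[tuple[str, float], dict] = OrderedDict()
        self.parse_calls = 0  # counts full parses, for illustration only

    def get(self, path: str, mtime: float) -> dict:
        key = (path, mtime)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.parse_calls += 1
        value = self._parse(path)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value


cache = MtimeLRU(parse=lambda path: {"title": path})
cache.get("a.jsonl", 1.0)
cache.get("a.jsonl", 1.0)  # warm hit: no second parse
cache.get("a.jsonl", 2.0)  # mtime changed: re-parse

# A new object stands in for a new server process: nothing carries over.
restarted = MtimeLRU(parse=lambda path: {"title": path})
restarted.get("a.jsonl", 2.0)  # cold again, so this is a full parse
```

The second cache object models the restart case from Gap 1: the same (path, mtime) pair that was warm a moment earlier costs a full parse again.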
Gap 2: Per-request re-computation in GET /api/projects (cppa-cursor-browser #116 analog).
The landing page endpoint runs two expensive operations on every call with no caching:
- _get_display_name() in utils/session_path.py → list_projects(): opens the first .jsonl file in each project directory to read the cwd field and derive a human-readable display name. On 20 projects that is 20 file opens per landing page load. The cwd field is set at session creation and never changes; the display name is deterministic for a given project directory and stable until a new .jsonl file arrives.
- quick_session_info() loop in api/projects.py → get_projects(): peeks at every session file to compute titled_count and latest_ts for each project card. These calls happen on every page load even though the results are stable until a file's mtime changes.
Gap 3: Count mismatch between project cards and session list (cppa-cursor-browser #95 analog).
GET /api/projects computes session_count via a quick_session_info() peek without applying
exclusion rules. GET /api/projects/<name>/sessions applies exclusion rules through
is_session_excluded(), which scans all session["messages"] for content matching. When
exclusion rules are active, the count on the project card does not match the number of rows
returned on the session list page for the same project. This is the same count-alignment
regression that affected cppa before cppa-cursor-browser #95 / PR #113.
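A toy illustration of the mismatch, using made-up session data and a simplified substring rule as a stand-in for the real is_session_excluded() (whose matching logic is not shown in this issue): the card path counts every session, while the list path drops excluded ones first.

```python
sessions = [
    {"title": "Refactor parser", "messages": ["rename helper", "run tests"]},
    {"title": "Rotate keys", "messages": ["here is the SECRET token"]},
    {"title": "Fix typo", "messages": ["docs update"]},
]
exclusion_rules = ["secret"]  # hypothetical rule set: case-insensitive substrings


def is_excluded(session: dict, rules: list[str]) -> bool:
    """Simplified stand-in: excluded if any message contains any rule text."""
    text = " ".join(session["messages"]).lower()
    return any(rule.lower() in text for rule in rules)


# Project card path: counts sessions without looking at exclusion rules.
card_count = len(sessions)

# Session list path: drops excluded sessions before returning rows.
list_rows = [s for s in sessions if not is_excluded(s, exclusion_rules)]

print(card_count, len(list_rows))  # 3 2
```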
Root cause summary
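| Path | Per-call cost | Cached today |
| --- | --- | --- |
| GET /api/projects → _get_display_name() | 1 file open per project | Never |
| GET /api/projects → quick_session_info() | 1 head+tail read per session | Never |
| GET /api/projects/<name>/sessions → get_cached_session() | Full JSONL parse per file | In-memory LRU only (volatile) |
| Exclusion check | Reads all session["messages"] | Never (always re-computed) |

is_session_excluded() requires a full parse on first access, but exclusion rules are loaded
once at startup and never change within a run, so the result is deterministic for a given
(path, mtime, rules_fingerprint) triple and safely cacheable on disk.
Goal
One merged PR that adds:
- A disk-backed session summary cache for session list rows, so session listing survives server restart without re-parsing unchanged files.
- A display-name cache per project directory, so list_projects() does not reopen .jsonl files on every call.
- Routing of GET /api/projects through the same summary cache for per-session peek data, so the landing page does not re-read session files when mtimes have not changed.
- Aligned session counts between project cards and the session list page for the no-exclusion-rules case.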
Scope
Touch points
utils/session_summary_cache.py (new): disk SQLite summary cache
- Stores session list rows keyed to (abs_path, mtime) for no-exclusion fast path, and to (abs_path, mtime, rules_fingerprint) for exclusion-aware entries.
- Row fields: title, models, tokens, tool_calls, first_timestamp, last_timestamp, is_excluded (bool), is_untitled (bool derived from title).
- Location: ~/.claude-code-chat-browser/session_summary_cache.sqlite (same directory as the exclusion rules file).
- threading.Lock for writer safety; read-only mode for concurrent readers.
- Public API: get_summary(path, mtime, rules_fingerprint) -> SummaryRowDict | None, put_summary(path, mtime, rules_fingerprint, row), clear_cache().
utils/session_path.py → list_projects() / _get_display_name(): display-name cache
- _display_name_cache: dict[str, tuple[float, str]]. Key: project_dir; value: (max_mtime_of_jsonl_files, display_name).
- Before opening any .jsonl file, compute max_mtime = max(mtime of *.jsonl files).
- If _display_name_cache.get(project_dir) returns a hit with the same max_mtime, return the cached display name without opening any file (only the stat calls needed for max_mtime remain).
- On miss: open file as today, update cache entry. Reset entry when max_mtime changes.
- threading.Lock for cache dict access.
- This is the claude analog of cppa-cursor-browser #116: _get_display_name() re-reads the same stable field on every request; caching it per directory fingerprint eliminates per-request file opens.
api/projects.py → get_projects(): use summary cache for peek data
- Replace the quick_session_info() loop with a pass that consults session_summary_cache for is_untitled and last_timestamp when a cache entry exists for (path, mtime).
- On cache miss (first access or mtime changed): call quick_session_info() as today for title/timestamps; write a partial summary row (without token counts) to the disk cache so the next get_project_sessions() call can upgrade it to a full row.
- session_count on project cards uses the same titled-session filter (skip is_untitled) as get_project_sessions(), fixing the count for the no-exclusion case.
api/projects.py → get_project_sessions(): use summary cache for session rows
- Compute rules_fingerprint = stable hash of serialised rules list (consistent within a run).
- For each session: check get_summary(path, mtime, rules_fingerprint).
- Cache hit: return the cached row, skip get_cached_session().
- Cache miss: full parse via get_cached_session(), apply exclusion, call put_summary(), proceed as today.
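A minimal sketch of the lookup flow above, assuming a JSON-serialisable rules list and one row_json column instead of separate columns per field. SummaryCache, list_row, and the table layout are hypothetical; only get_summary, put_summary, clear_cache, and the (path, mtime, rules_fingerprint) key come from this issue. The fingerprint uses a SHA-256 content digest rather than Python's built-in hash(), which is salted per process for strings and would therefore never match rows written before a restart.

```python
import hashlib
import json
import sqlite3
import threading
from typing import Any, Callable, Optional

Row = dict[str, Any]


def rules_fingerprint(rules: list[Row]) -> str:
    """Content digest of the rules list; identical across processes."""
    payload = json.dumps(rules, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SummaryCache:
    """Hypothetical SQLite-backed store for session list rows."""

    def __init__(self, db_path: str) -> None:
        self._lock = threading.Lock()
        # One shared connection guarded by a lock (a simplification of the
        # writer-lock plus read-only-readers design named in the scope).
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS summary ("
            " path TEXT NOT NULL, mtime REAL NOT NULL,"
            " rules_fingerprint TEXT NOT NULL, row_json TEXT NOT NULL,"
            " PRIMARY KEY (path, mtime, rules_fingerprint))"
        )
        self._conn.commit()

    def get_summary(self, path: str, mtime: float, fp: str) -> Optional[Row]:
        with self._lock:
            found = self._conn.execute(
                "SELECT row_json FROM summary"
                " WHERE path = ? AND mtime = ? AND rules_fingerprint = ?",
                (path, mtime, fp),
            ).fetchone()
        return None if found is None else json.loads(found[0])

    def put_summary(self, path: str, mtime: float, fp: str, row: Row) -> None:
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO summary VALUES (?, ?, ?, ?)",
                (path, mtime, fp, json.dumps(row)),
            )
            self._conn.commit()

    def clear_cache(self) -> None:
        with self._lock:
            self._conn.execute("DELETE FROM summary")
            self._conn.commit()


def list_row(
    cache: SummaryCache,
    path: str,
    mtime: float,
    fp: str,
    full_parse: Callable[[str], Row],
) -> Row:
    """Cache hit returns the stored row; a miss parses, stores, and returns."""
    row = cache.get_summary(path, mtime, fp)
    if row is None:
        row = full_parse(path)  # stands in for get_cached_session() plus exclusion
        cache.put_summary(path, mtime, fp, row)
    return row
```

Eviction of rows left behind by older mtimes is omitted here; the issue's test list mentions LRU eviction, so a real implementation would need a size bound as well.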
tests/test_session_summary_cache.py (new)
- Covers hit, miss, invalidation, exclusion-key separation (different rules_fingerprint → separate rows), LRU eviction, clear_cache().
tests/test_api_integration.py: extend
- Spy on utils.session_cache.get_cached_session: verify it is not called on a warm summary cache hit in GET /api/projects/<name>/sessions.
- Verify project card session_count matches session list row count when no exclusion rules are active.
Out of scope
- FTS search index for GET /api/search: tracked in PR 2 (chen-july-week2-monday-search-fts-ux-and-errors-github-issue.md; cppa: PR #113).
- ThreadPoolExecutor parallel session parsing: deferred until benchmarks justify it.
- Changing exclusion rule semantics or load timing.
- Caching full SessionDict (messages) to disk: only summary rows are cached; the in-memory LRU from claude-code-chat-browser #82 remains the full-session cache for deep-link and export paths.
Follow-up (post-merge)
- Wire summary cache into GET /api/search so search skips full re-parses for files already in the disk cache (title and timestamp pre-filter from summary).
- Add tests/benchmarks/test_session_list_bench.py measuring cold-start session list latency with and without disk cache.
Acceptance Criteria
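- utils/session_summary_cache.py exists: SQLite-backed, (path, mtime, rules_fingerprint) composite key, get_summary / put_summary / clear_cache public API.
- GET /api/projects/<name>/sessions does not call get_cached_session() for files whose (path, mtime, rules_fingerprint) entry is in the disk summary cache.
- list_projects() does not reopen .jsonl files on repeat calls when the project directory's max file mtime has not changed (display-name cache hit).
- GET /api/projects does not call quick_session_info() for session files already in the summary cache.
- After a server restart and one warm-up request, a second request to GET /api/projects/<name>/sessions triggers zero parse_session calls for unchanged files (verified by spy in integration test).
- Project card session_count equals GET /api/projects/<name>/sessions row count when no exclusion rules are active.
- Cache entries invalidate on mtime change: an edited file is re-parsed on next request.
- tests/test_session_summary_cache.py passes: hit, miss, invalidation, exclusion-key separation, eviction, clear_cache().
- mypy --strict, full pytest, and ruff pass.
Verification
Manual smoke test:
1. Start python app.py with 50+ sessions in one project.
2. Open the project's session list page and note the cold response time.
3. Stop and restart the server.
4. Open the same project page; response should be comparable to warm (disk cache hit, no re-parses in server log).
5. touch one .jsonl file; that file re-parses on next request, all others remain cached.
6. Reload the landing page repeatedly; server log shows no .jsonl file opens for projects whose directories have not changed.
7. Confirm project card count matches session list row count (no exclusion rules active).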
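The spy-based check named in the acceptance criteria might look like the sketch below. Everything here is a stand-in: session_cache is a SimpleNamespace playing the role of utils.session_cache, summary_rows plays the role of the disk summary cache, and this get_project_sessions is a toy version of the real handler. Only the pattern (warm up once, wrap the parser in a spy, assert it is never called) is the point.

```python
from types import SimpleNamespace
from unittest import mock

# Stand-ins for the real modules; the names mirror the issue, the bodies are invented.
session_cache = SimpleNamespace(get_cached_session=lambda path: {"title": path})
summary_rows: dict[str, dict] = {}  # plays the role of the disk summary cache


def get_project_sessions(paths: list[str]) -> list[dict]:
    rows = []
    for path in paths:
        row = summary_rows.get(path)
        if row is None:
            row = session_cache.get_cached_session(path)  # full parse on a miss
            summary_rows[path] = row
        rows.append(row)
    return rows


def test_warm_summary_cache_skips_full_parse() -> None:
    paths = ["a.jsonl", "b.jsonl"]
    summary_rows.clear()
    get_project_sessions(paths)  # warm-up request fills the summary cache
    with mock.patch.object(
        session_cache, "get_cached_session", wraps=session_cache.get_cached_session
    ) as spy:
        rows = get_project_sessions(paths)
    spy.assert_not_called()
    assert [r["title"] for r in rows] == paths
```

In the real integration test the patch target should be the name as the handler looks it up. If api/projects.py were to import the function directly with a from-import, the spy would need to patch the name in api.projects rather than in utils.session_cache; which form applies depends on the actual import style, which this issue does not show.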
References
- cppa-cursor-browser #95 / PR #113: workspace listing slow on large installs; summary_cache.py disk cache and count alignment. team-brain/2026-06/2026-06-23/brad/cppa-cursor-browser PR113 review 2026-06-23.md
- cppa-cursor-browser #116 / PR #125: infer_invalid_workspace_aliases() re-computed per request; cache per storage fingerprint. Direct analog: _get_display_name() and quick_session_info() loop.
- claude-code-chat-browser #82: in-memory LRU cache (this PR adds the disk tier on top).
- Exclusion constraint: is_session_excluded() → session_text_for_exclusion() iterates all session["messages"]; result is cacheable per (path, mtime, rules_fingerprint) because rules are stable within a process run.
- Files: api/projects.py, utils/session_path.py, utils/session_cache.py, utils/session_peek.py, utils/exclusion_rules.py, models/project.py.
- Companion PR: chen-july-week2-monday-search-fts-ux-and-errors-github-issue.md (PR 2).