Python: add double-buffer context window reducer for chat history#13590

Open
marklubin wants to merge 1 commit into microsoft:main from marklubin:feat/double-buffer-history-reducer

Conversation

@marklubin

Motivation and Context

Current chat history reduction strategies use stop-the-world summarization once the context fills, so the agent stalls while the summary is built. This PR adds a ChatHistoryDoubleBufferReducer that manages context proactively through double buffering: it begins checkpoint summarization at a configurable threshold while the agent continues working, then swaps to the pre-built back buffer.

Reference: https://marklubin.me/posts/hopping-context-windows/

Description

Three-phase algorithm:

  1. Checkpoint (Phase 1): At checkpoint_threshold (default 0.7), fire off background summarization via asyncio.create_task to seed the back buffer. The agent continues working immediately.
  2. Concurrent (Phase 2): New messages go to both active and back buffers via add_message_async.
  3. Swap (Phase 3): At swap_threshold (default 0.95), swap to the back buffer. If the checkpoint has not finished yet, block with a configurable timeout (default 120s). If no back buffer exists, fall back to a synchronous checkpoint so that compaction is never skipped.
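The three phases above can be sketched roughly as follows. This is a simplified illustration, not the PR's ChatHistoryDoubleBufferReducer: the class name `DoubleBufferSketch`, its attributes, the message-count notion of "fill", and the injected `summarize` coroutine are all assumptions made for the example. Only the threshold defaults, the 120s timeout, and the use of `asyncio.create_task` come from the description.

```python
import asyncio
from typing import Any, Callable, Coroutine, List, Optional

# Hypothetical summarizer signature: takes messages, returns one summary string.
Summarizer = Callable[[List[str]], Coroutine[Any, Any, str]]


class DoubleBufferSketch:
    """Illustrative only; names and structure are assumptions, not the PR's API."""

    def __init__(
        self,
        capacity: int,
        summarize: Summarizer,
        checkpoint_threshold: float = 0.7,
        swap_threshold: float = 0.95,
        swap_timeout: float = 120.0,
    ) -> None:
        self.capacity = capacity  # measured in messages here, for simplicity
        self.summarize = summarize
        self.checkpoint_threshold = checkpoint_threshold
        self.swap_threshold = swap_threshold
        self.swap_timeout = swap_timeout
        self.active: List[str] = []
        self.tail: List[str] = []  # messages that arrived after the checkpoint
        self.task: Optional[asyncio.Task] = None
        self.swaps = 0

    async def add_message(self, message: str) -> None:
        self.active.append(message)
        if self.task is not None:
            # Phase 2: mirror new messages so the back buffer stays current.
            self.tail.append(message)
        fill = len(self.active) / self.capacity
        if self.task is None and fill >= self.checkpoint_threshold:
            # Phase 1: summarize a snapshot in the background without awaiting it.
            self.task = asyncio.create_task(self.summarize(list(self.active)))
        if fill >= self.swap_threshold:
            await self._swap()

    async def _swap(self) -> None:
        # Phase 3: replace the active buffer with summary + mirrored tail.
        if self.task is None:
            # No back buffer was started: summarize synchronously instead.
            summary = await self.summarize(list(self.active))
        else:
            # Block until the checkpoint finishes; TimeoutError propagates here
            # because the description does not say how a timeout is handled.
            summary = await asyncio.wait_for(self.task, timeout=self.swap_timeout)
        self.active = [summary] + self.tail
        self.tail = []
        self.task = None
        self.swaps += 1
```

With `capacity=10`, the snapshot is taken when the seventh message arrives, the next messages are mirrored into the tail, and the tenth message triggers the swap, leaving the summary followed by the three mirrored messages in the active buffer.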

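Summaries accumulate incrementally across generations. max_generations limits how many generations accumulate, and a renewal policy decides what happens at that limit: RECURSE meta-summarizes the accumulated summaries, while DUMP discards them for a clean restart.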
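One way to picture the two renewal policies is the sketch below. It is an interpretation of the description, not the PR's code: the `RenewalPolicy` enum, the `accumulate` helper, and the `meta_summarize` callback are hypothetical names, and the exact point at which the real reducer renews may differ.

```python
from enum import Enum
from typing import Callable, List


class RenewalPolicy(Enum):
    RECURSE = "recurse"  # fold accumulated summaries into one meta-summary
    DUMP = "dump"        # discard accumulated summaries and start clean


def accumulate(
    summaries: List[str],
    new_summary: str,
    max_generations: int,
    policy: RenewalPolicy,
    meta_summarize: Callable[[List[str]], str],
) -> List[str]:
    """Return the summary list after adding one more generation (assumed semantics)."""
    if len(summaries) < max_generations:
        # Still under the limit: keep accumulating.
        return summaries + [new_summary]
    if policy is RenewalPolicy.RECURSE:
        # At the limit: compress the older generations into a single entry.
        return [meta_summarize(summaries), new_summary]
    # DUMP: drop the older generations entirely.
    return [new_summary]
```

For example, with `max_generations=2`, a third summary either follows a single meta-summary of the first two (RECURSE) or stands alone (DUMP).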

Contribution Checklist

  • Code builds clean without errors or warnings
  • Follows SK Contribution Guidelines
  • All unit tests pass (19/19); new tests added
  • Did not break any existing functionality

AI Disclosure

This contribution was developed with assistance from Claude (Anthropic). All code was reviewed, tested, and validated by the author.

@moonbox3 added the python label (Pull requests for the Python Semantic Kernel) on Feb 25, 2026
Proactive context compaction using double buffering: checkpoint at a
configurable threshold (default 70%), continue working while
summarization runs in the background, then swap to the pre-built back
buffer at the swap threshold (default 95%).

Includes stop-the-world fallback with configurable timeout, optional
renewal policies (recurse/dump) for long-running sessions, and
generation tracking via metadata.
@marklubin force-pushed the feat/double-buffer-history-reducer branch from 9311712 to ddf6cd7 on February 25, 2026 07:40
@marklubin
Author

@marklubin please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree

@marklubin marked this pull request as ready for review on February 25, 2026 08:02
@marklubin requested a review from a team as a code owner on February 25, 2026 08:02
