Python: add double-buffer context window reducer for chat history #13590
Open
marklubin wants to merge 1 commit into microsoft:main from
Proactive context compaction using double buffering: checkpoint at a configurable threshold (default 70%), continue working while summarization runs in the background, then swap to the pre-built back buffer at the swap threshold (default 95%). Includes stop-the-world fallback with configurable timeout, optional renewal policies (recurse/dump) for long-running sessions, and generation tracking via metadata.
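The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `DoubleBufferSketch`, its attributes, and the `fake_summarize` helper are hypothetical, the fill level is measured in message count rather than tokens, and the behaviour on timeout (falling back to a synchronous summarization) is an assumption, since the description only says the swap blocks with a timeout.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional

# Hypothetical summarizer signature: takes messages, returns one summary string.
Summarizer = Callable[[List[str]], Awaitable[str]]


class DoubleBufferSketch:
    """Illustrative only; not the PR's ChatHistoryDoubleBufferReducer."""

    def __init__(self, capacity: int, summarize: Summarizer,
                 checkpoint_threshold: float = 0.7,
                 swap_threshold: float = 0.95,
                 swap_timeout: float = 120.0) -> None:
        self.capacity = capacity          # assumption: counted in messages
        self.summarize = summarize
        self.checkpoint_threshold = checkpoint_threshold
        self.swap_threshold = swap_threshold
        self.swap_timeout = swap_timeout
        self.front: List[str] = []        # active buffer the agent reads
        self.tail: List[str] = []         # messages added after the checkpoint
        self.task: Optional[asyncio.Task] = None
        self.generation = 0               # number of swaps so far

    def _fill(self) -> float:
        return len(self.front) / self.capacity

    async def add(self, message: str) -> None:
        self.front.append(message)
        if self.task is not None:
            # A checkpoint is in flight: remember what it did not see.
            self.tail.append(message)
        elif self._fill() >= self.checkpoint_threshold:
            # Summarize a snapshot in the background; do not wait for it.
            self.task = asyncio.create_task(self.summarize(list(self.front)))
        if self._fill() >= self.swap_threshold:
            await self._swap()

    async def _swap(self) -> None:
        summary: Optional[str] = None
        if self.task is not None:
            try:
                # Block until the background summary is ready, up to the timeout.
                summary = await asyncio.wait_for(self.task, self.swap_timeout)
            except asyncio.TimeoutError:
                summary = None  # wait_for has already cancelled the task
        if summary is None:
            # Stop-the-world fallback (assumed): summarize everything now.
            summary = await self.summarize(list(self.front))
            self.tail = []
        # The back buffer becomes the front: summary plus unsummarized tail.
        self.front = [summary, *self.tail]
        self.tail = []
        self.task = None
        self.generation += 1


async def fake_summarize(messages: List[str]) -> str:
    await asyncio.sleep(0)  # stand-in for a model call
    return f"summary of {len(messages)} messages"


async def demo() -> DoubleBufferSketch:
    reducer = DoubleBufferSketch(capacity=10, summarize=fake_summarize)
    for i in range(10):
        await reducer.add(f"msg {i}")
    return reducer


if __name__ == "__main__":
    print(asyncio.run(demo()).front)
```

With a capacity of 10 messages, the checkpoint starts at the seventh message and the swap happens at the tenth, so the new front buffer holds one summary followed by the three messages that arrived after the checkpoint began.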
Author
@microsoft-github-policy-service agree
Motivation and Context
Current chat history reduction strategies use stop-the-world summarization when the context fills. This PR adds a ChatHistoryDoubleBufferReducer that proactively manages context through double buffering: it begins checkpoint summarization at a configurable threshold while the agent continues working, then swaps to the pre-built back buffer seamlessly.
Reference: https://marklubin.me/posts/hopping-context-windows/
Description
Three-phase algorithm:
1. At checkpoint_threshold (default 0.7), fire off background summarization via asyncio.create_task to seed the back buffer. The agent continues working immediately.
2. While the checkpoint runs, new messages keep arriving through add_message_async.
3. At swap_threshold (default 0.95), swap to the back buffer. If the checkpoint isn't done yet, block with a configurable timeout (default 120s). If no back buffer exists, fall back to a synchronous checkpoint (never skip compaction).

Incremental summary accumulation across generations with max_generations control and renewal policies (RECURSE for meta-summarization, DUMP for clean restart).
Contribution Checklist
AI Disclosure
This contribution was developed with assistance from Claude (Anthropic). All code was reviewed, tested, and validated by the author.