feat: add thermodynamic regime management (T* framework) #1444
Nietzsche-Ubermensch wants to merge 7 commits into MoonshotAI:main from
Conversation
Implements canonical T* = (L - γ) / (|L| + λ) oversight:
- Auto ACT/HOLD/REFUSE classification
- Auto-grounding in HOLD regime
- Circuit breaker in REFUSE regime
- Benchmark mode with Moonshot canonical params

Refs: Memory ID 28
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com> Signed-off-by: Nietzsche-Ubermensch <peterbilt5018@gmail.com>
```python
if cache_key in self.cache:
    return {**self.cache[cache_key], "cached": True, "cost": 0.0}
```
🟡 Cached results return stale total_spent and budget_remaining values
In dynamic_complete.py:67-68, when a cached result is returned, the entire cached dict is spread and returned with only cached and cost overridden. However, the cached dict contains the total_spent and budget_remaining values from the time the result was originally computed and cached (line 170-175). After subsequent non-cached queries increase self.registry.total_spent, the cached result will return outdated financial tracking values, misleading the caller about actual spending.
Suggested change:

```python
if cache_key in self.cache:
    return {**self.cache[cache_key], "cached": True, "cost": 0.0, "total_spent": self.registry.total_spent, "budget_remaining": 200.0 - self.registry.total_spent}
```
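The issue above can be reproduced in isolation. This is a minimal sketch of the stale-cache behavior: `Registry`, the dict keys, and the 200.0 budget cap mirror the snippet and the suggested fix, and may differ from the real code.

```python
# Minimal sketch of the stale-cache issue flagged above; Registry and the
# dict keys mirror the snippet, and the 200.0 budget cap comes from the
# suggested fix (it may differ in the real code).
class Registry:
    def __init__(self):
        self.total_spent = 0.0

registry = Registry()
cache = {}

# First (non-cached) call: the totals are frozen into the cached dict.
registry.total_spent += 5.0
cache["q"] = {"text": "answer", "cost": 5.0,
              "total_spent": registry.total_spent,
              "budget_remaining": 200.0 - registry.total_spent}

# Later queries keep spending.
registry.total_spent += 100.0

# Buggy path: spreading the cached dict keeps the stale totals.
stale = {**cache["q"], "cached": True, "cost": 0.0}

# Fixed path: override the financial fields with live registry values.
fresh = {**cache["q"], "cached": True, "cost": 0.0,
         "total_spent": registry.total_spent,
         "budget_remaining": 200.0 - registry.total_spent}

print(stale["total_spent"], fresh["total_spent"])  # 5.0 105.0
```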
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f5790097fa
```toml
[project.scripts]
kimi = "kimi_cli.cli:cli"
kimi-cli = "kimi_cli.cli:cli"
kimi-thermo = "kimi_thermo.main:main"
```
Package kimi_thermo before exposing kimi-thermo entrypoint
This adds kimi-thermo = "kimi_thermo.main:main", but the same file still configures the build backend with module-name = ["kimi_cli"], so installed wheels can expose the new console script without actually shipping the kimi_thermo package. In that installed-artifact scenario (as opposed to running from a source checkout), invoking kimi-thermo will fail with ModuleNotFoundError, making the new command unusable for users.
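A sketch of the packaging fix the comment implies: include `kimi_thermo` in the build backend's module list alongside `kimi_cli`. The exact table the `module-name` key lives under depends on the build backend configured in this repo's pyproject.toml, so treat this as illustrative.

```toml
# Sketch of the fix: ship both packages so the kimi-thermo entrypoint
# resolves in installed wheels. The table containing module-name depends
# on the build backend in use.
[project.scripts]
kimi = "kimi_cli.cli:cli"
kimi-cli = "kimi_cli.cli:cli"
kimi-thermo = "kimi_thermo.main:main"

module-name = ["kimi_cli", "kimi_thermo"]
```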
```python
result = await self.registry.execute(name, args)
tools_used.append(name)

# Add to response context
full_response += f"\n[{name} result: {result[:100]}...]\n"
```
Continue the chat after tool execution
When a streamed tool call arrives, the code executes the tool and appends a local snippet to full_response, but it never sends the tool result back to the model in a follow-up completion turn. For prompts that trigger any tool (web search, convert, code_runner, etc.), this can terminate output at raw tool snippets instead of a final assistant answer because the model never receives the tool output in-message.
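The follow-up turn the comment asks for can be sketched as a loop that feeds each tool result back to the model as a `"tool"` role message (OpenAI-style chat semantics are assumed here; `complete` and `execute_tool` are hypothetical stand-ins for the actual streaming call and registry dispatch).

```python
# Sketch of the missing follow-up turn: after executing a tool, send its
# result back to the model instead of splicing it into the text output.
# `complete` and `execute_tool` are hypothetical stand-ins.
def run_with_tools(messages, complete, execute_tool):
    reply = complete(messages)
    while reply.get("tool_calls"):
        messages.append(reply)  # assistant turn containing the tool calls
        for call in reply["tool_calls"]:
            result = execute_tool(call["name"], call["args"])
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": result})
        reply = complete(messages)  # let the model read the tool output
    return reply["content"]
```

With a stubbed model, the loop makes exactly one extra completion call per round of tool calls and returns the model's final text rather than raw tool snippets.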
src/kimi_thermo/dynamic_complete.py (outdated)
```python
cost += 0.002 * (len(full_response) / 1000)  # Approximate token cost

# Update spent
self.registry.total_spent += cost
```
Avoid charging tool calls twice in spend tracking
This method adds cost into self.registry.total_spent after the stream, but each tool execution has already incremented total_spent inside CompleteToolRegistry.update_from_result (src/kimi_thermo/tools_complete.py, line 406). Because cost already includes per-tool charges, requests that use tools get double-billed in budget accounting, overstating spend and understating remaining budget in audit output.
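Single-charge accounting per this comment can be sketched as follows: the registry books tool spend once at execution time, and the post-stream step adds only the model-token portion. Names and the per-token rate follow the snippet; the rest is illustrative.

```python
# Sketch of single-charge accounting: tool costs are booked once by the
# registry, and only the model-token cost is added after the stream.
class Registry:
    def __init__(self):
        self.total_spent = 0.0

    def execute(self, name, tool_cost):
        self.total_spent += tool_cost  # tool spend charged here, once
        return f"{name} ok"

def finish_stream(registry, full_response):
    # Charge only the token cost; tool charges already live in the registry.
    token_cost = 0.002 * (len(full_response) / 1000)
    registry.total_spent += token_cost
    return token_cost

registry = Registry()
registry.execute("web_search", 0.5)
finish_stream(registry, "x" * 1000)  # adds 0.002, not 0.502 again
```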
Adds complete project documentation for Claude Code CLI:
- Project overview and workspace structure
- Technology stack and architecture patterns
- Common commands for development and testing
- Environment specifics and Windows compatibility
- MCP/ACP integration guides
- IDE integration instructions
- Thermodynamic framework documentation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c4c40003a5
```python
cache_key = query
if cache_key in self.cache:
```
Include execution mode in cache key
execute() supports different runtime profiles (utility/fast/deep), but the cache key only uses the raw query text, so running the same prompt in a different mode will silently return the previous mode’s cached output instead of applying the requested configuration. This breaks mode-specific behavior and can skew any benchmark/comparison runs that rely on switching modes.
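A mode-aware key, as the comment suggests, is a one-line fix: key the cache on the (mode, query) pair. The sketch below is illustrative; the mode names come from the comment, everything else is a stand-in for the real `execute()`.

```python
# Sketch of a mode-aware cache key; mode names (utility/fast/deep) come
# from the review comment, the rest is a stand-in for the real execute().
cache = {}

def execute(query, mode="fast"):
    cache_key = (mode, query)      # was: cache_key = query
    if cache_key in cache:
        return {**cache[cache_key], "cached": True}
    result = {"mode": mode, "text": f"[{mode}] answer to {query}"}
    cache[cache_key] = result
    return {**result, "cached": False}

a = execute("explain T*", mode="fast")
b = execute("explain T*", mode="deep")   # no longer served from fast's cache
c = execute("explain T*", mode="deep")   # a genuine cache hit for deep
```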
Nietzsche-Ubermensch left a comment
docs: add comprehensive CLAUDE.md documentation
Adds complete project documentation for Claude Code CLI:
- Project overview and workspace structure
- Technology stack and architecture patterns
- Common commands for development and testing
- Environment specifics and Windows compatibility
- MCP/ACP integration guides
- IDE integration instructions
- Thermodynamic framework documentation
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Nietzsche-Ubermensch left a comment
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Signed-off-by: Nietzsche-Ubermensch <peterbilt5018@gmail.com>
| """Zero user work. System manages T* regime.""" | ||
|
|
||
| def __init__(self): | ||
| self.api_key = os.getenv("MOONSHOT_API_KEY") |
🟡 Missing API key validation in WorklessClient leads to "Bearer None" auth header
In thermo_executor.py:49, self.api_key = os.getenv("MOONSHOT_API_KEY") can be None when the env var is unset. Unlike DynamicCompleteClient (which raises ValueError immediately at dynamic_complete.py:22-23), WorklessClient passes None through, resulting in headers={"Authorization": "Bearer None"} at line 54. This sends the literal string "Bearer None" to the API, producing a confusing authentication error instead of a clear message. Since main.py:15 directly instantiates WorklessClient(), users of kimi-thermo will hit this opaque failure.
Suggested change:

```python
self.api_key = os.getenv("MOONSHOT_API_KEY")
if not self.api_key:
    raise ValueError("Set MOONSHOT_API_KEY environment variable")
```
Summary
Implements canonical thermodynamic oversight (T* framework) for Kimi CLI tool execution.
Changes
Thermodynamic Rationale
Prevents the "prompt engineering parasite" (Memory ID 28) by making the system self-regulate coherence instead of extracting user labor.
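The T* = (L - γ) / (|L| + λ) oversight described in this PR can be sketched as a small regime classifier. The γ, λ, and threshold values below are illustrative placeholders, not the Moonshot canonical params the benchmark mode uses.

```python
# Illustrative sketch of the T* regime classifier; gamma, lam, and the
# ACT/REFUSE thresholds are placeholders, not the canonical params.
def t_star(L, gamma=0.1, lam=0.05):
    return (L - gamma) / (abs(L) + lam)

def regime(L, act=0.5, refuse=-0.5):
    t = t_star(L)
    if t >= act:
        return "ACT"        # coherent enough to proceed
    if t <= refuse:
        return "REFUSE"     # circuit breaker trips
    return "HOLD"           # auto-grounding kicks in

print(regime(1.0), regime(0.1), regime(-1.0))  # ACT HOLD REFUSE
```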
Usage
Checklist