Replace keepHistoryReasoning boolean with reasoningHistory

zikajk · zikajk · commit 8e3e3f247bc2 · 2026-01-24T17:28:46.000+01:00
Introduce more granular control over reasoning retention in requests:
"all" (default, send everything)
"turn" (current turn only)
"off" (discard all)

Both delta-reasoning (reasoning_content) and think-tag reasoning are handled uniformly.

DB storage is unaffected. Reasoning is always persisted for UI display.
This setting only controls what gets sent back to the model.
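
A minimal config sketch of the new key, reusing the provider and model from the example this commit removes from docs/models.md. The "turn" value is chosen only for illustration and is not a recommendation for this model. Going by the prune-history change below, the old `keepHistoryReasoning: true` corresponds to "all".

```javascript
{
  "providers": {
    "z-ai": {
      "api": "openai-chat",
      "url": "https://api.z.ai/api/paas/v4/",
      "key": "your-api-key",
      "models": {
        "GLM-4.7": {
          // Illustrative value; one of "all" (default), "turn", "off"
          "reasoningHistory": "turn"
        }
      }
    }
  }
}
```

Omitting the key is equivalent to "all", since llm_api.clj falls back to :all when the model config has no reasoningHistory.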
diff --git a/docs/configuration.md b/docs/configuration.md
@@ -694,7 +694,7 @@ To configure, add your OTLP collector config via `:otlp` map following [otlp aut
             models: {[key: string]: {
               modelName?: string;
               extraPayload?: {[key: string]: any};
-              keepHistoryReasoning?: boolean;
+              reasoningHistory?: "all" | "turn" | "off";
             }};
         }};
         defaultModel?: string;
diff --git a/docs/models.md b/docs/models.md
@@ -73,7 +73,7 @@ Schema:
 | `models`                              | map     | Key: model name, value: its config                                                                           | Yes      |
 | `models <model> extraPayload`         | map     | Extra payload sent in body to LLM                                                                            | No       |
 | `models <model> modelName`            | string  | Override model name, useful to have multiple models with different configs and names that use same LLM model | No       |
-| `models <model> keepHistoryReasoning` | boolean | Keep `reason` messages in conversation history. Default: `false`                                             | No       |
+| `models <model> reasoningHistory`     | string  | Controls reasoning in conversation history: `"all"` (default), `"turn"`, or `"off"`                          | No       |
 | `fetchModels`                         | boolean | Enable automatic model discovery from `/models` endpoint (OpenAI-compatible providers)                       | No       |
 
 _* url and key will be searched as envs `<provider>_API_URL` and `<provider>_API_KEY`, they require the env to be found or config to work._
@@ -121,32 +121,18 @@ Examples:
 
     This way both will use gpt-5 model but one will override the reasoning to be high instead of the default.
 
-=== "History reasoning"
-	`keepHistoryReasoning` - Determines whether the model's internal reasoning chain is persisted in the conversation history for subsequent turns.
+=== "Reasoning in conversation history"
+	`reasoningHistory` - Controls whether and how the model's reasoning (thinking blocks, `reasoning_content`) is included in the conversation history sent to the model.
 
-	- **Standard Behavior**: Most models expect reasoning blocks (e.g., `<think>` tags or `reasoning_content`) to be removed in subsequent requests to save tokens and avoid bias.
-	- **Usage**: Enable this for models that explicitly support "preserved thinking," or if you want to experiment with letting the model see its previous thought process (with XML-based reasoning).
-	- **Example**: See [GLM-4.7 with Preserved thinking](https://docs.z.ai/guides/capabilities/thinking-mode#preserved-thinking).
+	**Available modes:**
 
-    ```javascript title="~/.config/eca/config.json"
-    {
-      "providers": {
-        "z-ai": {
-          "api": "openai-chat",
-          "url": "https://api.z.ai/api/paas/v4/",
-          "key": "your-api-key",
-          "models": {
-            "GLM-4.7": {
-              "keepHistoryReasoning": true,  // Preserves reasoning
-              "extraPayload": {"clear_thinking": false} // Preserved thinking (see https://docs.z.ai/guides/capabilities/thinking-mode#preserved-thinking)
-			  }
-          }
-        }
-      }
-    }
-    ```
+	- **`"all"`** (default) - Send all reasoning blocks back to the model, so it can see its full chain of thought from previous turns. This is the safest option.
+	- **`"turn"`** - Send only reasoning from the current conversation turn (after the last user message). Previous reasoning is discarded before sending to the API.
+	- **`"off"`** - Never send reasoning blocks to the model. All reasoning is discarded before API calls.
+
+	**Note:** Reasoning is always shown to you in the UI and stored in chat history. This setting only controls what gets sent to the model in API requests.
 
-    Default: `false`.
+    Default: `"all"`.
 
 === "Dynamic model discovery"
 
diff --git a/integration-test/integration/chat/github_copilot_test.clj b/integration-test/integration/chat/github_copilot_test.clj
@@ -168,7 +168,7 @@
         (match-content chat-id "system" {:type "progress" :state "finished"})
         (is (match?
              {:input [{:role "user" :content [{:type "input_text" :text "hello!"}]}
-                      {:role "assistant" :content [{:type "output_text" :text "hello there!"}]}
+                      {:role "assistant" :content [{:type "output_text" :text "<think>I should say hello</think>\nhello there!"}]}
                       {:role "user" :content [{:type "input_text" :text "how are you?"}]}]
               :instructions (m/pred string?)}
              (llm.mocks/get-req-body :reasoning-1)))))))
diff --git a/integration-test/integration/chat/google_test.clj b/integration-test/integration/chat/google_test.clj
@@ -167,7 +167,7 @@
         (match-content chat-id "system" {:type "progress" :state "finished"})
         (is (match?
              {:input [{:role "user" :content [{:type "input_text" :text "hello!"}]}
-                      {:role "assistant" :content [{:type "output_text" :text "hello there!"}]}
+                      {:role "assistant" :content [{:type "output_text" :text "<thought>I should say hello</thought>\nhello there!"}]}
                       {:role "user" :content [{:type "input_text" :text "how are you?"}]}]
               :instructions (m/pred string?)}
              (llm.mocks/get-req-body :reasoning-1)))))))
diff --git a/src/eca/config.clj b/src/eca/config.clj
@@ -336,7 +336,8 @@
   {:kebab-case-key
    [[:providers]]
    :keywordize-val
-   [[:providers :ANY :httpClient]]
+   [[:providers :ANY :httpClient]
+    [:providers :ANY :models :ANY :reasoningHistory]]
    :stringfy-key
    [[:behavior]
     [:providers]
diff --git a/src/eca/llm_api.clj b/src/eca/llm_api.clj
@@ -97,6 +97,7 @@
         provider-config (get-in config [:providers provider])
         model-config (get-in provider-config [:models model])
         extra-payload (:extraPayload model-config)
+        reasoning-history (or (:reasoningHistory model-config) :all)
         [auth-type api-key] (llm-util/provider-api-key provider provider-auth config)
         api-url (llm-util/provider-api-url provider config)
         {:keys [handler]} (provider->api-handler provider config)
@@ -123,6 +124,7 @@
           :web-search web-search
           :extra-payload (merge {:parallel_tool_calls true}
                                 extra-payload)
+          :reasoning-history reasoning-history
           :api-url api-url
           :api-key api-key
           :auth-type auth-type}
@@ -157,6 +159,7 @@
           :tools tools
           :extra-payload (merge {:parallel_tool_calls true}
                                 extra-payload)
+          :reasoning-history reasoning-history
           :api-url api-url
           :api-key api-key
           :extra-headers {"openai-intent" "conversation-panel"
@@ -179,6 +182,7 @@
           :tools tools
           :think-tag-start "<thought>"
           :think-tag-end "</thought>"
+          :reasoning-history reasoning-history
           :extra-payload (merge {:parallel_tool_calls false}
                                 (when reason?
                                   {:extra_body {:google {:thinking_config {:include_thoughts true}}}})
@@ -206,7 +210,6 @@
         (let [url-relative-path (:completionUrlRelativePath provider-config)
               think-tag-start (:thinkTagStart provider-config)
               think-tag-end (:thinkTagEnd provider-config)
-              keep-history-reasoning (:keepHistoryReasoning model-config)
               http-client (:httpClient provider-config)]
           (handler
            {:model real-model
@@ -222,7 +225,7 @@
             :url-relative-path url-relative-path
             :think-tag-start think-tag-start
             :think-tag-end think-tag-end
-            :keep-history-reasoning keep-history-reasoning
+            :reasoning-history reasoning-history
             :http-client http-client
             :api-url api-url
             :api-key api-key}
diff --git a/src/eca/llm_providers/openai_chat.clj b/src/eca/llm_providers/openai_chat.clj
@@ -384,24 +384,27 @@
     (reset! reasoning-state* {:id nil :type nil :content "" :buffer ""})))
 
 (defn ^:private prune-history
-  "Discard reasoning messages from history.
-   Reasoning with :delta-reasoning? is preserved in the same turn (as required by Deepseek).
-   This corresponds to the implementation standard. However, it can be change it at the model level configuration.
+  "Discard reasoning messages from history based on reasoning-history mode.
+
    Parameters:
    - messages: the conversation history
-   - keep-history-reasoning: if true, preserve all reasoning in history"
-  [messages keep-history-reasoning]
-  (if keep-history-reasoning
-    messages
-    (if-let [last-user-idx (llm-util/find-last-user-msg-idx messages)]
-      (->> messages
-           (keep-indexed (fn [i m]
-                           (when-not (and (= "reason" (:role m))
-                                          (or (< i last-user-idx)
-                                              (not (get-in m [:content :delta-reasoning?]))))
-                             m)))
-           vec)
-      messages)))
+   - reasoning-history: controls reasoning retention
+     - :all  - preserve all reasoning in history (safe default)
+     - :turn - preserve reasoning only in the current turn (after last user message)
+     - :off  - discard all reasoning messages"
+  [messages reasoning-history]
+  (case reasoning-history
+    :all messages
+    :off (filterv #(not= "reason" (:role %)) messages)
+    :turn (if-let [last-user-idx (llm-util/find-last-user-msg-idx messages)]
+            (->> messages
+                 (keep-indexed (fn [i m]
+                                 (when-not (and (= "reason" (:role m))
+                                                (< i last-user-idx))
+                                   m)))
+                 vec)
+            messages)
+    messages))
 
 (defn chat-completion!
   "Primary entry point for OpenAI chat completions with streaming support.
@@ -411,14 +414,14 @@
    Compatible with OpenRouter and other OpenAI-compatible providers."
   [{:keys [model user-messages instructions temperature api-key api-url url-relative-path
            past-messages tools extra-payload extra-headers supports-image?
-           think-tag-start think-tag-end keep-history-reasoning http-client]}
+           think-tag-start think-tag-end reasoning-history http-client]}
    {:keys [on-message-received on-error on-prepare-tool-call on-tools-called on-reason on-usage-updated] :as callbacks}]
   (let [think-tag-start (or think-tag-start "<think>")
         think-tag-end (or think-tag-end "</think>")
         stream? (boolean callbacks)
         system-messages (when instructions [{:role "system" :content instructions}])
         ;; Pipeline: prune history -> normalize -> merge adjacent assistants -> filter
-        all-messages (prune-history (vec (concat past-messages user-messages)) keep-history-reasoning)
+        all-messages (prune-history (vec (concat past-messages user-messages)) reasoning-history)
         messages (vec (concat
                        system-messages
                        (normalize-messages all-messages supports-image? think-tag-start think-tag-end)))
@@ -478,7 +481,7 @@
                                        tool-calls))
         on-tools-called-wrapper (fn on-tools-called-wrapper [tools-to-call on-tools-called handle-response]
                                   (when-let [{:keys [new-messages]} (on-tools-called tools-to-call)]
-                                    (let [pruned-messages (prune-history new-messages keep-history-reasoning)
+                                    (let [pruned-messages (prune-history new-messages reasoning-history)
                                           new-messages-list (vec (concat
                                                                   system-messages
                                                                   (normalize-messages pruned-messages supports-image? think-tag-start think-tag-end)))
diff --git a/test/eca/llm_providers/openai_chat_test.clj b/test/eca/llm_providers/openai_chat_test.clj
@@ -259,7 +259,7 @@
            {:role "assistant" :reasoning_content "Thinking..."}])))))
 
 (deftest prune-history-test
-  (testing "Drops all reason messages before the last user message by default"
+  (testing "reasoningHistory \"turn\" drops all reason messages before the last user message"
     (is (match?
          [{:role "user" :content "Q1"}
           {:role "assistant" :content "A1"}
@@ -273,13 +273,14 @@
            {:role "user" :content "Q2"}
            {:role "reason" :content {:text "r2" :delta-reasoning? true}}
            {:role "assistant" :content "A2"}]
-          false))))
+          :turn))))
 
-  (testing "Preserves reason messages (without :delta-reasoning?) before last user message"
+  (testing "reasoningHistory \"turn\" also drops think-tag reasoning before last user message"
     (is (match?
          [{:role "user" :content "Q1"}
           {:role "assistant" :content "A1"}
           {:role "user" :content "Q2"}
+          {:role "reason" :content {:text "more thinking..."}}
           {:role "assistant" :content "A2"}]
          (#'llm-providers.openai-chat/prune-history
           [{:role "user" :content "Q1"}
@@ -288,9 +289,9 @@
            {:role "user" :content "Q2"}
            {:role "reason" :content {:text "more thinking..."}}
            {:role "assistant" :content "A2"}]
-          false))))
+          :turn))))
 
-  (testing "Preserves all reasoning when keep-history-reasoning is true (Bedrock)"
+  (testing "reasoningHistory \"all\" preserves all reasoning"
     (is (match?
          [{:role "user" :content "Q1"}
           {:role "reason" :content {:text "r1"}}
@@ -305,12 +306,35 @@
            {:role "user" :content "Q2"}
            {:role "reason" :content {:text "r2"}}
            {:role "assistant" :content "A2"}]
-          true))))
+          :all))))
 
-  (testing "No user message leaves list unchanged"
+  (testing "reasoningHistory \"off\" removes all reasoning messages"
+    (is (match?
+         [{:role "user" :content "Q1"}
+          {:role "assistant" :content "A1"}
+          {:role "user" :content "Q2"}
+          {:role "assistant" :content "A2"}]
+         (#'llm-providers.openai-chat/prune-history
+          [{:role "user" :content "Q1"}
+           {:role "reason" :content {:text "r1" :delta-reasoning? true}}
+           {:role "assistant" :content "A1"}
+           {:role "user" :content "Q2"}
+           {:role "reason" :content {:text "r2"}}
+           {:role "assistant" :content "A2"}]
+          :off))))
+
+  (testing "No user message - reasoningHistory \"turn\" leaves list unchanged"
     (let [msgs [{:role "assistant" :content "A"}
                 {:role "reason" :content {:text "r"}}]]
-      (is (= msgs (#'llm-providers.openai-chat/prune-history msgs false))))))
+      (is (= msgs (#'llm-providers.openai-chat/prune-history msgs :turn)))))
+
+  (testing "No user message - reasoningHistory \"off\" removes reason"
+    (is (match?
+         [{:role "assistant" :content "A"}]
+         (#'llm-providers.openai-chat/prune-history
+          [{:role "assistant" :content "A"}
+           {:role "reason" :content {:text "r"}}]
+          :off)))))
 
 (deftest valid-message-test
   (testing "Tool messages are always kept"