fix(reasoning): detect short-form "READY" when concluding a plan by i-anubhav-anand · Pull Request #6206 · crewAIInc/crewAI

i-anubhav-anand · 2026-06-17T20:16:04Z

Summary

Fixes #6204. When an agent uses reasoning/planning, the executor asks the model to conclude its plan with a readiness signal and re-plans while "not ready". The model frequently replies with a bare READY, but the parser only recognised the full sentence READY: I am ready to execute the task. — so a ready model was read as not ready and the agent kept refining the plan (indefinitely when max_attempts is None).

The mismatch is between the prompts and the parser:

Prompt (`translations/en.json`)	What it asks the model to output
`reasoning.create_plan_prompt`	`"READY: I am ready to execute the task."` (full sentence)
`reasoning.refine_plan_prompt`	`Conclude with READY or NOT READY.` (bare token)
`planning.create_plan_prompt`	`After your plan, state READY or NOT READY.` (bare token)
`planning.refine_plan_prompt`	`Conclude with READY or NOT READY as before.` (bare token)

The readiness check only matched the full sentence:

# reasoning_handler.py — before
ready = "READY: I am ready to execute the task." in response

So three of the four prompts could never produce a "ready" verdict. In _refine_plan_if_needed, while not ready and (max_attempts is None or attempt < max_attempts) then loops — wasted refine calls, or no termination at all with the default max_attempts=None.

Fix

Add AgentReasoning._check_ready, used by _parse_planning_response and the two _call_with_function fallback paths. It accepts both forms and inspects only the last meaningful line for the short form, so a plan body that merely mentions the word "ready" isn't misread, and NOT READY is never matched:

@staticmethod
def _check_ready(response: str) -> bool:
    if "READY: I am ready to execute the task." in response:
        return True
    for line in reversed(response.splitlines()):
        conclusion = line.strip().strip("*`#_-.!:> ").upper()
        if not conclusion:
            continue
        return conclusion == "READY"
    return False

Surrounding markdown emphasis/punctuation (**READY**, ## READY, READY.) is stripped and pure-separator lines (---) are skipped, since real model output (the report used an Ollama GLM model) is rarely a clean bare token.

Test plan

New TestCheckReady (15 cases) in tests/utilities/test_structured_planning.py covering full-sentence and short forms, markdown/heading/separator variants, and the critical negatives (NOT READY, ALREADY DONE, body-only "ready").

Fail-before / pass-after (source stashed, tests kept):

Before (origin/main): 13 failed, 2 passed — e.g. test_parse_planning_response_short_ready fails assert False is True.
After: 15 passed.

Full file: 38 passed; the 3 remaining failures are pre-existing optional-provider import errors (azure/anthropic/gemini native providers not installed) unrelated to this change. ruff check clean.

Summary by CodeRabbit

Bug Fixes
- Improved agent readiness detection to reliably interpret multiple “READY/NOT READY” response formats (including markdown, headings, and punctuation) while reducing false positives from unrelated text.
- Updated structured planning parsing to use the same readiness detection logic for consistent ready behavior.
Tests
- Added unit tests covering common and edge-case planning response formats, including empty responses and “ALREADY DONE” negative scenarios.

coderabbitai · 2026-06-17T20:16:31Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ff00e917-6424-494d-804b-2e2460e27c04

📥 Commits

Reviewing files that changed from the base of the PR and between 4f33415 and 119d246.

📒 Files selected for processing (2)

lib/crewai/src/crewai/utilities/reasoning_handler.py
lib/crewai/tests/utilities/test_structured_planning.py

🚧 Files skipped from review as they are similar to previous changes (2)

lib/crewai/tests/utilities/test_structured_planning.py
lib/crewai/src/crewai/utilities/reasoning_handler.py

📝 Walkthrough

Walkthrough

Replaces hardcoded substring checks for agent readiness with a new AgentReasoning._check_ready() static method that handles full-sentence and short-form READY/NOT READY outputs, normalizes Markdown formatting and punctuation, and inspects only the last meaningful line. All three prior call sites and _parse_planning_response are updated. A new TestCheckReady test suite is added.

Changes

Readiness Detection Refactor

Layer / File(s)	Summary
`_check_ready` implementation and call site updates `lib/crewai/src/crewai/utilities/reasoning_handler.py`	Adds `_check_ready(response: str) -> bool` static method with robust short-form and full-sentence detection, normalizing Markdown bold, headings, punctuation, and empty/separator lines. Updates the function-calling path (line 412), fallback path (line 436), and `_parse_planning_response` (line 619) to call `_check_ready()` instead of inline substring checks.
TestCheckReady unit tests `lib/crewai/tests/utilities/test_structured_planning.py`	Adds `TestCheckReady` covering `_check_ready` across full-sentence, short-form, trailing punctuation, `READY`/`NOT READY`, `## READY` headings, separator lines, `ALREADY DONE` false-positive guard, and empty input. Also tests `_parse_planning_response` for short-form `READY`/`NOT READY` and empty response defaults.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change—improving detection of short-form 'READY' responses in planning conclusions.
Linked Issues check	✅ Passed	The PR fully addresses issue `#6204` by implementing detection of bare 'READY' tokens and other formats that models output in practice.
Out of Scope Changes check	✅ Passed	All changes directly support the fix: new `_check_ready()` method handles improved readiness detection, and test suite validates the implementation.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/utilities/reasoning_handler.py`:
- Around line 596-602: The full-sentence READY match check currently scans the
entire response string, allowing the text "READY: I am ready to execute the
task." appearing anywhere in the response body or examples to immediately return
True and bypass the conclusion line evaluation. Move the full-sentence READY
match check to only apply to the conclusion variable (the final stripped line
being evaluated in the reversed loop), not to the entire response, so that
intermediate text in the response does not override the actual final conclusion
determination. This ensures that if the final conclusion line is "NOT READY",
the function will correctly return False instead of being short-circuited by an
earlier mention of the full sentence elsewhere in the response.

In `@lib/crewai/tests/utilities/test_structured_planning.py`:
- Around line 732-735: The test suite for AgentReasoning._check_ready is missing
a regression test for the case where a full READY sentence appears in the body
or middle of the response, but the final conclusion should still be NOT READY.
Add a new test method (similar to test_full_sentence_not_ready) that creates a
response containing the full READY sentence in the body followed by a NOT READY
conclusion, then verify that AgentReasoning._check_ready correctly returns False
for this case.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: af060d2d-276f-44d6-aff9-9e5ffae393af

📥 Commits

Reviewing files that changed from the base of the PR and between 431100d and 4f33415.

📒 Files selected for processing (2)

lib/crewai/src/crewai/utilities/reasoning_handler.py
lib/crewai/tests/utilities/test_structured_planning.py

coderabbitai · 2026-06-17T20:20:13Z

+        if "READY: I am ready to execute the task." in response:
+            return True
+        for line in reversed(response.splitlines()):
+            conclusion = line.strip().strip("*`#_-.!:> ").upper()
+            if not conclusion:
+                continue
+            return conclusion == "READY"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict the full-sentence READY match to the conclusion line.

Line 596 scans the entire response, so a body/example line containing READY: I am ready to execute the task. overrides a final NOT READY conclusion and stops refinement too early.

🐛 Proposed fix

- if "READY: I am ready to execute the task." in response: - return True for line in reversed(response.splitlines()): conclusion = line.strip().strip("*`#_-.!:> ").upper() if not conclusion: continue - return conclusion == "READY" + return conclusion in { + "READY", + "READY: I AM READY TO EXECUTE THE TASK", + } return False

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/src/crewai/utilities/reasoning_handler.py` around lines 596 - 602, The full-sentence READY match check currently scans the entire response string, allowing the text "READY: I am ready to execute the task." appearing anywhere in the response body or examples to immediately return True and bypass the conclusion line evaluation. Move the full-sentence READY match check to only apply to the conclusion variable (the final stripped line being evaluated in the reversed loop), not to the entire response, so that intermediate text in the response does not override the actual final conclusion determination. This ensures that if the final conclusion line is "NOT READY", the function will correctly return False instead of being short-circuited by an earlier mention of the full sentence elsewhere in the response.

coderabbitai · 2026-06-17T20:20:13Z

+    def test_full_sentence_not_ready(self):
+        """Full-sentence NOT READY is correctly treated as not ready."""
+        response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."
+        assert AgentReasoning._check_ready(response) is False


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a regression for full-sentence READY in the body.

This suite misses the case that currently fails: a body/example line contains the full READY sentence, but the final conclusion is NOT READY.

🧪 Proposed test

def test_full_sentence_not_ready(self): """Full-sentence NOT READY is correctly treated as not ready.""" response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear." assert AgentReasoning._check_ready(response) is False + + def test_full_sentence_ready_in_body_with_not_ready_conclusion(self): + """Full-sentence READY in the body must not override final NOT READY.""" + response = ( + "The prompt says to answer READY: I am ready to execute the task.\n\n" + "NOT READY" + ) + assert AgentReasoning._check_ready(response) is False

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_full_sentence_not_ready(self):

"""Full-sentence NOT READY is correctly treated as not ready."""

response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."

assert AgentReasoning._check_ready(response) is False

def test_full_sentence_not_ready(self):

"""Full-sentence NOT READY is correctly treated as not ready."""

response = "My plan.\n\nNOT READY: I need to refine because the tools are unclear."

assert AgentReasoning._check_ready(response) is False

def test_full_sentence_ready_in_body_with_not_ready_conclusion(self):

"""Full-sentence READY in the body must not override final NOT READY."""

response = (

"The prompt says to answer READY: I am ready to execute the task.\n\n"

"NOT READY"

)

assert AgentReasoning._check_ready(response) is False

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/tests/utilities/test_structured_planning.py` around lines 732 - 735, The test suite for AgentReasoning._check_ready is missing a regression test for the case where a full READY sentence appears in the body or middle of the response, but the final conclusion should still be NOT READY. Add a new test method (similar to test_full_sentence_not_ready) that creates a response containing the full READY sentence in the body followed by a NOT READY conclusion, then verify that AgentReasoning._check_ready correctly returns False for this case.

The refine-plan and planning prompts instruct the model to conclude with a bare "READY" or "NOT READY", but the readiness parser only recognised the full sentence "READY: I am ready to execute the task." (emitted by the initial-plan prompt). A model replying with the short form was therefore read as "not ready", so the agent kept re-planning — indefinitely when max_attempts is None (crewAIInc#6204). Add an AgentReasoning._check_ready helper that accepts both the full-sentence form and the short "READY"/"NOT READY" form on the last meaningful line, stripping markdown emphasis/punctuation and skipping separator lines, while never matching "NOT READY". Used by _parse_planning_response and the two _call_with_function fallback paths.

coderabbitai Bot reviewed Jun 17, 2026

View reviewed changes

i-anubhav-anand force-pushed the fix/reasoning-plan-ready-detection branch from 4f33415 to 119d246 Compare June 17, 2026 20:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(reasoning): detect short-form "READY" when concluding a plan#6206

fix(reasoning): detect short-form "READY" when concluding a plan#6206
i-anubhav-anand wants to merge 1 commit into
crewAIInc:mainfrom
i-anubhav-anand:fix/reasoning-plan-ready-detection

i-anubhav-anand commented Jun 17, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 17, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 17, 2026

Uh oh!

coderabbitai Bot Jun 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

i-anubhav-anand commented Jun 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

i-anubhav-anand commented Jun 17, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 17, 2026 •

edited

Loading