Deferred tool resume executes pending tool without re-running Python SDK callback PreToolUse hook #993

@jdanielnd

Summary

When using the Python SDK with in-process hook callbacks, a tool call deferred by a PreToolUse hook can later be resumed and executed without the PreToolUse callback being invoked on resume.

The same scenario works as expected when using settings-based command hooks, both through claude -p directly and through the Python SDK with ClaudeAgentOptions(settings=...): resume re-runs PreToolUse, and a deny decision prevents execution.

This looks specific to the Python SDK's in-process callback hook path, rather than to the Claude Code CLI or to SDK resume in general.

Environment

  • Package: claude-agent-sdk==0.2.87
  • Python: 3.12.3
  • OS: Ubuntu 24.04, Linux 6.17.0-1014-oracle, aarch64
  • claude on PATH: 2.1.148
  • Python SDK bundled Claude Code binary: 2.1.150

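The SDK README says the SDK uses the bundled CLI by default, so the SDK runs below should have used the bundled 2.1.150 binary rather than the 2.1.148 binary on PATH.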

Relevant API Surface

query(*, prompt: str | AsyncIterable[dict[str, Any]], options: ClaudeAgentOptions | None = None, transport: Transport | None = None)
ClaudeSDKClient.connect(prompt: str | AsyncIterable[dict[str, Any]] | None = None)
ClaudeSDKClient.query(prompt: str | AsyncIterable[dict[str, Any]], session_id: str = "default")

query() does not accept None for prompt, so the SDK repro uses an empty async iterable for the resume call.
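As a minimal sketch of what "an empty async iterable" means here: an async generator whose only `yield` is unreachable still counts as an async generator, so iterating it finishes immediately with no items. The `drain` helper is hypothetical and only exists to show the behavior; it is not part of the SDK.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any


async def empty_prompt() -> AsyncIterator[dict[str, Any]]:
    """Async generator that finishes without yielding anything."""
    if False:
        # Unreachable, but the presence of `yield` makes this an async generator.
        yield {}


async def drain(stream: AsyncIterator[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: collect every item from an async iterator."""
    return [item async for item in stream]


items = asyncio.run(drain(empty_prompt()))
```

Passing `empty_prompt()` as `prompt` satisfies the `AsyncIterable[dict[str, Any]]` type without sending any new user message.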

Expected Behavior

For a deferred tool call:

  1. Initial run calls a tool.
  2. PreToolUse returns permissionDecision: "defer".
  3. Result has stop_reason: "tool_deferred" and deferred_tool_use.
  4. Caller resumes the same session.
  5. The same pending tool call re-fires PreToolUse.
  6. If the resumed hook returns deny, the tool does not execute.
  7. If the resumed hook returns allow, the tool executes.

This matches the Claude Code hooks docs and direct CLI behavior.
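Step 3 can be checked on the caller side with something like the sketch below. It relies only on the `stop_reason` and `deferred_tool_use` fields named above; the helper names `get` and `deferred_tool` are hypothetical, and the result is read as either a dict or an object because both shapes appear in this report.

```python
from types import SimpleNamespace
from typing import Any


def get(obj: Any, name: str) -> Any:
    """Read a field from either a dict or an attribute-style object."""
    return obj.get(name) if isinstance(obj, dict) else getattr(obj, name, None)


def deferred_tool(result: Any) -> Any:
    """Return the deferred tool-use payload if `result` is a deferral, else None."""
    if get(result, "stop_reason") != "tool_deferred":
        return None
    return get(result, "deferred_tool_use")


# Illustrative values only; a real result comes from the SDK or the CLI.
finished = SimpleNamespace(stop_reason="end_turn", deferred_tool_use=None)
deferred = {"stop_reason": "tool_deferred", "deferred_tool_use": {"name": "Bash"}}
```

A caller would store the returned payload, collect a decision, and only then move on to step 4.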

Actual Behavior With Python SDK In-Process Hook

  1. Initial SDK run defers correctly.
  2. Result has stop_reason: "tool_deferred" and deferred_tool_use.
  3. Resuming the session with an empty async iterable executes the deferred tool.
  4. The resumed PreToolUse callback is not invoked.
  5. A hook callback that would return deny never gets a chance to deny.

In a marker-file repro, the marker file is created during resume even though the resumed SDK hook is configured to deny all matching Bash tool calls.

Control Tests: Settings-Based Command Hooks Work Correctly

I tested the same flow with settings-based command hooks.
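The hook script and settings file used for these control tests are not shown in this report. The following is a hypothetical sketch of an equivalent setup, assuming the usual Claude Code settings shape for command hooks and assuming the hook receives `hook_event_name`, `tool_name`, and `tool_use_id` in its stdin JSON. The `phase` argument stands in for whatever mechanism switches the hook from defer to deny between runs, and all function names are made up for illustration.

```python
import json
from typing import Any


def build_settings(hook_command: str) -> dict[str, Any]:
    """Settings payload registering one PreToolUse command hook for Bash (assumed shape)."""
    return {
        "hooks": {
            "PreToolUse": [
                {
                    "matcher": "Bash",
                    "hooks": [{"type": "command", "command": hook_command}],
                }
            ]
        }
    }


def handle(stdin_text: str, phase: str) -> tuple[dict[str, Any], str]:
    """Body of a hypothetical hook script.

    Returns the entry to append to a hook log and the JSON text to print on stdout.
    """
    payload = json.loads(stdin_text)
    log_entry = {
        "phase": phase,
        "event": payload.get("hook_event_name"),
        "tool_name": payload.get("tool_name"),
        "tool_use_id": payload.get("tool_use_id"),
    }
    output = {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": phase,  # "defer", "deny", or "allow"
            "permissionDecisionReason": f"{phase} repro",
        }
    }
    return log_entry, json.dumps(output)
```

Because the hook is declared in settings, it is known to the CLI from process startup, which is the property the control tests depend on.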

Direct SDK-Bundled CLI

Binary:

/path/to/.venv/lib/python3.12/site-packages/claude_agent_sdk/_bundled/claude

Version:

2.1.150 (Claude Code)

First run:

{
  "stop_reason": "tool_deferred",
  "session_id": "e29e7476-d08c-4f18-9432-84fe7df3bbda",
  "deferred_tool_use": {
    "id": "toolu_017S75KPULSQhjVBVQpihRSG",
    "name": "Bash",
    "input": {
      "command": "python3 -c \"from pathlib import Path; Path('/tmp/.../executed-marker.txt').write_text('ran'); print('ufficio_bundled_cli_deny_marker_7b20bb0e')\"",
      "description": "Write marker file and print identifier"
    }
  },
  "marker_exists_after_first": false
}

Resume with the hook phase changed to deny:

{
  "stop_reason": "end_turn",
  "result": "The command was denied by the environment, so I cannot produce its output.",
  "marker_exists_after_deny_resume": false,
  "hook_log": [
    {
      "phase": "defer",
      "event": "PreToolUse",
      "tool_name": "Bash",
      "tool_use_id": "toolu_017S75KPULSQhjVBVQpihRSG"
    },
    {
      "phase": "deny",
      "event": "PreToolUse",
      "tool_name": "Bash",
      "tool_use_id": "toolu_017S75KPULSQhjVBVQpihRSG"
    }
  ]
}

Result: direct CLI re-runs PreToolUse; deny prevents execution.

Python SDK With ClaudeAgentOptions(settings=...)

I also tested Python SDK query() while using a settings file that defines a command hook, with no Python callback hooks.

First run:

{
  "stop_reason": "tool_deferred",
  "session_id": "b56ed882-c4bf-4f28-be78-c8ed81ada224",
  "deferred": {
    "id": "toolu_013ZQwExFFxkoGyWcJcjr4E6",
    "name": "Bash",
    "input": {
      "command": "python3 -c \"from pathlib import Path; Path('/tmp/.../executed-marker.txt').write_text('ran'); print('ufficio_sdk_settings_deny_marker_a3587c56')\"",
      "description": "Run python command to write marker file"
    }
  },
  "marker_exists_after_first": false,
  "hook_log": [
    {
      "phase": "defer",
      "event": "PreToolUse",
      "tool_name": "Bash",
      "tool_use_id": "toolu_013ZQwExFFxkoGyWcJcjr4E6"
    }
  ]
}

Resume via query(prompt=empty_prompt(), options=ClaudeAgentOptions(resume=session_id, settings=settings_file, ...)) with the hook phase changed to deny:

{
  "stop_reason": "end_turn",
  "subtype": "success",
  "result": "The command was denied by a hook (\"sdk settings deny marker test\") and did not execute successfully, so there is no command output to report.",
  "marker_exists_after_deny_resume": false,
  "hook_log": [
    {
      "phase": "defer",
      "event": "PreToolUse",
      "tool_name": "Bash",
      "tool_use_id": "toolu_013ZQwExFFxkoGyWcJcjr4E6"
    },
    {
      "phase": "deny",
      "event": "PreToolUse",
      "tool_name": "Bash",
      "tool_use_id": "toolu_013ZQwExFFxkoGyWcJcjr4E6"
    }
  ]
}

Result: Python SDK resume works correctly when hooks are supplied through settings; the hook is re-run and the deny decision blocks execution. The failure appears specific to Python callback hooks supplied through ClaudeAgentOptions(hooks=...).

Python SDK Repro Result

First SDK run with Python callback hooks:

{
  "stop_reason": "tool_deferred",
  "session_id": "f3aea1f7-3f55-4b6e-9495-10a93d1f603e",
  "deferred": {
    "id": "toolu_01EmiiT3R2NCBpvhbydqJ23L",
    "name": "Bash",
    "input": {
      "command": "mkdir -p /tmp/... && python3 -c \"from pathlib import Path; Path('/tmp/.../executed-marker.txt').write_text('ran'); print('ufficio_sdk_callback_deny_marker_8c702178')\"",
      "description": "Create parent directory and run python command"
    }
  },
  "marker_exists_after_first": false,
  "hook_calls": [
    {
      "phase": "defer",
      "tool_use_id": "toolu_01EmiiT3R2NCBpvhbydqJ23L",
      "tool_name": "Bash"
    }
  ]
}

Resume via query(prompt=empty_prompt(), options=ClaudeAgentOptions(resume=session_id, hooks=...deny...)):

{
  "error": null,
  "message_types": ["UserMessage", "SystemMessage", "AssistantMessage", "ResultMessage"],
  "stop_reason": "end_turn",
  "result": "ufficio_sdk_callback_deny_marker_8c702178",
  "marker_exists_after_deny_resume": true,
  "marker_content": "ran",
  "hook_calls": [
    {
      "phase": "defer",
      "tool_use_id": "toolu_01EmiiT3R2NCBpvhbydqJ23L",
      "tool_name": "Bash"
    }
  ]
}

Result: the pending tool executed and created the marker file. The resumed deny hook was not called.

Additional Observation

Calling the top-level SDK helper as query(prompt=None, options=ClaudeAgentOptions(resume=session_id, ...)) is not supported by the type signature and timed out in a live test. The viable resume path through query() appears to be an empty async iterable.

Suspected Cause

The observed behavior appears consistent with SDK callback hooks being registered too late for resumed deferred tool calls.

In the Python SDK query() path, the subprocess is started with --resume before SDK callback hooks are sent over the control protocol:

  1. InternalClient._process_query_inner(...) creates SubprocessCLITransport(prompt=prompt, options=configured_options).
  2. It calls await chosen_transport.connect(), which starts the Claude Code subprocess.
  3. SubprocessCLITransport._build_command() includes --resume <session_id> and --settings <...> in the CLI command line.
  4. Only after the subprocess is connected does the SDK create Query(..., hooks=...).
  5. Then query.initialize() sends the Python callback hook IDs over stdin via the SDK control protocol.

Settings-based command hooks are available to the CLI at process startup because they are passed through --settings, so they are already registered when the deferred tool is replayed.

Python callback hooks supplied through ClaudeAgentOptions(hooks=...) are not available until after the subprocess starts and the SDK initialize request is processed. If Claude Code replays the deferred tool during resume startup before that initialize request installs the callback hook mapping, the pending tool executes without invoking the Python callback.

That ordering explains all observed cases:

  • Direct CLI + settings hook: works.
  • SDK + settings=... command hook: works.
  • SDK + Python callback hook: deferred resume executes without calling the resumed callback.

Minimal Reproduction Shape

import asyncio
from pathlib import Path
from uuid import uuid4
from typing import Any

from claude_agent_sdk import ClaudeAgentOptions, HookMatcher, query


base = Path("/tmp/claude-sdk-defer-repro") / str(uuid4())
work = base / "work"
config = base / "config"
work.mkdir(parents=True)
config.mkdir(parents=True)
marker = base / "executed-marker.txt"
token = "sdk_defer_repro_" + uuid4().hex[:8]
command = (
    "python3 -c "
    f"\"from pathlib import Path; Path({str(marker)!r}).write_text('ran'); "
    f"print({token!r})\""
)
hook_calls = []


def field(obj: Any, name: str) -> Any:
    # Hook input may be a dict or an attribute-style object; read a field from either.
    return obj.get(name) if isinstance(obj, dict) else getattr(obj, name, None)


def options(hook, resume: str | None = None) -> ClaudeAgentOptions:
    kwargs = {
        "system_prompt": "Use Bash exactly once when asked.",
        "cwd": str(work),
        "env": {
            # set ANTHROPIC_API_KEY here or inherit it
            "CLAUDE_CONFIG_DIR": str(config),
        },
        "setting_sources": [],
        "tools": ["Bash"],
        "allowed_tools": ["Bash"],
        "hooks": {"PreToolUse": [HookMatcher(matcher=None, hooks=[hook])]},
        "max_turns": 4,
    }
    if resume:
        kwargs["resume"] = resume
    return ClaudeAgentOptions(**kwargs)


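async def empty_prompt():
    # Empty async iterable: the unreachable `yield` makes this an async
    # generator, so the resume call sends no new user message.
    if False:
        yield {}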


async def collect(prompt, opts):
    messages = []
    async for message in query(prompt=prompt, options=opts):
        messages.append(message)
    return messages


async def main():
    async def defer_hook(hook_input, tool_use_id, _context):
        hook_calls.append(("defer", tool_use_id, field(hook_input, "tool_input")))
        return {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": "defer",
                "permissionDecisionReason": "defer repro",
            }
        }

    first = await collect(
        f"Use Bash exactly once to run this exact command: {command}.",
        options(defer_hook),
    )
    first_result = next(m for m in first if m.__class__.__name__ == "ResultMessage")
    assert first_result.stop_reason == "tool_deferred"
    assert first_result.deferred_tool_use is not None
    assert not marker.exists()

    async def deny_hook(hook_input, tool_use_id, _context):
        hook_calls.append(("deny", tool_use_id, field(hook_input, "tool_input")))
        return {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": "deny",
                "permissionDecisionReason": "deny repro",
            }
        }

    second = await collect(
        empty_prompt(),
        options(deny_hook, resume=first_result.session_id),
    )
    second_result = next(m for m in second if m.__class__.__name__ == "ResultMessage")

    print("second result:", second_result.result)
    print("marker exists:", marker.exists())
    print("hook calls:", hook_calls)

    # Expected:
    # marker exists: False
    # hook calls includes a "deny" call for the resumed deferred tool
    #
    # Actual observed:
    # marker exists: True
    # hook calls does not include any "deny" call


asyncio.run(main())

Impact

This affects applications that use Python SDK in-process hooks to implement out-of-process or human-in-the-loop approval.

If an application stores deferred_tool_use, asks a user for approval, and later resumes the Claude session with SDK hooks configured to allow or deny based on the user's decision, the resumed SDK hook may be unable to enforce a rejection: the deferred tool may execute without the callback ever being called.

Workaround in my application:

  • If the user approves, resume the Claude session.
  • If the user rejects, do not resume the Claude session.
  • Treat approval as a platform decision made before resume, not as a decision made by a resumed SDK hook.

That workaround is safe for rejection, but it means the SDK in-process callback hook behavior does not match settings-based command hook behavior or the documented defer/resume model.
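A minimal sketch of that workaround, with the approval decision made before any resume happens. The function name `resolve_deferred` and the injected `resume` callable are hypothetical; in a real application `resume` would wrap the SDK resume call shown in the reproduction.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any


async def resolve_deferred(
    session_id: str,
    approved: bool,
    resume: Callable[[str], Awaitable[Any]],
) -> Any:
    """Resume the session only after approval; a rejection never touches the session."""
    if not approved:
        # Rejected: do not resume, so the pending tool cannot run.
        return None
    return await resume(session_id)


# Stand-in for the real resume call, used only to show the control flow.
resumed_sessions: list[str] = []


async def fake_resume(session_id: str) -> str:
    resumed_sessions.append(session_id)
    return "resumed"
```

The gate lives entirely in application code, so it does not depend on when the SDK registers callback hooks.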

Question

Is this a known limitation of Python SDK in-process callback hooks with deferred resume, or should query(..., resume=session_id, hooks=...) re-register SDK callback hooks in time for the pending deferred tool call to re-fire PreToolUse?

Labels: bug (Something isn't working)
