Batch API: tool_use input wrapped in $PARAMETER_NAME placeholder key (claude-opus-4-7)

## Summary

When using the Messages Batch API with forced `tool_use` (`tool_choice={"type":"tool","name":"..."}`) on `claude-opus-4-7`, the returned tool input is frequently wrapped in a single key `$PARAMETER_NAME` instead of containing the schema's actual property names at the top level. The same prompt and schema, sent via `client.messages.create` (sync), returns the correct top-level keys 100% of the time.

## Environment

- SDK: `anthropic` (Python), latest stable
- Model: `claude-opus-4-7`
- Endpoint: `client.messages.batches.create` / `.retrieve` / `.results`
- Tool config: `tool_choice={"type":"tool","name":"save_topic_synthesis"}` (forced)

## Expected vs Observed

**Expected:**
```python
tool_use.input == {
  "tldr": "...",
  "mechanism_summary": "...",
  "evidence_strength": "emerging",
  ...
}
```

**Observed (most items):**
```python
tool_use.input == {
  "$PARAMETER_NAME": {
    "tldr": "...",
    "mechanism_summary": "...",
    ...
  }
}
```

The literal string `$PARAMETER_NAME` does not appear anywhere in our prompt, tool schema, or system message.

## Distribution from a single batch (10 messages, all `stop_reason=tool_use`)

| Request | Top-level keys | Output tokens |
|---|---|---|
| 1 | `$PARAMETER_NAME` (full inner) | 4573 |
| 2 | `$PARAMETER_NAME` (full inner) | 5668 |
| 3 | `$PARAMETER_NAME` (full inner) | 5221 |
| 4 | `$PARAMETER_NAME` (full inner) | 5513 |
| 5 | `$PARAMETER_NAME` (full inner) | 4853 |
| 6 | mixed: `$PARAMETER_NAME` + top-level `tldr`/`mechanism_summary` | 5525 |
| 7 | `topic` (different wrap key, single-key dict) | 4929 |
| 8 | `$PARAMETER_NAME` (**empty inner dict**) | **49** |
| 9 | top-level (correct) | 5353 |
| 10 | top-level (correct) | 5915 |

8/10 had non-conforming top-level shape; 2/10 conformed. Pattern repeated in a second independent batch with different inputs (same 10 messages re-submitted on a different day after schema/prompt changes): again 1 item came back stuck (`$PARAMETER_NAME: {}`, ~50 output tokens, smallest input).

## Possibly related: tool_use that stops early with empty wrapper

Item 8 above: `stop_reason="tool_use"` but only 49 output tokens emitted, and the `$PARAMETER_NAME` wrapper was an empty dict. Recurring across our two batches: the failing item is consistently the one with the smallest prompt input (~10K tokens vs. ~17K average). Both `stop_reason` and `tool_use` block presence suggest the model thinks it completed successfully.

## Reproducer (sketch)

```python
import anthropic
client = anthropic.Anthropic()

schema = {
    "type": "object",
    "properties": {
        "tldr": {"type": "string", "description": "..."},
        "mechanism_summary": {"type": "string"},
        "evidence_strength": {"type": "string", "enum": ["consolidated","emerging","speculative"]},
        # ...several more required string/array fields...
    },
    "required": ["tldr", "mechanism_summary", "evidence_strength", ...],
}

batch = client.messages.batches.create(requests=[
    {
        "custom_id": f"item-{i}",
        "params": {
            "model": "claude-opus-4-7",
            "max_tokens": 8000,
            "system": [{"type": "text", "text": "<system prompt>", "cache_control": {"type": "ephemeral"}}],
            "tools": [{
                "name": "save_topic_synthesis",
                "description": "Save structured composition.",
                "input_schema": schema,
            }],
            "tool_choice": {"type": "tool", "name": "save_topic_synthesis"},
            "messages": [{"role": "user", "content": "<long prompt with ~15-20K tokens of structured input>"}],
        },
    }
    for i in range(10)
])
# wait for batch.processing_status == "ended", then for line in client.messages.batches.results(batch.id):
#   inspect line.result.message.content[0].input
# Expect: most items wrap content in {"$PARAMETER_NAME": {...}}
```

## Impact

Without client-side defensive unwrapping, the majority of batch responses fail strict downstream validation (Pydantic, JSON Schema validators) because expected fields are nested one level too deep. Forces every batch consumer to add a workaround like:

```python
if len(input) == 1:
    sole = next(iter(input.values()))
    if isinstance(sole, dict) and "<sentinel_field>" in sole:
        input = sole
```

Compounded with the stuck-early issue, the practical failure rate on a fresh batch is ~10% even with that workaround.

## Workaround in place on our side

Detect `len(input)==1` and unwrap; on `<sentinel>=None` (stuck output), re-issue the same request as a synchronous `messages.create` and accept the result. This pattern is described and committed at our repo if useful as reference.

## Ask

Could the SDK / Batch API surface either (a) normalize the response shape to match sync mode, or (b) document the wrap behavior so consumers know to expect it? Currently the SDK type hints suggest top-level fields per the declared schema, which is misleading.

---

## Update (2026-05-29): retested on `claude-opus-4-8` + minimal reproducer

Re-tested after the 4.8 release. **The `$PARAMETER_NAME` single-key wrapper no longer reproduces on `claude-opus-4-8`** (0/10 on the original workload that previously showed 8/10). However, the root issue — tool-input serialization leaking into `tool_use.input` — persists in a new form, and a minimal reproducer narrows the trigger.

**Trigger isolated:** a nested `array<object>` sub-schema. A flattened schema (scalars / arrays-of-scalars only) was clean in **28/28** runs across both models and both transports. The nested schema breaks on both.

**Failure mode by version:**
- `opus-4-7`: top-level wrapped in a single non-schema key — `$PARAMETER_NAME`, `topic`, **or `input`** (the last not in the original report).
- `opus-4-8`: internal tool-call markup (`<parameter name="...">` / `</parameter>`) leaks *into a string field value*, absorbing the following sibling field — so a `required` field silently disappears while the object still looks well-formed.

**Corrections to the original report:**
1. **Not batch-only.** It also reproduces via `client.messages.create` (sync), at a lower rate than batch. The original "sync 100% correct" was likely a small-sample / schema-specific effect.
2. Wrapper-key set on 4.7 includes `input` in addition to `$PARAMETER_NAME` and `topic`.

Environment: SDK `anthropic` 0.104.1, `anthropic-version: 2023-06-01`. Reproducer script and per-request IDs available on request.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch API: tool_use input wrapped in $PARAMETER_NAME placeholder key (claude-opus-4-7) #1607

Summary

Environment

Expected vs Observed

Distribution from a single batch (10 messages, all `stop_reason=tool_use`)

Possibly related: tool_use that stops early with empty wrapper

Reproducer (sketch)

Impact

Workaround in place on our side

Ask

Update (2026-05-29): retested on `claude-opus-4-8` + minimal reproducer

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Request	Top-level keys	Output tokens
1	`$PARAMETER_NAME` (full inner)	4573
2	`$PARAMETER_NAME` (full inner)	5668
3	`$PARAMETER_NAME` (full inner)	5221
4	`$PARAMETER_NAME` (full inner)	5513
5	`$PARAMETER_NAME` (full inner)	4853
6	mixed: `$PARAMETER_NAME` + top-level `tldr`/`mechanism_summary`	5525
7	`topic` (different wrap key, single-key dict)	4929
8	`$PARAMETER_NAME` (empty inner dict)	49
9	top-level (correct)	5353
10	top-level (correct)	5915

Batch API: tool_use input wrapped in $PARAMETER_NAME placeholder key (claude-opus-4-7) #1607

Description

Summary

Environment

Expected vs Observed

Distribution from a single batch (10 messages, all stop_reason=tool_use)

Possibly related: tool_use that stops early with empty wrapper

Reproducer (sketch)

Impact

Workaround in place on our side

Ask

Update (2026-05-29): retested on claude-opus-4-8 + minimal reproducer

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Distribution from a single batch (10 messages, all `stop_reason=tool_use`)

Update (2026-05-29): retested on `claude-opus-4-8` + minimal reproducer