What happened:
Running `inference-perf` with a large `system_prompt_len` (25000 tokens) causes the tool to fail with the following error:
File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 388, in readuntil
raise ValueError("Chunk too big")
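This error originates in aiohttp's line-buffered stream reader: `readuntil()` (which backs `readline()` when iterating a streaming response) raises `ValueError("Chunk too big")` once a single line grows past a limit derived from the session's `read_bufsize`, which defaults to 2**16 (64 KiB). Below is a minimal sketch of a likely workaround, assuming the client session can be built with a larger `read_bufsize` (a real `aiohttp.ClientSession` parameter); the endpoint and payload are placeholders, not inference-perf's actual request:

```python
import asyncio

import aiohttp


async def main() -> None:
    # read_bufsize bounds how long a single buffered line may grow
    # before readline()/readuntil() raises "Chunk too big"; the
    # default is 2**16 (64 KiB). 2**20 (1 MiB) here is an arbitrary
    # larger value, not a tuned recommendation.
    async with aiohttp.ClientSession(read_bufsize=2**20) as session:
        # Placeholder endpoint and body, not inference-perf's request.
        async with session.post(
            "http://localhost:8000/v1/completions",
            json={"prompt": "x" * 100_000, "stream": True},
        ) as resp:
            async for line in resp.content:  # reads line by line
                print(line[:80])


asyncio.run(main())
```

If this is the root cause, the fix on inference-perf's side would presumably be to pass a larger `read_bufsize` (or read in fixed-size chunks instead of lines) wherever it constructs its `ClientSession`.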
How to reproduce it (as minimally and precisely as possible):
Use a `shared_prefix` workload with a large system prompt:

```yaml
type: shared_prefix
shared_prefix:
  num_groups: 20
  num_prompts_per_group: 2
  system_prompt_len: 25000
```
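A rough back-of-the-envelope check (assuming ~4 characters per token, a common heuristic for English text, not inference-perf's actual tokenizer) shows why a 25000-token system prompt overshoots aiohttp's default 64 KiB line buffer:

```python
# Assumes ~4 chars/token; the real tokenizer may differ.
CHARS_PER_TOKEN = 4
SYSTEM_PROMPT_LEN = 25_000    # tokens, from the config above
DEFAULT_READ_BUFSIZE = 2**16  # aiohttp's default (64 KiB)

prompt_bytes = SYSTEM_PROMPT_LEN * CHARS_PER_TOKEN
print(f"shared prefix ≈ {prompt_bytes:,} bytes "
      f"vs. {DEFAULT_READ_BUFSIZE:,}-byte line buffer")
print("exceeds buffer:", prompt_bytes > DEFAULT_READ_BUFSIZE)  # True
```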