add enable_prefill_decode_mixed start args by hiworldwzj · Pull Request #1315 · ModelTC/LightLLM

hiworldwzj · 2026-05-22T01:21:03Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces the --enable_prefill_decode_mixed flag, enabling the simultaneous execution of prefill and decode requests in a single scheduling step when run_mode is normal. This feature aims to enhance throughput by allowing decode progress even when new prefill requests are present. The implementation includes documentation updates, CLI argument additions, and logic in the base backend to merge request types when batch limits allow. Review feedback highlights the need for more precise token counting for decode requests to accommodate speculative decoding and suggests enforcing the documented incompatibility with microbatch overlap features.

gemini-code-assist · 2026-05-22T01:22:18Z

+        if self.args.enable_prefill_decode_mixed and len(prefill_reqs) > 0 and len(decode_reqs) > 0:
+            if prefill_tokens + len(decode_reqs) <= self.batch_max_tokens:
+                prefill_reqs.extend(decode_reqs)
+                decode_reqs = []


Using len(decode_reqs) as the token count for decode requests is only accurate if each request requires exactly one token. However, when features like Multi-Token Prediction (MTP) or speculative decoding are enabled, a decode request may require multiple tokens per step. It is safer to calculate the total required tokens using decode_need_token_num() to ensure the batch_max_tokens limit is not exceeded.

Suggested change

-        if self.args.enable_prefill_decode_mixed and len(prefill_reqs) > 0 and len(decode_reqs) > 0:
-            if prefill_tokens + len(decode_reqs) <= self.batch_max_tokens:
-                prefill_reqs.extend(decode_reqs)
-                decode_reqs = []
+        if self.args.enable_prefill_decode_mixed and len(prefill_reqs) > 0 and len(decode_reqs) > 0:
+            decode_tokens = sum(req.decode_need_token_num() for req in decode_reqs)
+            if prefill_tokens + decode_tokens <= self.batch_max_tokens:
+                prefill_reqs.extend(decode_reqs)
+                decode_reqs = []
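
To make the difference between counting requests and counting tokens concrete, here is a minimal standalone sketch of the merge check. `FakeReq` and `merge_if_fits` are hypothetical names invented for illustration, not LightLLM classes or functions. The method name `decode_need_token_num()` comes from the review comment, and its behaviour here is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FakeReq:
    """Hypothetical stand-in for a scheduler request (not LightLLM's real class)."""

    # Tokens consumed per decode step; a value above 1 models MTP or
    # speculative decoding. This field is an assumption for illustration.
    tokens_per_decode_step: int = 1

    def decode_need_token_num(self) -> int:
        # Name taken from the review comment; the real method may differ.
        return self.tokens_per_decode_step


def merge_if_fits(
    prefill_reqs: List[FakeReq],
    decode_reqs: List[FakeReq],
    prefill_tokens: int,
    batch_max_tokens: int,
    enabled: bool = True,
) -> Tuple[List[FakeReq], List[FakeReq]]:
    """Fold decode requests into the prefill batch when the token budget allows."""
    if enabled and prefill_reqs and decode_reqs:
        # Sum the tokens each decode request needs instead of counting requests.
        decode_tokens = sum(req.decode_need_token_num() for req in decode_reqs)
        if prefill_tokens + decode_tokens <= batch_max_tokens:
            return prefill_reqs + decode_reqs, []
    # Otherwise leave both lists unchanged.
    return prefill_reqs, decode_reqs
```

With a 6-token prefill, two decode requests needing 3 tokens each, and a budget of 8, a request count (6 + 2 = 8) would allow the merge, while the token sum (6 + 6 = 12) keeps the batches separate.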

gemini-code-assist · 2026-05-22T01:22:18Z

+    if args.enable_prefill_decode_mixed:
+        assert args.run_mode == "normal", "--enable_prefill_decode_mixed only supports run_mode normal"


The documentation for --enable_prefill_decode_mixed explicitly states that it cannot be used together with --enable_prefill_microbatch_overlap or --enable_decode_microbatch_overlap. This constraint should be enforced here to prevent invalid configurations.

Suggested change

-    if args.enable_prefill_decode_mixed:
-        assert args.run_mode == "normal", "--enable_prefill_decode_mixed only supports run_mode normal"
+    if args.enable_prefill_decode_mixed:
+        assert args.run_mode == "normal", "--enable_prefill_decode_mixed only supports run_mode normal"
+        assert not args.enable_prefill_microbatch_overlap and not args.enable_decode_microbatch_overlap, \
+            "--enable_prefill_decode_mixed cannot be used with microbatch overlap"

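The same constraint can be sketched as a standalone check. This is an illustration under assumptions, not the project's actual startup code: `build_parser` and `validate_mixed_args` are hypothetical helpers, the `"normal"` default for `--run_mode` is assumed, and the sketch raises `ValueError` rather than using `assert`, because assertions are skipped when Python runs with `-O`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Minimal stand-in parser containing only the flags named in this review.
    parser = argparse.ArgumentParser()
    parser.add_argument("--run_mode", default="normal")  # default is assumed
    parser.add_argument("--enable_prefill_decode_mixed", action="store_true")
    parser.add_argument("--enable_prefill_microbatch_overlap", action="store_true")
    parser.add_argument("--enable_decode_microbatch_overlap", action="store_true")
    return parser


def validate_mixed_args(args: argparse.Namespace) -> None:
    """Reject configurations the flag's documentation describes as unsupported."""
    if not args.enable_prefill_decode_mixed:
        return
    if args.run_mode != "normal":
        raise ValueError("--enable_prefill_decode_mixed only supports run_mode normal")
    if args.enable_prefill_microbatch_overlap or args.enable_decode_microbatch_overlap:
        raise ValueError(
            "--enable_prefill_decode_mixed cannot be used with microbatch overlap"
        )
```

Calling `validate_mixed_args(build_parser().parse_args())` right after parsing would surface an invalid combination before any backend starts.
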
add enable_prefill_decode_mixed start args

5c12c4d

gemini-code-assist Bot reviewed May 22, 2026

fix

bc67708

hiworldwzj merged commit eaa3b28 into main May 22, 2026
1 check passed

hiworldwzj deleted the wzj_dev branch May 22, 2026 07:59

hiworldwzj merged 2 commits into main from wzj_dev
