Commit c6b117e

fix(tracing): address greptile review findings
- Re-check backpressure before dispatching the END task so a batch carrying both event types can't push _inflight past the concurrency cap (the semaphore was already the hard limit; this tightens the in-flight task bound to match).
- Document the retry-ordering caveat directly in _reenqueue: a re-enqueued START goes to the back of the queue and may miss a concurrently-dispatched END's barrier snapshot when retries are enabled (benign at the default max_retries=1).
1 parent 512c843 commit c6b117e

1 file changed

Lines changed: 12 additions & 1 deletion

src/agentex/lib/core/tracing/span_queue.py

@@ -236,6 +236,10 @@ async def _drain_loop(self) -> None:
             if starts:
                 self._dispatch(starts, SpanEventType.START)
             if ends:
+                # Re-check backpressure before the second dispatch so a batch
+                # carrying both event types can't push _inflight past the cap.
+                while len(self._inflight) >= self._concurrency:
+                    await asyncio.wait(set(self._inflight), return_when=asyncio.FIRST_COMPLETED)
                 self._dispatch(ends, SpanEventType.END)
 
     def _dispatch(self, items: list[_SpanQueueItem], event_type: SpanEventType) -> None:
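
The effect of the re-check in this hunk can be illustrated with a small standalone sketch. Everything below is hypothetical (the `BoundedDispatcher` class, its `peak` counter, and the `recheck` flag are illustration-only names, not part of `span_queue.py`); it only assumes, as the hunk suggests, that in-flight work is tracked as a set of asyncio tasks and compared against a concurrency cap.

```python
import asyncio


class BoundedDispatcher:
    """Toy stand-in for the dispatch path (hypothetical, not the real class)."""

    def __init__(self, concurrency: int) -> None:
        self._concurrency = concurrency
        self._inflight: set[asyncio.Task] = set()
        self.peak = 0  # highest number of in-flight tasks observed

    async def _work(self) -> None:
        await asyncio.sleep(0.01)  # simulated processor call

    def _dispatch(self) -> None:
        task = asyncio.create_task(self._work())
        self._inflight.add(task)
        # Drop the task from the in-flight set as soon as it finishes.
        task.add_done_callback(self._inflight.discard)
        self.peak = max(self.peak, len(self._inflight))

    async def _wait_for_capacity(self) -> None:
        # Same shape as the loop added in the diff.
        while len(self._inflight) >= self._concurrency:
            await asyncio.wait(set(self._inflight), return_when=asyncio.FIRST_COMPLETED)

    async def drain_batch(self, recheck: bool) -> None:
        # A batch that carries both a START and an END dispatch.
        await self._wait_for_capacity()
        self._dispatch()  # START
        if recheck:
            await self._wait_for_capacity()
        self._dispatch()  # END

    async def join(self) -> None:
        if self._inflight:
            await asyncio.gather(*self._inflight)


async def run(recheck: bool) -> int:
    d = BoundedDispatcher(concurrency=1)
    for _ in range(3):
        await d.drain_batch(recheck=recheck)
    await d.join()
    return d.peak
```

Under these assumptions, `asyncio.run(run(recheck=False))` observes two tasks in flight against a cap of one, while `asyncio.run(run(recheck=True))` stays at the cap.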
@@ -342,7 +346,14 @@ def _handle_failure(
 
     def _reenqueue(self, item: _SpanQueueItem, p: AsyncTracingProcessor) -> None:
         """Put a single failed item back on the queue, scoped to the processor
-        that failed, with an incremented attempt count."""
+        that failed, with an incremented attempt count.
+
+        NOTE: a re-enqueued START goes to the *back* of the queue. If an END
+        for the same span is dispatched concurrently before this START is picked
+        up again, the END's barrier snapshot won't contain it, breaking the
+        START-before-END guarantee for that span. This is benign at the default
+        ``max_retries=1`` (retries disabled) but must be addressed before
+        enabling retries by default."""
         try:
             self._queue.put_nowait(
                 _SpanQueueItem(
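
The ordering caveat in the new docstring comes down to FIFO behaviour: an item put back with `put_nowait` lands behind everything already queued. The sketch below is a simplified illustration only. It uses plain strings instead of `_SpanQueueItem`, a hypothetical span name, and it does not model the barrier snapshot or concurrent dispatch at all.

```python
import asyncio


async def demo_reenqueue_order() -> list[str]:
    # Hypothetical single-span scenario: START and END are both queued.
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait("START span-a")
    queue.put_nowait("END span-a")

    # The START is picked up, fails, and is put back on the queue.
    failed = queue.get_nowait()
    queue.put_nowait(failed + " (retry)")

    # Drain what is left, in the order a consumer would see it.
    order = []
    while not queue.empty():
        order.append(queue.get_nowait())
    return order
```

In this toy setup the END is consumed before the retried START, which is the reordering the docstring warns about once retries are enabled.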
