Orchestrations get stuck awaiting completions of activities that are launched in parallel #681

@PDHCoder

In short, I want to implement an orchestration semaphore using a Durable Entity that limits the number of concurrent calls to a particular resource provider (in a real-world function app, this would limit the number of concurrent LLM calls).

In the attached demo app, I added an HTTP-triggered function that is called with a POST request (e.g. http://localhost:7238/api/start_orchestration/123456) and starts an orchestration of type MainOrchestrator. That orchestration starts a number of page processing activities in parallel (batched up to a certain limit). When these have all completed, 2 orchestrations of type SubOrchestrator are started, which in turn start a number of text processing activities in parallel (also batched up to a certain limit).
Each activity, i.e. each page processing activity and each text processing activity, is guarded by an orchestration semaphore (cf. the helper class GlobalLlmLimiterSemaphore, which limits the number of concurrent activities to 100).
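For readers who do not have the demo solution at hand, the semaphore idea described above can be modeled in plain Python. This is only a sketch under stated assumptions: the class name `SemaphoreState` and its `acquire`/`release` methods are hypothetical, it is not the `GlobalLlmLimiterSemaphore` helper from the attached solution, and it uses no Durable Functions API. It only shows the kind of state a semaphore entity would have to persist: the current slot holders and a FIFO queue of waiters.

```python
from __future__ import annotations

from collections import deque
from typing import Optional


class SemaphoreState:
    """Hypothetical in-memory model of the state a semaphore entity might persist."""

    def __init__(self, max_concurrent: int) -> None:
        if max_concurrent < 1:
            raise ValueError("max_concurrent must be at least 1")
        self.max_concurrent = max_concurrent
        self.holders: set[str] = set()      # ids that currently hold a slot
        self.waiters: deque[str] = deque()  # ids waiting for a slot, in FIFO order

    def acquire(self, requester_id: str) -> bool:
        """Grant a slot if one is free; otherwise queue the requester."""
        if requester_id in self.holders:
            # Idempotent: a repeated request must not consume a second slot.
            return True
        if len(self.holders) < self.max_concurrent:
            self.holders.add(requester_id)
            return True
        if requester_id not in self.waiters:
            self.waiters.append(requester_id)
        return False

    def release(self, requester_id: str) -> Optional[str]:
        """Free the requester's slot; return the waiter that inherits it, if any."""
        if requester_id not in self.holders:
            # Unknown or duplicate release: ignore it.
            return None
        self.holders.remove(requester_id)
        if self.waiters:
            next_id = self.waiters.popleft()
            self.holders.add(next_id)
            return next_id
        return None


if __name__ == "__main__":
    sem = SemaphoreState(max_concurrent=2)  # the demo app uses a limit of 100
    print(sem.acquire("page-1"), sem.acquire("page-2"), sem.acquire("page-3"))
    print(sem.release("page-1"))
```

In this model, a requester whose `acquire` returns `False` has to be notified later, when a `release` hands it the freed slot. If that notification, or the release itself, never arrives, the waiter stays queued indefinitely.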
The problem is this: I often notice, both in local development and on Azure, that the orchestrations get stuck awaiting the completion of activities, even though all activities DO complete and no exceptions of any kind are thrown.
It seems that, at a certain point, the completion of activities no longer triggers a replay of the orchestrator function.
Can someone explain what I am doing wrong, or is this a bug?

The demo solution can be downloaded from https://www.dropbox.com/scl/fi/0l9abvmnec4r5i4bc3e3s/OrchestrationSemaphore.zip?rlkey=hsi4xq9wu0z3jr8aadyf47328&dl=0
I have also attached a screenshot from the Visual Studio Code Durable Functions extension that shows how one SubOrchestrator completed and another did not. Since one SubOrchestrator never completes, the MainOrchestrator never completes either: https://www.dropbox.com/scl/fi/wyxawmvtvehwdnsmeru5y/2026-03-19_08h42_40.png?rlkey=72q3v6xzgt8obfl221skb1wvk&dl=0
Note that 2 runs, each with the MainOrchestrator and 2 SubOrchestrators, completed without any issues; the 3rd run got stuck. The fact that the main and sub-orchestrations can run successfully already rules out some possible causes.
