Skip to content

Comments

Allow cosmos tests to run in parallel v2#37766

Draft
JoasE wants to merge 18 commits intodotnet:mainfrom
JoasE:feature/cosmos-test-paralellism-v2
Draft

Allow cosmos tests to run in parallel v2#37766
JoasE wants to merge 18 commits intodotnet:mainfrom
JoasE:feature/cosmos-test-paralellism-v2

Conversation

@JoasE
Copy link
Contributor

@JoasE JoasE commented Feb 21, 2026

With some spare time, I've been experimenting around with the cosmos functional tests to see if I can get them to run in parallel. With an old attempt (See: #37696) being possibly unstable in the future or with a certain combination of tests, this PR aims to allow parallel test execution by manually controlling parallelism taking into account limitations of the emulator when needed.

The biggest challenge for parallel execution appeared to be the cosmos db emulator, which can become unstable in certain scenario's.
After some testing I found that the create container implementation in the emulator has a race condition, and can crash the emulator when ran in parallel. Other CRUD requests for containers appear to have no direct effect on the running of the emulator, but parallel requests can cause the emulator to have to clear the data upon restart.
In addition, the docs state that the emulator could show performance degradation with more than 10 concurrent containers.
On top of that, my tests have shown the emulator will stop responding to create container requests at some point when creating many containers. On my local machine this happens consistently after the creation of the 25th container.

This means that when running the tests against the emulator we can't execute CRUD container requests in parallel and ideally no more than 10 containers exist at the same time, and never more than 25.

This could be fixed by just decreasing the parallelism using CollectionBehavior(MaxParallelThreads = N), but this is risky as it will cause errors or slow tests when many tests are added that create a lot of containers and those happen to run in parallel. See: #37696

This PR implements manual parallelism control in CosmosEmulatorTestStore based on the number of containers each test store is attempting to create. This is both faster and prevents possible errors, but there are a couple of downsides, being mainly:

  • Test execution time of non-shared model tests are affected by the wait time introduced by the manual parallelism control.
  • Complex-ish logic simply for running tests faster, albeit only scoped to tests running against the emulator
  • Cloud tests also have manual parallelism control, albeit much simpler.

Some other challenges were:

  • Different test classes might use the same shared test fixture, so we have to keep track of which stores are being used by which tests so we know when we can delete them.
  • Some non-shared model tests use the same test store, as they inherit from other non-shared model tests. These tests can't always be ran in parallel against the same store, so they need a unique store.
  • Some tests create containers via other means than the test store, which could cause parallel create container requests. A lock is needed for these tests to prevent errors

There was a custom import for a cosmos specific Northwind database, which didn't appear to be needed. The custom import was very slow compared to using savechange's new batching, so I removed it and had to slightly change 1 test.

This PR decreases total test run time:
locally:
from 26 to 12.5 minutes
pipeline:
From ~30+ minutes to ~20 minutes total run time

Run against real cosmos db: Done

@JoasE JoasE changed the title Allow cosmos tests to run in parallel v2 WIP: Allow cosmos tests to run in parallel v2 Feb 21, 2026
@JoasE JoasE marked this pull request as ready for review February 22, 2026 13:47
@JoasE JoasE requested a review from a team as a code owner February 22, 2026 13:47
Copilot AI review requested due to automatic review settings February 22, 2026 13:47
@JoasE JoasE marked this pull request as draft February 22, 2026 13:47
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to enable parallel execution of EF Core Cosmos functional tests by introducing emulator-/cloud-specific CosmosTestStore implementations and centralizing test-store creation/initialization, while also simplifying Northwind seeding to avoid slow custom imports.

Changes:

  • Introduces CosmosEmulatorTestStore/CosmosCloudTestStore and updates CosmosTestStoreFactory/CosmosTestStore to route behavior based on TestEnvironment.IsEmulator.
  • Enables assembly-level parallelization for Cosmos functional tests and adjusts various tests/fixtures for isolation and stability.
  • Removes the Northwind JSON-based import path and updates affected Northwind-related tests/fixtures.

Reviewed changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/EFCore.Cosmos.Tests/Extensions/CosmosDbContextOptionsExtensionsTests.cs Switches test store creation to CosmosTestStoreFactory.
test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkExecutionTest.cs Adjusts store naming/fixture usage for parallel execution.
test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkEndToEndTest.cs Sets explicit store name for isolation.
test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkEndToEndTestNoBatching.cs Sets explicit store name for isolation.
test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkConcurrencyTest.cs Renames fixture and sets explicit store name for isolation.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/TestEnvironment.cs Updates parsing/typing of Cosmos test environment configuration.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStoreFactory.cs Centralizes store creation, selecting emulator vs cloud store implementations.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs Refactors to an abstract base store and adjusts initialization/cleanup behavior.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs Adds emulator-specific locking and container-count concurrency control for parallel tests.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs Adds cloud-specific throttling of concurrent initialization.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosDbConfiguredConditionAttribute.cs Moves connection-availability probing out of CosmosTestStore into the condition attribute.
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosNorthwindTestStoreFactory.cs Removes Cosmos-specific Northwind store factory (and its JSON import path).
test/EFCore.Cosmos.FunctionalTests/Query/NorthwindQueryCosmosFixture.cs Switches Northwind fixture to use the general CosmosTestStoreFactory.
test/EFCore.Cosmos.FunctionalTests/Query/FromSqlQueryCosmosTest.cs Updates an expected SQL predicate and result count following seeding changes.
test/EFCore.Cosmos.FunctionalTests/Query/OwnedQueryCosmosTest.cs Removes redundant theory attributes from various overrides.
test/EFCore.Cosmos.FunctionalTests/Properties/TestAssemblyCondition.cs Enables parallelism via CollectionBehavior(MaxParallelThreads = -1).
test/EFCore.Cosmos.FunctionalTests/Storage/CosmosDatabaseCreatorTest.cs Updates test store usage and adds emulator locking helper.
test/EFCore.Cosmos.FunctionalTests/PartitionKeyTest.cs Removes redundant EnsureCreatedAsync calls (relying on store lifecycle).
test/EFCore.Cosmos.FunctionalTests/HierarchicalPartitionKeyTest.cs Removes redundant EnsureCreatedAsync calls (relying on store lifecycle).
test/EFCore.Cosmos.FunctionalTests/EndToEndCosmosTest.cs Updates initialized store creation and removes redundant EnsureCreatedAsync calls.
test/EFCore.Cosmos.FunctionalTests/EmbeddedDocumentsTest.cs Switches fixture store creation to CosmosTestStoreFactory.
test/EFCore.Cosmos.FunctionalTests/CosmosTriggersTest.cs Removes redundant EnsureCreatedAsync call prior to trigger creation.
test/EFCore.Cosmos.FunctionalTests/CosmosSessionTokensTest.cs Adds collection scoping and adjusts session token storage plumbing for parallel safety.
test/EFCore.Cosmos.FunctionalTests/ConnectionSpecificationTest.cs Switches test store creation to CosmosTestStoreFactory.
test/EFCore.Cosmos.FunctionalTests/ConfigPatternsCosmosTest.cs Switches initialized store creation to factory and routes container creation via test store.
test/EFCore.Cosmos.FunctionalTests/EFCore.Cosmos.FunctionalTests.csproj Removes Northwind.json copy-to-output configuration.
Comments suppressed due to low confidence (7)

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs:239

  • WaitChangeContainerCountAsync performs awaited DeleteContainersAsync work inside GetWaitTask, which is awaited while holding _concurrencyControlSemaphore. This can block other wait/release operations for long periods (and increases deadlock risk). Consider computing required actions under the lock, releasing it, then performing container deletion/waits outside the lock and looping as needed.
                // Re enter the queue.
                if (_databaseCreated)
                {
                    await DeleteContainersAsync(dbContext); // @TODO: Does this have to happen inside the lock?
                }

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs:145

  • CreatedContainersAsync ignores the provided cancellationToken by passing CancellationToken.None to EnsureCreatedAsync. This prevents callers from canceling long-running initialization; pass through the cancellationToken instead.
    public virtual async Task<bool> CreatedContainersAsync(DbContext context, bool skipTokenCredential = false, CancellationToken cancellationToken = default)
    {
        if (!TestEnvironment.UseTokenCredential || skipTokenCredential)
        {
            return await context.Database.EnsureCreatedAsync(CancellationToken.None).ConfigureAwait(false);
        }

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs:49

  • The 'shared' parameter in CreateName is unused. Either remove it or incorporate it into naming (if shared stores need stable names while non-shared need uniqueness), to avoid confusion about intended behavior.
    private static string CreateName(string name, bool shared)
    {
        return !TestEnvironment.IsEmulator ? name + _runId : name;
    }

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs:11

  • Typo in field name '_aquired' (should be '_acquired') which deviates from the codebase naming conventions and makes the code harder to read/search.

    private bool _aquired;

test/EFCore.Cosmos.FunctionalTests/TestUtilities/TestEnvironment.cs:36

  • UseTokenCredential is parsed with a case-sensitive string comparison to "True". This will treat common values like "true"/"FALSE" (e.g., from env vars) as false; consider using bool.TryParse or a case-insensitive comparison to avoid misconfiguration.
    public static bool UseTokenCredential { get; } = Config["UseTokenCredential"] == "True";

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs:16

  • ContainerCrudSemaphore is created with a max count of int.MaxValue (new SemaphoreSlim(1)). Using new SemaphoreSlim(1, 1) helps catch accidental over-release and preserves mutual exclusion guarantees.
    /// <summary>
    /// The emulator becomes unstable with parallel CRUD container requests, use this lock to create containers.
    /// </summary>
    public static readonly SemaphoreSlim ContainerCrudSemaphore = new(1);

test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs:9

  • The SemaphoreSlim is constructed with only an initial count (max defaults to int.MaxValue). Consider specifying a maxCount (e.g., new SemaphoreSlim(Environment.ProcessorCount, Environment.ProcessorCount)) to catch accidental over-release bugs.
    private static readonly SemaphoreSlim _semaphore = new(Environment.ProcessorCount);

{
if (_initialized
&& _dataFilePath == null)
if (_initialized && !Shared)
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DisposeAsync no longer deletes databases for Shared test stores. For runs against a real Cosmos DB account this can leak databases/containers across test runs and incur cost/quota issues. Consider deleting the shared database when the fixture/test run completes (e.g., reference-count shared stores like CosmosEmulatorTestStore does, or otherwise ensure shared stores are cleaned up deterministically).

Suggested change
if (_initialized && !Shared)
if (_initialized)

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a similar tracking mechanism for real cosmos db runs to delete the database if no longer used? I thought it would just be simpler to delete the cosmos account in the portal

@JoasE JoasE changed the title WIP: Allow cosmos tests to run in parallel v2 Allow cosmos tests to run in parallel v2 Feb 22, 2026
@JoasE
Copy link
Contributor Author

JoasE commented Feb 22, 2026

Think I will wait for #37681 with this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant