Allow cosmos tests to run in parallel v2#37766
Conversation
Use guid for store name
…e collection calls and docs
…cosmos-test-parallelism
There was a problem hiding this comment.
Pull request overview
This PR aims to enable parallel execution of EF Core Cosmos functional tests by introducing emulator-/cloud-specific CosmosTestStore implementations and centralizing test-store creation/initialization, while also simplifying Northwind seeding to avoid slow custom imports.
Changes:
- Introduces
CosmosEmulatorTestStore/CosmosCloudTestStoreand updatesCosmosTestStoreFactory/CosmosTestStoreto route behavior based onTestEnvironment.IsEmulator. - Enables assembly-level parallelization for Cosmos functional tests and adjusts various tests/fixtures for isolation and stability.
- Removes the Northwind JSON-based import path and updates affected Northwind-related tests/fixtures.
Reviewed changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/EFCore.Cosmos.Tests/Extensions/CosmosDbContextOptionsExtensionsTests.cs | Switches test store creation to CosmosTestStoreFactory. |
| test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkExecutionTest.cs | Adjusts store naming/fixture usage for parallel execution. |
| test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkEndToEndTest.cs | Sets explicit store name for isolation. |
| test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkEndToEndTestNoBatching.cs | Sets explicit store name for isolation. |
| test/EFCore.Cosmos.FunctionalTests/Update/CosmosBulkConcurrencyTest.cs | Renames fixture and sets explicit store name for isolation. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/TestEnvironment.cs | Updates parsing/typing of Cosmos test environment configuration. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStoreFactory.cs | Centralizes store creation, selecting emulator vs cloud store implementations. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs | Refactors to an abstract base store and adjusts initialization/cleanup behavior. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs | Adds emulator-specific locking and container-count concurrency control for parallel tests. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs | Adds cloud-specific throttling of concurrent initialization. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosDbConfiguredConditionAttribute.cs | Moves connection-availability probing out of CosmosTestStore into the condition attribute. |
| test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosNorthwindTestStoreFactory.cs | Removes Cosmos-specific Northwind store factory (and its JSON import path). |
| test/EFCore.Cosmos.FunctionalTests/Query/NorthwindQueryCosmosFixture.cs | Switches Northwind fixture to use the general CosmosTestStoreFactory. |
| test/EFCore.Cosmos.FunctionalTests/Query/FromSqlQueryCosmosTest.cs | Updates an expected SQL predicate and result count following seeding changes. |
| test/EFCore.Cosmos.FunctionalTests/Query/OwnedQueryCosmosTest.cs | Removes redundant theory attributes from various overrides. |
| test/EFCore.Cosmos.FunctionalTests/Properties/TestAssemblyCondition.cs | Enables parallelism via CollectionBehavior(MaxParallelThreads = -1). |
| test/EFCore.Cosmos.FunctionalTests/Storage/CosmosDatabaseCreatorTest.cs | Updates test store usage and adds emulator locking helper. |
| test/EFCore.Cosmos.FunctionalTests/PartitionKeyTest.cs | Removes redundant EnsureCreatedAsync calls (relying on store lifecycle). |
| test/EFCore.Cosmos.FunctionalTests/HierarchicalPartitionKeyTest.cs | Removes redundant EnsureCreatedAsync calls (relying on store lifecycle). |
| test/EFCore.Cosmos.FunctionalTests/EndToEndCosmosTest.cs | Updates initialized store creation and removes redundant EnsureCreatedAsync calls. |
| test/EFCore.Cosmos.FunctionalTests/EmbeddedDocumentsTest.cs | Switches fixture store creation to CosmosTestStoreFactory. |
| test/EFCore.Cosmos.FunctionalTests/CosmosTriggersTest.cs | Removes redundant EnsureCreatedAsync call prior to trigger creation. |
| test/EFCore.Cosmos.FunctionalTests/CosmosSessionTokensTest.cs | Adds collection scoping and adjusts session token storage plumbing for parallel safety. |
| test/EFCore.Cosmos.FunctionalTests/ConnectionSpecificationTest.cs | Switches test store creation to CosmosTestStoreFactory. |
| test/EFCore.Cosmos.FunctionalTests/ConfigPatternsCosmosTest.cs | Switches initialized store creation to factory and routes container creation via test store. |
| test/EFCore.Cosmos.FunctionalTests/EFCore.Cosmos.FunctionalTests.csproj | Removes Northwind.json copy-to-output configuration. |
Comments suppressed due to low confidence (7)
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs:239
- WaitChangeContainerCountAsync performs awaited DeleteContainersAsync work inside GetWaitTask, which is awaited while holding _concurrencyControlSemaphore. This can block other wait/release operations for long periods (and increases deadlock risk). Consider computing required actions under the lock, releasing it, then performing container deletion/waits outside the lock and looping as needed.
// Re enter the queue.
if (_databaseCreated)
{
await DeleteContainersAsync(dbContext); // @TODO: Does this have to happen inside the lock?
}
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs:145
- CreatedContainersAsync ignores the provided cancellationToken by passing CancellationToken.None to EnsureCreatedAsync. This prevents callers from canceling long-running initialization; pass through the cancellationToken instead.
public virtual async Task<bool> CreatedContainersAsync(DbContext context, bool skipTokenCredential = false, CancellationToken cancellationToken = default)
{
if (!TestEnvironment.UseTokenCredential || skipTokenCredential)
{
return await context.Database.EnsureCreatedAsync(CancellationToken.None).ConfigureAwait(false);
}
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosTestStore.cs:49
- The 'shared' parameter in CreateName is unused. Either remove it or incorporate it into naming (if shared stores need stable names while non-shared need uniqueness), to avoid confusion about intended behavior.
private static string CreateName(string name, bool shared)
{
return !TestEnvironment.IsEmulator ? name + _runId : name;
}
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs:11
- Typo in field name '_aquired' (should be '_acquired') which deviates from the codebase naming conventions and makes the code harder to read/search.
private bool _aquired;
test/EFCore.Cosmos.FunctionalTests/TestUtilities/TestEnvironment.cs:36
- UseTokenCredential is parsed with a case-sensitive string comparison to "True". This will treat common values like "true"/"FALSE" (e.g., from env vars) as false; consider using bool.TryParse or a case-insensitive comparison to avoid misconfiguration.
public static bool UseTokenCredential { get; } = Config["UseTokenCredential"] == "True";
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs:16
- ContainerCrudSemaphore is created with a max count of int.MaxValue (new SemaphoreSlim(1)). Using new SemaphoreSlim(1, 1) helps catch accidental over-release and preserves mutual exclusion guarantees.
/// <summary>
/// The emulator becomes unstable with parallel CRUD container requests, use this lock to create containers.
/// </summary>
public static readonly SemaphoreSlim ContainerCrudSemaphore = new(1);
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosCloudTestStore.cs:9
- The SemaphoreSlim is constructed with only an initial count (max defaults to int.MaxValue). Consider specifying a maxCount (e.g., new SemaphoreSlim(Environment.ProcessorCount, Environment.ProcessorCount)) to catch accidental over-release bugs.
private static readonly SemaphoreSlim _semaphore = new(Environment.ProcessorCount);
test/EFCore.Cosmos.FunctionalTests/Storage/CosmosDatabaseCreatorTest.cs
Outdated
Show resolved
Hide resolved
test/EFCore.Cosmos.FunctionalTests/TestUtilities/CosmosEmulatorTestStore.cs
Show resolved
Hide resolved
| { | ||
| if (_initialized | ||
| && _dataFilePath == null) | ||
| if (_initialized && !Shared) |
There was a problem hiding this comment.
DisposeAsync no longer deletes databases for Shared test stores. For runs against a real Cosmos DB account this can leak databases/containers across test runs and incur cost/quota issues. Consider deleting the shared database when the fixture/test run completes (e.g., reference-count shared stores like CosmosEmulatorTestStore does, or otherwise ensure shared stores are cleaned up deterministically).
| if (_initialized && !Shared) | |
| if (_initialized) |
There was a problem hiding this comment.
Do we want a similar tracking mechanism for real cosmos db runs to delete the database if no longer used? I thought it would just be simpler to delete the cosmos account in the portal
|
Think I will wait for #37681 with this |
With some spare time, I've been experimenting around with the cosmos functional tests to see if I can get them to run in parallel. With an old attempt (See: #37696) being possibly unstable in the future or with a certain combination of tests, this PR aims to allow parallel test execution by manually controlling parallelism taking into account limitations of the emulator when needed.
The biggest challenge for parallel execution appeared to be the cosmos db emulator, which can become unstable in certain scenario's.
After some testing I found that the create container implementation in the emulator has a race condition, and can crash the emulator when ran in parallel. Other CRUD requests for containers appear to have no direct effect on the running of the emulator, but parallel requests can cause the emulator to have to clear the data upon restart.
In addition, the docs state that the emulator could show performance degradation with more than 10 concurrent containers.
On top of that, my tests have shown the emulator will stop responding to create container requests at some point when creating many containers. On my local machine this happens consistently after the creation of the 25th container.
This means that when running the tests against the emulator we can't execute CRUD container requests in parallel and ideally no more than 10 containers exist at the same time, and never more than 25.
This could be fixed by just decreasing the parallelism using
CollectionBehavior(MaxParallelThreads = N), but this is risky as it will cause errors or slow tests when many tests are added that create a lot of containers and those happen to run in parallel. See: #37696This PR implements manual parallelism control in CosmosEmulatorTestStore based on the number of containers each test store is attempting to create. This is both faster and prevents possible errors, but there are a couple of downsides, being mainly:
Some other challenges were:
There was a custom import for a cosmos specific Northwind database, which didn't appear to be needed. The custom import was very slow compared to using savechange's new batching, so I removed it and had to slightly change 1 test.
This PR decreases total test run time:
locally:
from 26 to 12.5 minutes
pipeline:
From ~30+ minutes to ~20 minutes total run time
Run against real cosmos db: Done