Is your feature request related to a problem?
EnvVarSecrets currently resolves secret values exclusively via os.environ, which is a process-global namespace. This makes it effectively impossible to safely host multiple tenants or multiple concurrent pipeline configurations within a single process — there is no way to supply per-request or per-tenant secret values without mutating the global environment, which introduces race conditions and security risks.
This is a meaningful gap for anyone running Haystack as a hosted service (e.g. a multi-pipeline API server, or any multi-tenant SaaS product built on Haystack).
Describe the solution you'd like
Three options, starting with the quickest win:
- Quick win: contextvars.ContextVar support
Augment EnvVarSecrets.resolve() to check a ContextVar registry before falling back to os.environ. Callers can then set per-coroutine/per-thread context before invoking a pipeline:
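    # env_var_context is the proposed ContextVar registry; it does not exist yet.
    from haystack.core.secrets import env_var_context

    token = env_var_context.set({"OPENAI_API_KEY": "sk-tenant-xyz"})
    try:
        pipeline.run(...)
    finally:
        # Always restore the previous value, even if the run fails.
        env_var_context.reset(token)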
This is low-overhead and safe for both asyncio and thread-pool concurrency models since ContextVar is isolated per task/thread context.
However, ContextVar scoping is not easy to understand or manage, especially when both async and sync code paths must be supported, as is common in ASGI web frameworks such as FastAPI. Setting it up correctly for every edge case is not straightforward and comes with costs and caveats. Debugging can also become painful, because it requires deep insight into lower-level libraries such as anyio or asyncio.
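A minimal sketch of the ContextVar lookup described above, under stated assumptions: `env_var_context` is the name used in this proposal, while `resolve_env_var`, `handle_request`, `serve_two_tenants` and `resolve_in_worker_thread` are hypothetical helpers for illustration, not existing Haystack code. It shows two concurrent asyncio tasks each seeing only their own value, and one way to carry the context into a thread pool explicitly.

```python
import asyncio
import contextvars
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

# Hypothetical registry from this proposal; None means "no overrides set".
env_var_context = contextvars.ContextVar("env_var_context", default=None)


def resolve_env_var(name: str) -> Optional[str]:
    """Stand-in for the proposed lookup: context overrides first, then os.environ."""
    overrides = env_var_context.get()
    if overrides is not None and name in overrides:
        return overrides[name]
    return os.environ.get(name)


async def handle_request(tenant_key: str) -> Optional[str]:
    """Simulates one request: set tenant secrets, yield to other tasks, then resolve."""
    token = env_var_context.set({"OPENAI_API_KEY": tenant_key})
    try:
        await asyncio.sleep(0)  # let the other task run; its set() must not leak here
        return resolve_env_var("OPENAI_API_KEY")
    finally:
        env_var_context.reset(token)


async def serve_two_tenants() -> list:
    # Each task created by gather() runs in its own copy of the current context.
    return await asyncio.gather(
        handle_request("sk-tenant-a"), handle_request("sk-tenant-b")
    )


def resolve_in_worker_thread(tenant_key: str) -> Optional[str]:
    """Thread-pool caveat: the caller's context is carried over explicitly here."""
    token = env_var_context.set({"OPENAI_API_KEY": tenant_key})
    try:
        ctx = contextvars.copy_context()  # snapshot that includes the override
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Run the lookup inside the copied context on the worker thread.
            return pool.submit(ctx.run, resolve_env_var, "OPENAI_API_KEY").result()
    finally:
        env_var_context.reset(token)
```

The thread-pool helper illustrates one of the caveats: on standard CPython builds, a plain `ThreadPoolExecutor.submit(...)` or `loop.run_in_executor(...)` call does not carry the caller's context over by itself, whereas `asyncio.to_thread()` does. A hosting layer mixing sync and async code would need to handle each of these paths deliberately.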
- Clean solution — first-class pipeline run context
Introduce a SecretsContext (or a general PipelineRunContext) object that can be passed directly to pipeline.run():
    pipeline.run(
        inputs={...},
        context=PipelineRunContext(
            secrets={"OPENAI_API_KEY": "sk-tenant-xyz"}
        )
    )
EnvVarSecrets (and any other secret resolver) would receive this context and prefer it over os.environ. This is the most ergonomic and explicit API, keeps secrets out of global state entirely, and makes the data flow auditable.
The downside of this approach is that some components would need to be rewritten to move secret resolution from init and warmup() to run().
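To make the run-context idea and its cost concrete, here is a small sketch under stated assumptions. `PipelineRunContext` follows the name in this proposal, but its shape is a guess; `EnvVarSecretSketch` and `ToyGenerator` are hypothetical stand-ins, not Haystack classes. The point is that the component stores only a reference to the secret and resolves the value inside run(), so one component instance can serve several tenants.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class PipelineRunContext:
    """Hypothetical per-run context; only the `secrets` field is modelled."""

    secrets: Mapping[str, str] = field(default_factory=dict)


class EnvVarSecretSketch:
    """Toy stand-in for EnvVarSecrets: prefers the run context over os.environ."""

    def __init__(self, name: str) -> None:
        self.name = name

    def resolve(self, context: Optional[PipelineRunContext] = None) -> Optional[str]:
        if context is not None and self.name in context.secrets:
            return context.secrets[self.name]
        return os.environ.get(self.name)


class ToyGenerator:
    """Toy component that resolves its key inside run() instead of at init."""

    def __init__(self, api_key: EnvVarSecretSketch) -> None:
        self.api_key = api_key  # store the reference only; no value is resolved here

    def run(self, prompt: str, context: Optional[PipelineRunContext] = None) -> dict:
        key = self.api_key.resolve(context)
        # A real component would build its client here; this only reports the key.
        return {"prompt": prompt, "key_used": key}


# One component instance, two tenants, no global state touched.
generator = ToyGenerator(EnvVarSecretSketch("OPENAI_API_KEY"))
result_a = generator.run("hi", PipelineRunContext(secrets={"OPENAI_API_KEY": "sk-tenant-a"}))
result_b = generator.run("hi", PipelineRunContext(secrets={"OPENAI_API_KEY": "sk-tenant-b"}))
```

Components that currently create clients at init or warm-up would need the same kind of change, which is the rewrite cost mentioned above.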
- Clean solution but less effort — first-class pipeline init context
Introduce a SecretsContext (or a general PipelineContext) object that can be passed directly to Pipeline():
    pipeline = Pipeline(
        context=PipelineContext(
            secrets={"OPENAI_API_KEY": "sk-tenant-xyz"}
        )
    )
    pipeline.run(inputs={...})
EnvVarSecrets (and any other secret resolver) would receive this context and prefer it over os.environ. This strikes the best balance between an ergonomic, explicit API and keeping components as they are (e.g. resolving secret values at component init and warmup()).
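A rough sketch of the init-context variant, with loud assumptions: `PipelineContext` is the name suggested above but its fields are guessed, and `ToyPipeline` and `ToyEmbedder` are hypothetical toys rather than Haystack classes. It shows the context being fixed at construction and forwarded at warm-up, so the component can keep resolving its secret at warm-up time.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class PipelineContext:
    """Hypothetical pipeline-wide context, fixed when the pipeline is constructed."""

    secrets: Mapping[str, str] = field(default_factory=dict)


class ToyEmbedder:
    """Toy component that still resolves its secret during warm-up."""

    def __init__(self, api_key_env: str) -> None:
        self.api_key_env = api_key_env
        self.api_key: Optional[str] = None

    def warm_up(self, context: Optional[PipelineContext] = None) -> None:
        secrets = context.secrets if context is not None else {}
        # Prefer the pipeline context, then fall back to the process environment.
        self.api_key = secrets.get(self.api_key_env, os.environ.get(self.api_key_env))


class ToyPipeline:
    """Toy pipeline that forwards its init-time context to every component."""

    def __init__(self, context: Optional[PipelineContext] = None) -> None:
        self.context = context
        self.components: Dict[str, ToyEmbedder] = {}

    def add_component(self, name: str, component: ToyEmbedder) -> None:
        self.components[name] = component

    def warm_up(self) -> None:
        for component in self.components.values():
            component.warm_up(self.context)


pipeline = ToyPipeline(context=PipelineContext(secrets={"OPENAI_API_KEY": "sk-tenant-xyz"}))
pipeline.add_component("embedder", ToyEmbedder("OPENAI_API_KEY"))
pipeline.warm_up()
```

In this sketch the secret is bound to the pipeline instance, so the trade-off compared with the run-context variant is one pipeline object per set of secrets.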
Describe alternatives you've considered
- Monkey-patching os.environ per request: unsafe under concurrency.
- One process per tenant: operationally expensive, and defeats the purpose of a shared runtime.
- Custom Secret subclass per tenant: works today but requires callers to re-instantiate entire pipeline component graphs per tenant, which is wasteful.
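The concurrency problem with the first alternative can be reproduced in a few lines. This is an illustrative sketch only: `DEMO_TENANT_API_KEY` and both functions are hypothetical names, and `asyncio.sleep(0)` stands in for any await point inside a real request.

```python
import asyncio
import os


async def handle_request(tenant_key: str) -> str:
    os.environ["DEMO_TENANT_API_KEY"] = tenant_key  # per-request monkey-patch
    await asyncio.sleep(0)  # any await point lets another request run in between
    return os.environ["DEMO_TENANT_API_KEY"]  # may now hold another tenant's key


async def serve_two_tenants() -> list:
    try:
        return await asyncio.gather(
            handle_request("sk-tenant-a"), handle_request("sk-tenant-b")
        )
    finally:
        os.environ.pop("DEMO_TENANT_API_KEY", None)  # clean up the demo variable


seen = asyncio.run(serve_two_tenants())
```

Here the first request reads the value written by the second one, because both share the same process-global mapping. Threads hit the same problem, only less predictably.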
Additional context
The key design constraint here is that EnvVarSecrets must remain the standard way to configure secrets on components — introducing a separate Secret type that tenants or hosting layers have to swap in would create a two-tier system where pipeline definitions need to differ between environments. That breaks the promise that a pipeline configured by one person can be handed off to a hosting service unchanged.
Put differently: where secrets come from at runtime should be a concern of the hosting layer, not the pipeline author. A pipeline configurator should be able to write EnvVarSecrets("OPENAI_API_KEY") and trust that whoever runs it will supply the value through whatever mechanism is appropriate — os.environ in local dev, a ContextVar or PipelineRunContext in a hosted multi-tenant service. The resolution strategy should be an invisible infrastructure detail, not something baked into the pipeline definition.
This also means the lookup order in EnvVarSecrets.resolve() should reflect that layering: pipeline run context → ContextVar → os.environ, so that more specific scopes naturally override broader ones without any action required from the pipeline author.
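The layering above can be sketched as a single lookup function. Everything here is an assumption for illustration: `resolve` is a free function standing in for EnvVarSecrets.resolve(), `run_secrets` stands in for the secrets carried by a pipeline run context, `DEMO_LAYERED_KEY` is a made-up variable name, and returning None for a missing value is a simplification.

```python
import contextvars
import os
from typing import Mapping, Optional

# Hypothetical ContextVar registry (the middle layer).
env_var_context = contextvars.ContextVar("env_var_context", default=None)


def resolve(name: str, run_secrets: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first match, checking the most specific scope first."""
    layers = (
        run_secrets,            # 1. explicit pipeline run context
        env_var_context.get(),  # 2. ContextVar set by the hosting layer
        os.environ,             # 3. process-wide environment
    )
    for layer in layers:
        if layer is not None and name in layer:
            return layer[name]
    return None


# Each narrower scope overrides the broader ones.
os.environ["DEMO_LAYERED_KEY"] = "from-environ"
from_environ = resolve("DEMO_LAYERED_KEY")

token = env_var_context.set({"DEMO_LAYERED_KEY": "from-contextvar"})
from_contextvar = resolve("DEMO_LAYERED_KEY")
from_run_context = resolve("DEMO_LAYERED_KEY", {"DEMO_LAYERED_KEY": "from-run-context"})

# Undo the demo state.
env_var_context.reset(token)
del os.environ["DEMO_LAYERED_KEY"]
```

A pipeline author writing EnvVarSecrets("OPENAI_API_KEY") would not see any of these layers; only the hosting layer decides which ones are populated.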
- EnvVarSecrets is defined in haystack/core/secrets.py
- The ContextVar approach requires no public API changes and could ship as a non-breaking minor addition
- The PipelineRunContext approach would require a small signature change to Pipeline.run() but is fully backwards-compatible with a default of None
- This is a prerequisite for safely building multi-tenant hosted Haystack services without process-per-tenant isolation
Would you be willing to submit a PR? Yes / open to discussion on the preferred approach first.