Feat/alpaca data provider v1 #24
Open
Alejandro-Duenas wants to merge 13 commits into
- Add AlpacaDataProvider with two-header auth on sync and async sessions
- Map daily/hourly/minute frequencies and aliases to Alpaca timeframes
- Declare crypto/intraday capabilities and conform to OHLCVProvider

- Add sync and async single-page stock bar fetch with per-status error mapping
- Inspect status codes directly so 401/403/429/404/500 surface as typed errors
- Transform bars into the canonical OHLCV schema, tolerant of list and dict shapes
- Forward the configured data feed in request params

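The per-status mapping in this commit can be pictured with a minimal sketch. The exception class names and the `raise_for_alpaca_status` helper below are hypothetical, chosen for illustration; the PR's actual error hierarchy is not shown on this page.

```python
class ProviderError(Exception):
    """Base class for the hypothetical typed errors in this sketch."""


class AuthenticationError(ProviderError):
    """Credentials missing, invalid, or not entitled (401/403)."""


class RateLimitError(ProviderError):
    """Too many requests (429)."""


class SymbolNotFoundError(ProviderError):
    """Unknown symbol or resource (404)."""


class ProviderUnavailableError(ProviderError):
    """Server-side failure (5xx)."""


def raise_for_alpaca_status(status_code: int, body: str = "") -> None:
    """Inspect the status code directly and raise a typed error for failures."""
    if status_code < 400:
        return  # success or redirect: nothing to map
    if status_code in (401, 403):
        raise AuthenticationError(f"HTTP {status_code}: {body}")
    if status_code == 429:
        raise RateLimitError(f"HTTP 429: {body}")
    if status_code == 404:
        raise SymbolNotFoundError(f"HTTP 404: {body}")
    if status_code >= 500:
        raise ProviderUnavailableError(f"HTTP {status_code}: {body}")
    # Any other 4xx falls back to the base error type.
    raise ProviderError(f"HTTP {status_code}: {body}")
```

Checking the integer status explicitly, rather than relying on the HTTP client's generic exception, keeps the mapping identical for sync and async sessions.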
- Resolve asset class from a BASE/QUOTE slash or an explicit override
- Route crypto symbols to the v1beta3 crypto bars endpoint, omitting feed
- Preserve the BASE/QUOTE symbol verbatim in params and the symbol column
- Fully override fetch_ohlcv/fetch_ohlcv_async to thread asset_class while keeping input validation, circuit breaker, and OHLCV validation
- Share one bars-to-DataFrame helper across the stock and crypto branches

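As a rough illustration of the routing described in this commit, the sketch below resolves the asset class and builds the request. The function names, the `"stock"`/`"crypto"` labels, and the exact URL paths are assumptions made for the example (the paths follow Alpaca's public Market Data API as I understand it), not code taken from the PR.

```python
from typing import Dict, Optional, Tuple

VALID_ASSET_CLASSES = ("stock", "crypto")  # assumed labels
BASE_URL = "https://data.alpaca.markets"   # assumed base URL


def resolve_asset_class(symbol: str, asset_class: Optional[str] = None) -> str:
    """An explicit override wins; otherwise a BASE/QUOTE slash means crypto."""
    if asset_class is not None:
        if asset_class not in VALID_ASSET_CLASSES:
            raise ValueError(
                f"asset_class must be one of {VALID_ASSET_CLASSES}, got {asset_class!r}"
            )
        return asset_class
    return "crypto" if "/" in symbol else "stock"


def build_bars_request(
    symbol: str,
    timeframe: str,
    start: str,
    end: str,
    feed: str = "iex",
    asset_class: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, params) for a single-page bars request."""
    params = {"timeframe": timeframe, "start": start, "end": end}
    if resolve_asset_class(symbol, asset_class) == "crypto":
        # Crypto: the symbol travels verbatim in params and no feed is sent.
        params["symbols"] = symbol
        return f"{BASE_URL}/v1beta3/crypto/us/bars", params
    # Stock: the symbol is part of the path and the configured feed is forwarded.
    params["feed"] = feed
    return f"{BASE_URL}/v2/stocks/{symbol}/bars", params
```

For example, `build_bars_request("BTC/USD", "1Min", "2024-01-01", "2024-01-02")` yields params without a `feed` key, while the same call with `"AAPL"` includes `feed="iex"`.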
- Loop each fetch until next_page_token is null, merging per-page bars
- Extend the stock bar list and concatenate crypto bars per symbol
- Send a fresh params dict per page so the token never rewrites earlier requests
- Acquire one rate-limit token per page, preserving the once-per-fetch guarantee

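The pagination loop in this commit might look roughly like the sketch below. `fetch_page` and `acquire_token` are hypothetical callables standing in for the HTTP call and the rate limiter; the `page_token` request parameter and the `bars`/`next_page_token` response keys follow Alpaca's documented response shape as I understand it.

```python
from typing import Any, Callable, Dict, List

Page = Dict[str, Any]


def fetch_all_stock_bars(
    fetch_page: Callable[[Dict[str, Any]], Page],
    base_params: Dict[str, Any],
    acquire_token: Callable[[], None] = lambda: None,
) -> List[dict]:
    """Follow next_page_token until it is null, extending one flat bar list."""
    bars: List[dict] = []
    page_token = None
    while True:
        params = dict(base_params)  # fresh dict: earlier requests are never mutated
        if page_token:
            params["page_token"] = page_token
        acquire_token()  # one rate-limit token per page
        page = fetch_page(params)
        bars.extend(page.get("bars") or [])
        page_token = page.get("next_page_token")
        if not page_token:
            return bars


def fetch_all_crypto_bars(
    fetch_page: Callable[[Dict[str, Any]], Page],
    base_params: Dict[str, Any],
    acquire_token: Callable[[], None] = lambda: None,
) -> Dict[str, List[dict]]:
    """Same loop, but crypto pages key bars by symbol, so merge per symbol."""
    merged: Dict[str, List[dict]] = {}
    page_token = None
    while True:
        params = dict(base_params)
        if page_token:
            params["page_token"] = page_token
        acquire_token()
        page = fetch_page(params)
        for symbol, symbol_bars in (page.get("bars") or {}).items():
            merged.setdefault(symbol, []).extend(symbol_bars)
        page_token = page.get("next_page_token")
        if not page_token:
            return merged
```

Copying `base_params` on every iteration is what keeps a recorded or retried earlier request from silently picking up a later page's token.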
Wire AlpacaDataProvider into the provider registry, env auto-detection, config env mapping, public exports, the config-validation warning list, and the provider catalog tables so provider="alpaca" resolves end-to-end.
- Expand module docstring with API URL, feed/tier, env vars, rate limit, sync and async examples
- Add skipped-by-default integration test fetching AAPL daily and BTC/USD minute
- Update provider registration test to include the registered Alpaca provider

Task: SG-6

- Accept RFC-3339 datetime bounds and validate explicit asset_class
- Retry per pagination page, honoring 429 rate-limit headers
- Run async fetches inside the circuit breaker via new call_async
- Uppercase crypto symbols and expose a stock adjustment option

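One plausible way to derive a wait time from 429 response headers is sketched below. The helper name is hypothetical, and two assumptions are baked in: `Retry-After` carries either delta-seconds or an HTTP-date (per the HTTP specification), and `X-RateLimit-Reset` carries a Unix epoch timestamp in seconds. The PR may handle these differently.

```python
import time
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_after_seconds(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return how long to wait before retrying, or None if no hint is present.

    Expects exact-case header names; a case-insensitive mapping such as the
    one HTTP client libraries usually provide avoids that limitation.
    """
    now = time.time() if now is None else now

    raw = headers.get("Retry-After")
    if raw is not None:
        try:
            return max(0.0, float(raw))  # delta-seconds form
        except ValueError:
            try:
                # HTTP-date form, converted to a delay relative to `now`
                return max(0.0, parsedate_to_datetime(raw).timestamp() - now)
            except (TypeError, ValueError):
                pass  # unparseable: fall through to the reset header

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)  # assumed epoch seconds
        except ValueError:
            pass
    return None
```

Clamping at zero means a reset time already in the past results in an immediate retry rather than a negative sleep.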
- Strip api_secret from the sanitized provider info config
- Mark alpaca available only when both key and secret resolve
- Accept APCA_* SDK env aliases in config mapping and autodetect
- Warn when an alpaca config lacks api_secret

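The credential handling in this commit can be sketched as follows. The environment variable names and the `SECRET_FIELDS` name come from the PR description; the helper functions and the alias table layout are hypothetical and only illustrate the precedence and redaction rules.

```python
from typing import Any, Dict, Mapping

SECRET_FIELDS = frozenset({"api_key", "api_secret"})

# Project-convention names come first so they win over the SDK aliases.
ENV_ALIASES = {
    "api_key": ("ALPACA_API_KEY", "APCA_API_KEY_ID"),
    "api_secret": ("ALPACA_API_SECRET", "APCA_API_SECRET_KEY"),
}


def alpaca_config_from_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Map the first non-empty env var for each field into provider config."""
    config: Dict[str, str] = {}
    for field, names in ENV_ALIASES.items():
        for name in names:
            value = env.get(name)
            if value:
                config[field] = value
                break
    return config


def alpaca_available(env: Mapping[str, str]) -> bool:
    """A key without a secret is not constructable, so require both."""
    config = alpaca_config_from_env(env)
    return "api_key" in config and "api_secret" in config


def sanitize_config(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Drop every secret field before exposing config in provider info."""
    return {k: v for k, v in config.items() if k not in SECRET_FIELDS}
```

In real use `env` would be `os.environ`; passing a plain mapping keeps the sketch easy to test.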
- Add fetch_ohlcv_async, per-page retry, and rate-limit header tests
- Cover circuit breaker call_async, lazy init, and registry caching
- Pin two-credential availability and secret redaction behavior
- Dedupe the provider fixture and mock page-response builder

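The breaker behavior these tests target (refusal while open, failure accounting, half-open recovery) can be illustrated with a small stand-in. `MiniCircuitBreaker` below is a hypothetical, simplified class written for this example; it is not the project's `CircuitBreaker`, and its thresholds, state names, and clock injection are assumptions.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is refused because the breaker is open."""


class MiniCircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half_open"  # timeout elapsed: allow one trial call
        return "open"

    async def call_async(
        self, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any
    ) -> Any:
        if self.state == "open":
            # Refuse before the wrapped coroutine is even created.
            raise CircuitOpenError("circuit breaker is open")
        try:
            result = await func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the breaker.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Injecting the clock lets a test advance time past `reset_timeout` without sleeping, which is one way half-open recovery can be exercised offline.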
- Add docs/providers/alpaca.md plus README, mkdocs, and env-var entries
- Correct the IEX delay claim and document the raw adjustment default
- Refresh provider counts and catalog line counts

- Declare the ml4t namespace known-first-party so its imports form a third section after stdlib and third-party blocks
- Drop stale import-section comments splitting the ml4t block in examples

- Add coverage output and local working dirs to gitignore
- Map 5m/15m/30m and their Nminute aliases to Alpaca timeframes
- Add parametrized mapping tests and a live 15m integration test
- Document the new frequencies and uniform aliases in provider docs

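A frequency table along the lines of this commit might look like the sketch below. The keys `daily`, `hourly`, `minute`, `5m`/`15m`/`30m`, and the `Nminute` aliases are named in the PR; the `1m`/`1minute` entries, the helper name, and the case-insensitive lookup are assumptions added for the example. The timeframe strings follow Alpaca's `1Min`/`1Hour`/`1Day` syntax as I understand it.

```python
ALPACA_TIMEFRAMES = {
    "daily": "1Day",
    "hourly": "1Hour",
    "minute": "1Min",
    "1m": "1Min",        # assumed alias, for uniformity with the Nm forms
    "1minute": "1Min",   # assumed alias
    "5m": "5Min",
    "5minute": "5Min",
    "15m": "15Min",
    "15minute": "15Min",
    "30m": "30Min",
    "30minute": "30Min",
}


def to_alpaca_timeframe(frequency: str) -> str:
    """Translate a provider frequency key or alias into an Alpaca timeframe."""
    key = frequency.strip().lower()
    try:
        return ALPACA_TIMEFRAMES[key]
    except KeyError:
        supported = ", ".join(sorted(ALPACA_TIMEFRAMES))
        raise ValueError(
            f"Unsupported frequency {frequency!r}; supported: {supported}"
        ) from None
```

Because every alias maps directly to a native Alpaca timeframe, intraday bars come back at the requested resolution with no client-side resampling.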
Description
Adds an Alpaca Market Data provider supporting US equities and crypto OHLCV bars (daily, hourly, and 1/5/15/30-minute) over Alpaca's historical REST API, with full sync and async paths. One provider serves both asset classes: plain tickers route to the stock bars endpoint, `BASE/QUOTE` symbols to the crypto endpoint, with `next_page_token` pagination, per-page retry, and circuit-breaker protection on both transports. Changed-code branch coverage is 100% (2,969 tests green, including the `-W error::ResourceWarning` leak lane).

Provider implementation:
- `providers/alpaca.py`: New `AlpacaDataProvider` (two-credential header auth, `feed` iex/sip selection, configurable price `adjustment` defaulting to Alpaca's raw). Accepts `YYYY-MM-DD` or RFC-3339 datetime bounds; datetime bounds make sub-day minute/hour windows possible. Retries transient failures per pagination page (a failure on page N never refetches earlier pages) and derives `retry_after` from 429 `Retry-After`/`X-RateLimit-Reset` headers. Validates explicit `asset_class` values and uppercases symbols into requests and output, preserving the crypto slash. Frequencies cover daily, hourly, and 1/5/15/30-minute bars (`15m`/`15minute`-style aliases), making multi-year intraday backfills possible without client-side resampling.

Resilience infrastructure:
- `providers/mixins/circuit_breaker.py`: Added `CircuitBreaker.call_async` and `_with_circuit_breaker_async`, giving async fetch paths the same state handling and failure accounting as sync ones. The Alpaca async path runs its entire fetch/transform/validate pipeline inside the breaker, so an open breaker refuses before any request goes out.

Registration and configuration:
- `managers/provider_manager.py`: Registered Alpaca as a keyed provider; env autodetection requires both credentials (a key without a secret is not constructable, so it is not reported available); added a `SECRET_FIELDS` redaction set so `get_provider_info` strips `api_secret` alongside `api_key`.
- `managers/config_manager.py`: Mapped `ALPACA_API_KEY`/`ALPACA_API_SECRET` plus Alpaca's own SDK names (`APCA_API_KEY_ID`/`APCA_API_SECRET_KEY`) into provider config, with the project-convention names winning on conflict.
- `config/models.py`, `config/validator.py`: Added the `alpaca` provider type; the validator warns when an Alpaca entry lacks an API key or secret.

Testing:
- `tests/test_alpaca_provider.py`: Offline unit suite covering frequency mapping for every supported key and alias, both endpoints and response shapes, pagination and token threading, the full async public path, HTTP error mapping (401/404/429/5xx), transport-failure wrapping with cause preservation, retry counts and fail-fast behavior, breaker accounting, credential autodetection, and secret redaction.
- `tests/integration/test_alpaca.py`: Credential-gated smoke tests (`integration` marker, skipped by default) for daily stock, 15-minute stock, and minute crypto fetches.
- `tests/test_base_provider_enhanced.py`: Coverage for `CircuitBreaker.call_async` success, failure accounting, and half-open recovery.
- `tests/test_provider_registration.py`: Registry expectations extended to 12 providers (Alpaca alongside main's new Massive/Polygon entries).

Documentation:
- `docs/providers/alpaca.md`: New provider page (pricing, symbol format, frequencies, feed/adjustment semantics, key setup, rate limits).
- `docs/providers/README.md`, root `README.md`, `mkdocs.yml`, `docs/INTEGRATION_TESTING.md`: Added Alpaca to the provider tables, nav, env-var examples, and CI secrets list.

Tooling:
- `pyproject.toml`: Declared the `ml4t` namespace `known-first-party` for ruff's isort so its imports always form their own section after stdlib and third-party blocks; re-sorted the two example scripts accordingly.
- `.gitignore`: Ignore local coverage output and planning/working directories.