Full documentation at psxdata.readthedocs.io
psxdata is a Python library for downloading Pakistan Stock Exchange (PSX) data — historical OHLCV prices, real-time quotes, KSE-100 index constituents, sector summaries, fundamentals, debt market instruments, and margin-eligible stocks. Free, open-source, and actively maintained.
Alpha release note
0.1.0a1 — Core scraping, caching, and public API are complete. The FastAPI REST layer and full documentation are in progress. APIs may change before 1.0.
pip install psxdata

Requires Python 3.11+.
import psxdata
# Historical OHLCV data
df = psxdata.stocks("ENGRO", start="2024-01-01", end="2024-12-31")
# All listed tickers
all_tickers = psxdata.tickers()
# KSE-100 index constituents
kse100 = psxdata.indices("KSE100")
# Live quote
q = psxdata.quote("LUCK")
# Sector summary
sectors = psxdata.sectors()
# Debt market instruments
debt = psxdata.debt_market()
# Margin-eligible stocks
scrips = psxdata.eligible_scrips()

| Function | Description |
|---|---|
| `psxdata.stocks(symbol, start, end)` | Historical OHLCV DataFrame for a ticker |
| `psxdata.tickers()` | All listed tickers (1000+) |
| `psxdata.quote(symbol)` | Live quote row for a ticker |
| `psxdata.indices(name)` | Constituents of a named index (e.g. "KSE100") |
| `psxdata.sectors()` | Sector aggregates DataFrame (37 sectors) |
| `psxdata.fundamentals(symbol)` | Financial reports for a ticker |
| `psxdata.debt_market()` | Debt market instruments (TFCs, Sukuks, etc.) |
| `psxdata.eligible_scrips()` | Margin trading eligible stocks |

Existing solutions for PSX data tend to hardcode date formats and column positions that break silently when PSX changes its HTML. psxdata is designed differently:
- Dynamic column extraction from `<th>` tags: survives column reordering
- Multi-format date parsing with fuzzy fallback via `dateutil`
- Exponential backoff retries: 3 attempts, 1s/2s delays
- Disk cache (`~/.psxdata/cache/`): historical data cached forever, live data for 15 min
- Data validation: OHLC constraint checks, duplicate/future date detection
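
Two of the ideas above, header-driven column extraction and exponential backoff, can be illustrated with a minimal standard-library sketch. This is not psxdata's actual implementation: `HeaderTableParser`, `parse_table`, and `retry_with_backoff` are hypothetical names chosen for illustration, and the sample HTML is made up rather than captured from PSX.

```python
import time
from html.parser import HTMLParser


class HeaderTableParser(HTMLParser):
    """Collect table rows as dicts keyed by the text of the <th> cells."""

    def __init__(self):
        super().__init__()
        self.headers = []     # column names, in whatever order the page uses
        self.rows = []        # one dict per data row
        self._cells = []      # <td> values of the row being read
        self._in_cell = None  # "th", "td", or None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("th", "td"):
            self._in_cell = tag
            self._text = []

    def handle_data(self, data):
        if self._in_cell:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._in_cell == tag:
            value = "".join(self._text).strip()
            if tag == "th":
                self.headers.append(value.lower())
            else:
                self._cells.append(value)
            self._in_cell = None
        elif tag == "tr" and self._cells:
            # Pair each value with its header name instead of a fixed index.
            self.rows.append(dict(zip(self.headers, self._cells)))


def parse_table(html):
    parser = HeaderTableParser()
    parser.feed(html)
    return parser.rows


def retry_with_backoff(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); after a failure wait base_delay * 2**n seconds and retry."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s after the first failure, then 2s


# Made-up sample markup: the same row with the columns in a different order.
ORIGINAL = ("<table><tr><th>Date</th><th>Open</th><th>Close</th></tr>"
            "<tr><td>2024-01-02</td><td>100</td><td>105</td></tr></table>")
REORDERED = ("<table><tr><th>Close</th><th>Date</th><th>Open</th></tr>"
             "<tr><td>105</td><td>2024-01-02</td><td>100</td></tr></table>")

# Lookup is by header name, so both layouts yield the same row dicts.
assert parse_table(ORIGINAL) == parse_table(REORDERED)
```

Passing `sleep` as a parameter keeps the retry helper testable without real delays. With the defaults it makes at most three calls, matching the 3 attempts and 1s/2s delays listed above.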
The FastAPI layer is planned for Phase 4.
| Endpoint | Description |
|---|---|
| `GET /health` | Health check |
| `GET /stocks` | All tickers |
| `GET /stocks/{symbol}/historical?start=&end=` | Historical OHLCV |
| `GET /stocks/{symbol}/quote` | Real-time quote |
| `GET /stocks/{symbol}/fundamentals` | Fundamentals |
| `GET /indices/{name}` | Index constituents |
| `GET /sectors` | Sector aggregates |
| `GET /debt-market` | Debt instruments |
| `GET /eligible-scrips` | Margin eligible stocks |

All responses: `{"data": ..., "meta": {"timestamp": "...", "cached": bool}}`. List responses also include `"count": N` in `meta`.

Errors: `{"error": {"status": 404, "code": "not_found", "message": "..."}}`
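
Since the REST layer is still planned, the following is only a client-side sketch of how the documented routes and response envelope might be consumed. `BASE_URL`, `historical_url`, `unwrap`, and `ApiError` are hypothetical names, and the host and port are assumptions; only the path pattern and the envelope keys come from the tables and shapes above.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumption: no deployed host is documented


class ApiError(Exception):
    """Raised when a response body carries the documented error envelope."""


def historical_url(symbol, start, end):
    """Build the URL for GET /stocks/{symbol}/historical?start=&end=."""
    query = urlencode({"start": start, "end": end})
    return f"{BASE_URL}/stocks/{quote(symbol, safe='')}/historical?{query}"


def unwrap(body):
    """Split a JSON response body into (data, meta), raising on error envelopes."""
    payload = json.loads(body)
    if "error" in payload:
        err = payload["error"]
        raise ApiError(f'{err["status"]} {err["code"]}: {err["message"]}')
    return payload["data"], payload["meta"]
```

For a list response, `meta["count"]` should equal `len(data)`, which gives a client a cheap sanity check on the payload.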
See the roadmap issue for the full phase breakdown.
- ✅ Phase 0 — PSX endpoint research and HTML fixture capture
- ✅ Phase 0.5 — Repository setup, CI/CD, community files
- ✅ Phase 2 — Core engineering (BaseScraper, parsers, cache, utils)
- ✅ Phase 3 — Scrapers (historical, real-time, indices, sectors, fundamentals, screener, debt, eligible scrips)
- ✅ Phase 3 API — Public Python package interface
- 🔲 Phase 4 — FastAPI REST layer
- 🔲 Phase 5 — Full test suite (API layer tests pending)
- ✅ Phase 6 — Packaging & PyPI publish
- 🔲 Phase 7 — Documentation
Contributions are welcome. See CONTRIBUTING.md and open an issue before starting non-trivial work.
See ARCHITECTURE.md for the component diagram, data flow, and design decisions.