psxdata — Python Library for Pakistan Stock Exchange (PSX) Data

Full documentation at psxdata.readthedocs.io

psxdata is a Python library for downloading Pakistan Stock Exchange (PSX) data — historical OHLCV prices, real-time quotes, KSE-100 index constituents, sector summaries, fundamentals, debt market instruments, and margin-eligible stocks. Free, open-source, and actively maintained.

Alpha release note

0.1.0a1 — Core scraping, caching, and public API are complete. The FastAPI REST layer and full documentation are in progress. APIs may change before 1.0.


Installation

pip install psxdata

Requires Python 3.11+.


Quick Start

import psxdata

# Historical OHLCV data
df = psxdata.stocks("ENGRO", start="2024-01-01", end="2024-12-31")

# All listed tickers
all_tickers = psxdata.tickers()

# KSE-100 index constituents
kse100 = psxdata.indices("KSE100")

# Live quote
q = psxdata.quote("LUCK")

# Sector summary
sectors = psxdata.sectors()

# Debt market instruments
debt = psxdata.debt_market()

# Margin-eligible stocks
scrips = psxdata.eligible_scrips()

API Reference

| Function | Description |
| --- | --- |
| psxdata.stocks(symbol, start, end) | Historical OHLCV DataFrame for a ticker |
| psxdata.tickers() | All listed tickers (1000+) |
| psxdata.quote(symbol) | Live quote row for a ticker |
| psxdata.indices(name) | Constituents of a named index (e.g. "KSE100") |
| psxdata.sectors() | Sector aggregates DataFrame (37 sectors) |
| psxdata.fundamentals(symbol) | Financial reports for a ticker |
| psxdata.debt_market() | Debt market instruments (TFCs, Sukuks, etc.) |
| psxdata.eligible_scrips() | Margin trading eligible stocks |

Why psxdata

Existing solutions for PSX data tend to hardcode date formats and column positions that break silently when PSX changes its HTML. psxdata is designed differently:

  • Dynamic column extraction from <th> tags: columns are looked up by header text, so parsing survives column reordering
  • Multi-format date parsing, with a fuzzy fallback via dateutil
  • Retries with exponential backoff: up to 3 attempts, waiting 1s and then 2s between them
  • Disk cache (~/.psxdata/cache/): historical data is cached indefinitely, live data for 15 minutes
  • Data validation: OHLC constraint checks, plus duplicate and future date detection
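
As a rough illustration of the first, second, and last bullets above, the sketch below reads a table by its <th> text, tries several date formats, and flags rows that break basic OHLC constraints. It is not psxdata's actual code. The sample markup, the class and function names, and the list of date formats are all assumptions made for the example. It uses only the standard library, so the dateutil fuzzy fallback and duplicate-date detection are left out.

```python
from datetime import date, datetime
from html.parser import HTMLParser

# Hypothetical markup: note the columns are not in O-H-L-C order.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Close</th><th>Open</th><th>High</th><th>Low</th><th>Volume</th></tr>
  <tr><td>Jan 02, 2024</td><td>105.0</td><td>100.0</td><td>110.0</td><td>99.0</td><td>12000</td></tr>
  <tr><td>2024-01-03</td><td>101.0</td><td>105.0</td><td>104.0</td><td>100.0</td><td>8000</td></tr>
</table>
"""


class HeaderKeyedTable(HTMLParser):
    """Collect table rows as dicts keyed by the lowercased <th> text."""

    def __init__(self):
        super().__init__()
        self.headers = []   # header names, in page order
        self.rows = []      # one dict per data row
        self._cells = []    # <td> texts of the row being read
        self._tag = None    # "th" or "td" while inside a cell
        self._text = []     # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("th", "td"):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._tag == tag:
            text = "".join(self._text).strip()
            if tag == "th":
                self.headers.append(text.lower())
            else:
                self._cells.append(text)
            self._tag = None
        elif tag == "tr" and self._cells:
            # Pair cells with headers by position on the page, not by a fixed index.
            self.rows.append(dict(zip(self.headers, self._cells)))
            self._cells = []


# Assumed formats; the real library may accept others.
DATE_FORMATS = ("%Y-%m-%d", "%b %d, %Y", "%d-%b-%Y")


def parse_date(text):
    """Try each known format in turn and fail loudly if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def ohlc_problems(row, today):
    """Return a list of human-readable constraint violations for one row."""
    open_, high, low, close = (float(row[k]) for k in ("open", "high", "low", "close"))
    problems = []
    if high < max(open_, low, close):
        problems.append("high is below open, low, or close")
    if low > min(open_, high, close):
        problems.append("low is above open, high, or close")
    if parse_date(row["date"]) > today:
        problems.append("date is in the future")
    return problems


parser = HeaderKeyedTable()
parser.feed(SAMPLE_HTML)
report = [
    (parse_date(row["date"]), ohlc_problems(row, today=date(2024, 12, 31)))
    for row in parser.rows
]
```

Because each value is looked up by header name, moving the Close column in the sample markup does not change the result. The second sample row is deliberately inconsistent (its high is below its open), so it is the only one reported as a problem.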

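The retry policy in the list above (3 attempts with 1s and 2s delays) can be sketched as follows. This is a minimal illustration, not the library's implementation. The function name, its parameters, and the choice of which exceptions to retry are assumptions. The sleep function is injectable so the example runs without actually waiting.

```python
import time


def retry_with_backoff(fetch, attempts=3, base_delay=1.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call fetch() up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, then 2s, ...


# Simulated flaky request: fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays = []  # record the requested delays instead of sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
```

With three attempts there are at most two waits, which matches the 1s/2s figures in the list.
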
Planned REST API

The FastAPI layer is planned for Phase 4.

| Endpoint | Description |
| --- | --- |
| GET /health | Health check |
| GET /stocks | All tickers |
| GET /stocks/{symbol}/historical?start=&end= | Historical OHLCV |
| GET /stocks/{symbol}/quote | Real-time quote |
| GET /stocks/{symbol}/fundamentals | Fundamentals |
| GET /indices/{name} | Index constituents |
| GET /sectors | Sector aggregates |
| GET /debt-market | Debt instruments |
| GET /eligible-scrips | Margin eligible stocks |

All successful responses share the shape {"data": ..., "meta": {"timestamp": "...", "cached": bool}}. List responses also include "count": N in meta.

Errors use the shape {"error": {"status": 404, "code": "not_found", "message": "..."}}
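
Since the REST layer is still planned, the following is only a sketch of how the two response shapes above could be built. The helper names are hypothetical, and the ISO 8601 timestamp format is an assumption (the README does not specify one).

```python
from datetime import datetime, timezone


def success_envelope(data, cached, now=None):
    """Wrap a payload as {"data": ..., "meta": {...}}; lists also get a count."""
    now = now or datetime.now(timezone.utc)
    meta = {"timestamp": now.isoformat(), "cached": cached}
    if isinstance(data, list):
        meta["count"] = len(data)
    return {"data": data, "meta": meta}


def error_envelope(status, code, message):
    """Wrap an error as {"error": {"status": ..., "code": ..., "message": ...}}."""
    return {"error": {"status": status, "code": code, "message": message}}


# Fixed timestamp so the example output is reproducible.
fixed = datetime(2024, 1, 1, tzinfo=timezone.utc)
listing = success_envelope(["ENGRO", "LUCK"], cached=True, now=fixed)
single = success_envelope({"symbol": "LUCK"}, cached=False, now=fixed)
missing = error_envelope(404, "not_found", "unknown symbol")
```

Here `listing["meta"]` carries `"count": 2`, while `single["meta"]` has no count because its payload is not a list.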


Development Status

See the roadmap issue for the full phase breakdown.

  • ✅ Phase 0 — PSX endpoint research and HTML fixture capture
  • ✅ Phase 0.5 — Repository setup, CI/CD, community files
  • ✅ Phase 2 — Core engineering (BaseScraper, parsers, cache, utils)
  • ✅ Phase 3 — Scrapers (historical, real-time, indices, sectors, fundamentals, screener, debt, eligible scrips)
  • ✅ Phase 3 API — Public Python package interface
  • 🔲 Phase 4 — FastAPI REST layer
  • 🔲 Phase 5 — Full test suite (API layer tests pending)
  • ✅ Phase 6 — Packaging & PyPI publish
  • 🔲 Phase 7 — Documentation

Contributing

Contributions are welcome. See CONTRIBUTING.md and open an issue before starting non-trivial work.


Architecture

See ARCHITECTURE.md for the component diagram, data flow, and design decisions.
