Merged
27 changes: 13 additions & 14 deletions README.md
@@ -125,6 +125,13 @@ README.md
- ExchangeRate API for live currency conversion
- yfinance for historical market-data retrieval and analytics

### Storage

- DuckDB — local analytical storage for normalized historical market data

>[!Note]
> See docs/storage.md for details.

---

## Planned / Future Tech Stack
@@ -138,40 +145,32 @@ Planned or likely future technologies include:
- Frankfurter API for historical FX data
- possible additional market-data APIs later

### Data processing

- possibly Polars later for larger datasets

### Storage

- PostgreSQL
- DuckDB
- Parquet
- optional cloud storage

### Visualization and UI

- NiceGUI
- Django

### DevOps and deployment

- Docker Compose
- cloud deployment later
- Travis CI

### Cloud and data engineering

- Azure, GCP or AWS depending on project direction
- Azure
- scheduled ingestion
- data quality checks
- reporting pipelines
- agentic workflows
- Blob Storage
- scaled analysis

### AI and agentic workflows

- LLM-assisted summaries
- RAG over stored reports or notes
- agentic data checks
- anomaly monitoring
- human-in-the-loop signal review

> [!CAUTION]
> AI and agentic features are future-stage ideas.
Binary file added argus_probe.duckdb
Binary file not shown.
143 changes: 86 additions & 57 deletions docs/roadmap.md
@@ -27,92 +27,121 @@ Scope:
Outcome:
Sprint 1 established the local ARGUS foundation with package structure, GUI prototype, analytics prototype, tests, documentation, CI, Dependabot and governance files.

### Sprint 2 — Market Analytics & Data Source Expansion
### Sprint 2 — Reporting & Market Analytics Foundation

**Status:** In progress

Move from simple FX conversion toward broader market analytics.
Move ARGUS from a simple FX-focused prototype toward a first usable market analytics and reporting tool.

Scope:
**Scope:**

- Add stronger market analytics metrics:

- Add stronger market metrics:
- cumulative return
- strongest / weakest day
- rolling volatility
- performance analytics
- risk analytics
- Extend the current dashboard without adding unnecessary chart noise
- Add or evaluate new data clients:
- Frankfurter for historical FX data
- basic performance analytics
- basic risk analytics
- Add or improve real market data support:

- yfinance for broader market data
- Replace or reduce dependency on the current ExchangeRate API where needed
- existing FX conversion remains available where useful
- Improve pandas-based analysis workflows
- Add tests for metric calculations and data transformations
- Document metric definitions, assumptions and chart behavior
- Introduce local storage for historical market data
- Add report generation and export
- Add a first simple prediction feature
- Introduce NiceGUI as the next GUI direction
- Extend the current dashboard with real market analytics
- Add tests for metric calculations, data transformations and storage behavior
- Improve CI/CD with first deployment or release automation steps

Outcome:
ARGUS becomes a basic market analytics tool, not only a converter.
**Outcome:**

ARGUS becomes a basic market analytics and reporting tool.
Users can fetch market data, store it locally, calculate metrics, generate a first report and view results through a first modern dashboard.

### Sprint 3 — Storage, Web-Ready UI & Data Architecture
---

### Sprint 3 — Advanced Local Analytics & Product Quality

**Status:** Planned

Prepare ARGUS for persistent data workflows and a stronger product interface.
Expand the local ARGUS application into a stronger analytics product with better data handling, UI structure, predictions and quality checks.

Scope:
**Scope:**

- Add local storage layer:
- PostgreSQL, DuckDB, SQLite or Parquet depending on use case
- Store historical market data
- Separate ingestion, transformation, analytics and presentation layers more clearly
- Start NiceGUI as the main web-ready UI direction
- Keep Tkinter as legacy/prototype unless still useful
- Keep CLI as internal/debug interface only
- Add clearer architecture documentation
- Prepare the project for larger data workflows and external contributors
- Extend the local storage layer
- Add a first local ETL workflow
- Improve the NiceGUI dashboard structure and usability
- Explore how NiceGUI can later interact with a more modern frontend stack such as Django, React or Node.js-based services
- Keep Tkinter as legacy/prototype unless it is no longer useful
- Add more metrics, instruments and prediction features
- Improve report templates and report structure
- Introduce first LLM-based summaries for generated reports
- Add first performance tests
- Introduce Snyk or another dependency/security scanning workflow
- Improve code quality, test coverage and maintainability

Outcome:
ARGUS has a clearer data architecture and starts moving from local prototype toward a scalable analytics application.
**Outcome:**

### Sprint 4 — Cloud, Pipelines & Portfolio-Grade Data Engineering
ARGUS becomes a more scalable local analytics application.
It can process more instruments, produce better reports, provide first automated summaries and offer more reliable insight into market data.

**Status:** Future
---

Turn ARGUS into a stronger end-to-end data engineering project.
### Sprint 4 — Extended Analysis & Cloud-Ready Foundation

Scope:
**Status:** Planned

- Docker / Docker Compose
- Scheduled data ingestion
- Cloud storage or cloud database
- CI/CD improvements
- Data quality checks
- Basic pipeline orchestration
- Reporting layer
- Architecture diagram
- Deployment documentation
Prepare ARGUS for deeper analysis, cloud interaction and future portfolio-assistant workflows while keeping the local product usable and transparent.

Target workflow:
**Scope:**

```text
API → Ingestion → Storage → Transformation → Analysis → Visualization → CI/CD
```
- Add Docker Compose for a more complete local development setup
- Introduce a first Azure connection, focused on simple storage or artifact exchange
- Improve the LLM workflow
- Introduce a first RAG-ready structure for reports, notes, documentation and stored analysis artifacts
- Add data quality checks
- Improve caching and efficient storage of market data
- Add more export options for users
- Add more metrics and better metadata visualization
- Improve transparency around data sources, generated reports and analysis assumptions
- Prepare clear interfaces for future cloud and assistant workflows

### Sprint 5 — AI-Assisted Research & Agentic Monitoring
**Outcome:**

**Status:** Future vision
ARGUS becomes ready to interact with a future cloud layer.
The application can produce clearer, more transparent market analysis and prepares the foundation for retrieval-based workflows, stronger automation and future ARGUS Core integration.

Add AI support only after the data, storage, service and reporting layers are stable.
---

Scope:
### Sprint 5 — Cloud Interaction & Agentic Monitoring Foundation

- LLM-assisted report summaries
- Explanation of unusual movements
- RAG over stored market notes, reports or documentation
- Agentic checks for data quality, anomalies and recurring market scans
- Human-in-the-loop signal review
- Automated monitoring workflows
**Status:** Planned

Outcome:
Start the first cloud-connected ARGUS workflows and introduce the foundation for monitoring, agentic checks and strategy-support features.

**Scope:**

- Add first cloud workflows that extend local analysis
- Connect local ARGUS workflows with the first cloud-side services
- Extend RAG over stored market notes, reports, documentation and analysis artifacts
- Add agentic checks for:

- data quality
- anomalies
- recurring market scans
- report consistency
- Add first human-in-the-loop review workflows for signals or strategy ideas
- Add automated monitoring workflows
- Prepare the first foundations for:

- paper trading
- backtesting
- controlled strategy evaluation
- future portfolio-assistant workflows

**Outcome:**

ARGUS starts behaving like its name: a system that continuously watches market data, evaluates it and helps generate useful signals.
ARGUS and the first cloud-side services begin to interact.
ARGUS becomes useful not only as an analytics and reporting tool, but also as the first foundation for monitoring, strategy evaluation and controlled market-research workflows.
154 changes: 154 additions & 0 deletions docs/storage.md
@@ -0,0 +1,154 @@
# ARGUS Storage Layer

ARGUS uses DuckDB as the local storage layer for normalized market data.

The storage layer stores ARGUS-internal market data structures and provides reusable historical data for analytics, charts, dashboards and reports.

The storage design follows the direction described in [`docs/research-databases-and-storage.md`](research-databases-and-storage.md).

## Storage Workflow

ARGUS uses a storage-first workflow for historical market data.

```text
User / GUI / Analytics request
  -> Market data service
  -> Check DuckDB storage

If data exists:
  read stored data
  return it for analytics, charts or reports

If data is missing:
  fetch data from a client/API
  normalize the response into ARGUS-internal data
  return the normalized data
  save the normalized data in DuckDB
```

DuckDB is used to avoid unnecessary repeated API calls and to make historical market data reusable across analytics, dashboard and reporting workflows.

Fresh API data can be used immediately after normalization and is also persisted so future requests can use the local storage layer first.
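
The storage-first lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the ARGUS implementation: `Bar`, `InMemoryStore`, `get_history` and `fake_fetch` are hypothetical names, the store is a plain in-memory stand-in for the DuckDB layer, and any stored row in the requested range is treated as a hit (partial coverage is not handled).

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class Bar:
    """Hypothetical normalized record; the real ARGUS model may differ."""
    symbol: str
    day: date
    close: float


class InMemoryStore:
    """In-memory stand-in for the DuckDB-backed storage layer (assumption)."""

    def __init__(self) -> None:
        self._bars: dict[tuple[str, date], Bar] = {}

    def read(self, symbol: str, start: date, end: date) -> list[Bar]:
        hits = [
            bar for (sym, day), bar in self._bars.items()
            if sym == symbol and start <= day <= end
        ]
        return sorted(hits, key=lambda bar: bar.day)

    def save(self, bars: list[Bar]) -> None:
        for bar in bars:
            self._bars[(bar.symbol, bar.day)] = bar


# A fetch function returns data that is already normalized.
Fetch = Callable[[str, date, date], list[Bar]]


def get_history(store: InMemoryStore, fetch: Fetch,
                symbol: str, start: date, end: date) -> list[Bar]:
    """Check storage first; fetch and persist only when nothing is stored."""
    stored = store.read(symbol, start, end)
    if stored:
        return stored
    fresh = fetch(symbol, start, end)
    store.save(fresh)  # persist so the next request is served locally
    return fresh


calls: list[str] = []


def fake_fetch(symbol: str, start: date, end: date) -> list[Bar]:
    """Pretend API client; the close value is illustrative only."""
    calls.append(symbol)
    return [Bar(symbol, start, 1.08)]


store = InMemoryStore()
day = date(2024, 1, 2)
first = get_history(store, fake_fetch, "EUR/USD", day, day)   # hits the fake API
second = get_history(store, fake_fetch, "EUR/USD", day, day)  # served from storage
```

In this sketch the second request never reaches `fake_fetch`, which is the behavior the workflow aims for.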

## Schema Overview

The first storage schema is based on three related entities:

```text
data_sources
instruments
price_bars
```

### `data_sources`

Stores where market data came from.

Examples:

```text
yfinance
ExchangeRate API
Frankfurter
FRED
```

Each source describes a provider or API that can deliver market, FX or macro data.

### `instruments`

Stores what ARGUS can analyze.

Examples:

```text
EUR/USD
AAPL
SPY
BTC-USD
```

An instrument represents the internal ARGUS identity of an asset, currency pair, ETF, index or other market object.

Provider-specific symbols should be normalized before storage. For example:

```text
yfinance provider symbol: EURUSD=X
ARGUS instrument symbol: EUR/USD
```
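
One possible way to do this normalization is a lookup table keyed by provider and provider symbol. The sketch below is an assumption, not the ARGUS code: `PROVIDER_SYMBOLS` and `to_argus_symbol` are hypothetical names, and only the `EURUSD=X` entry comes from the example above.

```python
# Hypothetical mapping from (provider, provider symbol) to ARGUS symbol.
PROVIDER_SYMBOLS: dict[tuple[str, str], str] = {
    ("yfinance", "EURUSD=X"): "EUR/USD",
}


def to_argus_symbol(provider: str, provider_symbol: str) -> str:
    """Return the ARGUS symbol, falling back to the provider symbol if unmapped."""
    return PROVIDER_SYMBOLS.get((provider, provider_symbol), provider_symbol)


fx_symbol = to_argus_symbol("yfinance", "EURUSD=X")
equity_symbol = to_argus_symbol("yfinance", "AAPL")
```

Falling back to the provider symbol keeps tickers such as `AAPL` unchanged; a stricter variant could raise an error for unmapped symbols instead.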

### `price_bars`

Stores historical time-series values in an OHLCV-ready structure.

A price bar belongs to:

```text
one data source
one instrument
one timestamp
one timeframe
```

FX rates are stored as `close` values.

For simple FX data, the remaining OHLCV fields can stay empty. For broader market data, the same structure can store open, high, low, close, adjusted close and volume values.

The combination of source, instrument, timestamp and timeframe identifies a unique stored price bar.
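
A schema along these lines could look like the sketch below. It uses Python's built-in `sqlite3` module as a stand-in so that it runs without extra dependencies; ARGUS itself uses DuckDB, and the column names (`ts`, `timeframe`, `adj_close` and so on) and the sample values are assumptions for illustration only.

```python
import sqlite3

# Stand-in connection; ARGUS itself uses DuckDB rather than SQLite.
con = sqlite3.connect(":memory:")

con.execute("CREATE TABLE data_sources (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
con.execute("CREATE TABLE instruments (id INTEGER PRIMARY KEY, symbol TEXT NOT NULL UNIQUE)")
con.execute("""
    CREATE TABLE price_bars (
        source_id     INTEGER NOT NULL REFERENCES data_sources (id),
        instrument_id INTEGER NOT NULL REFERENCES instruments (id),
        ts            TEXT    NOT NULL,  -- ISO 8601 timestamp
        timeframe     TEXT    NOT NULL,  -- for example '1d'
        open          REAL,
        high          REAL,
        low           REAL,
        close         REAL,
        adj_close     REAL,
        volume        REAL,
        -- one bar per source, instrument, timestamp and timeframe
        UNIQUE (source_id, instrument_id, ts, timeframe)
    )
""")

con.execute("INSERT INTO data_sources VALUES (1, 'Frankfurter')")
con.execute("INSERT INTO instruments VALUES (1, 'EUR/USD')")

# Simple FX bar: only `close` is filled, the other OHLCV fields stay empty.
insert_bar = (
    "INSERT INTO price_bars (source_id, instrument_id, ts, timeframe, close) "
    "VALUES (?, ?, ?, ?, ?)"
)
fx_bar = (1, 1, "2024-01-02T00:00:00", "1d", 1.08)  # illustrative value
con.execute(insert_bar, fx_bar)

# Inserting the same key again violates the uniqueness rule.
try:
    con.execute(insert_bar, fx_bar)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The `UNIQUE` constraint is one way to enforce the rule that source, instrument, timestamp and timeframe identify a single bar; an upsert on the same key would be an alternative when refreshed data should overwrite stored values.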

## Internal Models and Storage

ARGUS uses internal domain models before data is stored:

```text
DataSource
Instrument
PriceBar
```

These models describe the meaning of the data inside ARGUS.

The storage layer translates these internal models into DuckDB tables:

```text
DataSource -> data_sources
Instrument -> instruments
PriceBar -> price_bars
```

In Python, a `PriceBar` references a `DataSource` and an `Instrument`.

In DuckDB, this relationship is stored through IDs:

```text
price_bars.source_id -> data_sources.id
price_bars.instrument_id -> instruments.id
```

This keeps the database normalized while still allowing ARGUS to work with meaningful internal models in Python.
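
The translation from models to rows can be sketched with dataclasses. The field names and the `to_row` helper below are hypothetical; the actual ARGUS models may carry more fields or different names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class DataSource:
    id: int
    name: str


@dataclass(frozen=True)
class Instrument:
    id: int
    symbol: str


@dataclass(frozen=True)
class PriceBar:
    # In Python the bar references full objects, not IDs.
    source: DataSource
    instrument: Instrument
    timestamp: datetime
    timeframe: str
    close: Optional[float] = None


def to_row(bar: PriceBar) -> dict:
    """Flatten a PriceBar into a price_bars row that stores only IDs."""
    return {
        "source_id": bar.source.id,
        "instrument_id": bar.instrument.id,
        "timestamp": bar.timestamp.isoformat(),
        "timeframe": bar.timeframe,
        "close": bar.close,
    }


bar = PriceBar(
    source=DataSource(id=1, name="yfinance"),
    instrument=Instrument(id=7, symbol="EUR/USD"),
    timestamp=datetime(2024, 1, 2),
    timeframe="1d",
    close=1.08,  # illustrative value
)
row = to_row(bar)
```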

## Reading Stored Data

Stored price bars can be read by:

```text
source
instrument
start date
end date
```

The storage layer joins `price_bars`, `data_sources` and `instruments` so that stored IDs become readable market data again.

Read operations return tabular data that can be used by:

```text
analytics
charts
dashboards
reports
```

This allows ARGUS to process stored historical data without depending on raw API response structures.
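
A read of this kind might look like the following sketch. As an assumption for illustration, it uses Python's built-in `sqlite3` module as a stand-in for DuckDB, a reduced version of the tables, and made-up sample values; `READ_BARS` and `read_bars` are hypothetical names.

```python
import sqlite3

# Stand-in connection; ARGUS itself uses DuckDB rather than SQLite.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_sources (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE instruments (id INTEGER PRIMARY KEY, symbol TEXT)")
con.execute(
    "CREATE TABLE price_bars ("
    "source_id INTEGER, instrument_id INTEGER, ts TEXT, timeframe TEXT, close REAL)"
)
con.execute("INSERT INTO data_sources VALUES (1, 'yfinance')")
con.execute("INSERT INTO instruments VALUES (1, 'AAPL')")
con.executemany(
    "INSERT INTO price_bars VALUES (1, 1, ?, '1d', ?)",
    [("2024-01-02", 10.0), ("2024-01-03", 11.0), ("2024-01-04", 12.0)],  # sample data
)

# Join the three tables so stored IDs become readable names again.
READ_BARS = """
    SELECT s.name, i.symbol, b.ts, b.timeframe, b.close
    FROM price_bars AS b
    JOIN data_sources AS s ON s.id = b.source_id
    JOIN instruments AS i ON i.id = b.instrument_id
    WHERE s.name = ? AND i.symbol = ? AND b.ts BETWEEN ? AND ?
    ORDER BY b.ts
"""


def read_bars(con, source: str, symbol: str, start: str, end: str) -> list[tuple]:
    """Return stored bars for one source and instrument within a date range."""
    return con.execute(READ_BARS, (source, symbol, start, end)).fetchall()


rows = read_bars(con, "yfinance", "AAPL", "2024-01-02", "2024-01-03")
```

The result is a list of plain tuples ordered by timestamp, which can be handed to analytics or charting code without any knowledge of the original API response.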