Commit 693be80

committed
2026-03-04 docs: benchmark OWA 0.877 sync, experiment/test code links, version 0.0.12
1 parent faeef3a commit 693be80

8 files changed

Lines changed: 157 additions & 14 deletions

File tree

docs/benchmarks.ko.md

Lines changed: 34 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -61,7 +61,7 @@ Vectrix outperforms Naive2 in **all four M3 categories**, and M3 Mont
6161
| Item | Version / Spec |
6262
|------|-------------|
6363
| Python | 3.10+ |
64-
| Vectrix | 0.0.10 |
64+
| Vectrix | 0.0.12 |
6565
| OS | Windows 11 / Ubuntu 22.04 / macOS 14+ |
6666
| CPU | x86_64 or ARM64 |
6767
| RAM | 8 GB or more |
@@ -72,6 +72,38 @@ Vectrix outperforms Naive2 in **all four M3 categories**, and M3 Mont
7272
pip install vectrix
7373
```
7474

75-
M4 benchmark experiment script: `src/vectrix/experiments/modelCreation/019_dotHybridEngine.py`
75+
### Experiment Code
76+
77+
All experiments are fully reproducible Python scripts, with results recorded in their docstrings.
78+
79+
| Experiment | Description | Source |
80+
|:-----|:-----|:-----|
81+
| E019 | DOT-Hybrid engine M4 100K verification | [019_dotHybridEngine.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/019_dotHybridEngine.py) |
82+
| E042 | M4 official OWA verification | [042_m4OfficialOwa.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/042_m4OfficialOwa.py) |
83+
| E043 | Holdout validation + auto period detection | [043_dotAutoPeriodHoldout.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/043_dotAutoPeriodHoldout.py) |
84+
| E044 | Daily/Weekly specialist strategies | [044_dailyWeeklySpecialist.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/044_dailyWeeklySpecialist.py) |
85+
| E045 | Integrated improvement verification | [045_integratedImprovement.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/045_integratedImprovement.py) |
86+
| E046 | Final integration rule validation | [046_finalIntegration.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/046_finalIntegration.py) |
87+
88+
Full experiment status and research notes: [STATUS.md](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/STATUS.md)
89+
90+
### Test Suite
91+
92+
573 tests, 5 skipped, covering all engines, models, and pipeline components.
93+
94+
```bash
95+
pip install vectrix
96+
pytest tests/ -x -q
97+
```
98+
99+
| Test Module | Count | Coverage |
100+
|:------------|:----:|:-----|
101+
| [test_all_models.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_all_models.py) | 112 | All 30+ forecasting models |
102+
| [test_new_models.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_new_models.py) | 45 | DTSF, ESN, 4Theta engines |
103+
| [test_engine_utils.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_engine_utils.py) | 55 | ARIMAX, CV, decomposition |
104+
| [test_easy.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_easy.py) | 33 | Easy API (forecast, analyze, regress) |
105+
| [test_business.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_business.py) | 45 | Anomaly, backtesting, metrics, scenarios |
106+
| [test_adaptive.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_adaptive.py) | 20 | Regime, DNA, self-healing, constraints |
107+
| [test_regression.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_regression.py) | 22 | OLS, Ridge, Lasso, diagnostics |
76108

77109
The M4 data files can be downloaded from the [M4 Competition repository](https://github.com/Mcompetitions/M4-methods).

docs/benchmarks.md

Lines changed: 33 additions & 1 deletion
@@ -60,6 +60,38 @@ Vectrix consistently outperforms Naive2 across all M3 categories, with the stron
6060
pip install vectrix
6161
```
6262

63-
M4 benchmark experiment: `src/vectrix/experiments/modelCreation/019_dotHybridEngine.py`
63+
### Experiment Code
64+
65+
All experiments are fully reproducible Python scripts with results recorded in docstrings.
66+
67+
| Experiment | Description | Source |
68+
|:-----------|:------------|:-------|
69+
| E019 | DOT-Hybrid engine M4 100K verification | [019_dotHybridEngine.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/019_dotHybridEngine.py) |
70+
| E042 | M4 official OWA verification | [042_m4OfficialOwa.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/042_m4OfficialOwa.py) |
71+
| E043 | Holdout validation + auto period detection | [043_dotAutoPeriodHoldout.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/043_dotAutoPeriodHoldout.py) |
72+
| E044 | Daily/Weekly specialist strategies | [044_dailyWeeklySpecialist.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/044_dailyWeeklySpecialist.py) |
73+
| E045 | Integrated improvement verification | [045_integratedImprovement.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/045_integratedImprovement.py) |
74+
| E046 | Final integration rule validation | [046_finalIntegration.py](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/046_finalIntegration.py) |
75+
76+
Full experiment status and research notes: [STATUS.md](https://github.com/eddmpython/vectrix/blob/master/src/vectrix/experiments/modelCreation/STATUS.md)
77+
78+
### Test Suite
79+
80+
573 tests, 5 skipped — covering all engines, models, and pipeline components.
81+
82+
```bash
83+
pip install vectrix
84+
pytest tests/ -x -q
85+
```
86+
87+
| Test Module | Count | Coverage |
88+
|:------------|:-----:|:---------|
89+
| [test_all_models.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_all_models.py) | 112 | All 30+ forecasting models |
90+
| [test_new_models.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_new_models.py) | 45 | DTSF, ESN, 4Theta engines |
91+
| [test_engine_utils.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_engine_utils.py) | 55 | ARIMAX, CV, decomposition |
92+
| [test_easy.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_easy.py) | 33 | Easy API (forecast, analyze, regress) |
93+
| [test_business.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_business.py) | 45 | Anomaly, backtesting, metrics, scenarios |
94+
| [test_adaptive.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_adaptive.py) | 20 | Regime, DNA, self-healing, constraints |
95+
| [test_regression.py](https://github.com/eddmpython/vectrix/blob/master/tests/test_regression.py) | 22 | OLS, Ridge, Lasso, diagnostics |
6496

6597
> **Tip:** For faster M4 data loading, download the CSV files directly from the [M4 Competition repository](https://github.com/Mcompetitions/M4-methods) rather than using `M4.load()`, which can be slow due to wide-to-long data transformation.
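
The OWA figures quoted throughout this commit follow the M4 Competition definition: the mean of the model's sMAPE and MASE, each taken relative to the Naive2 baseline. The sketch below is a minimal NumPy illustration of that formula, not the code in the linked experiment scripts; the function names `smape`, `mase`, and `owa` are assumptions made for this example.

```python
import numpy as np


def smape(actual, forecast):
    """Symmetric MAPE in percent, as used in M4 (0/0 terms count as 0)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denom = np.abs(actual) + np.abs(forecast)
    ratio = np.divide(
        2.0 * np.abs(actual - forecast),
        denom,
        out=np.zeros_like(denom),
        where=denom > 0,
    )
    return 100.0 * ratio.mean()


def mase(insample, actual, forecast, period=1):
    """MAE of the forecast scaled by the in-sample seasonal-naive MAE."""
    insample = np.asarray(insample, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    scale = np.mean(np.abs(insample[period:] - insample[:-period]))
    return np.mean(np.abs(actual - forecast)) / scale


def owa(smape_model, mase_model, smape_naive2, mase_naive2):
    """Overall Weighted Average: below 1.0 means better than Naive2."""
    return 0.5 * (smape_model / smape_naive2 + mase_model / mase_naive2)
```

In M4, sMAPE and MASE are first averaged over all series in a group and only then divided by the Naive2 averages, so `owa` expects group-level means rather than per-series values.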

landing/src/lib/components/sections/Benchmarks.svelte

Lines changed: 3 additions & 3 deletions
@@ -4,8 +4,8 @@
44
55
const rows = [
66
{ comp: 'M3', yearly: '0.848', quarterly: '0.825', monthly: '0.758', weekly: '', daily: '', hourly: '0.819' },
7-
{ comp: 'M4', yearly: '0.974', quarterly: '0.797', monthly: '0.987', weekly: '0.737', daily: '1.207', hourly: '1.006' },
8-
{ comp: 'M4 Ensemble', yearly: '0.879', quarterly: '0.797', monthly: '0.927', weekly: '0.737', daily: '1.105', hourly: '0.696' }
7+
{ comp: 'M4 DOT-Hybrid', yearly: '0.797', quarterly: '0.894', monthly: '0.897', weekly: '0.959', daily: '0.996', hourly: '0.722' },
8+
{ comp: 'M4 VX-Ensemble', yearly: '0.879', quarterly: '0.907', monthly: '0.919', weekly: '0.954', daily: '0.996', hourly: '0.696' }
99
];
1010
1111
function cellClass(val: string): string {
@@ -50,7 +50,7 @@
5050
</div>
5151

5252
<p class="mt-3 text-xs text-vx-text-dim">
53-
M4 Ensemble uses VX-Ensemble with DOT + AutoCES + 4Theta + DTSF + ESN. Hourly 0.696 OWA = competition winner level.
53+
DOT-Hybrid: single model, AVG OWA 0.877. VX-Ensemble: DOT + AutoCES + 4Theta. Hourly 0.696 OWA = competition winner level.
5454
</p>
5555
<Button variant="secondary" size="sm" href="{base}/docs/benchmarks" class="mt-6">
5656
View full benchmark results →

llms-full.txt

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
22

33
> Zero-config time series forecasting library for Python. 30+ statistical models with automatic selection, built-in Rust engine (25 accelerated functions), adaptive intelligence, full regression suite, and business analytics.
44

5-
- Version: 0.0.8
5+
- Version: 0.0.12
66
- Python: >=3.10
77
- Core deps: numpy>=1.24, pandas>=2.0, scipy>=1.10
88
- Install: `pip install vectrix`
@@ -188,7 +188,7 @@ from vectrix.engine.baseline import NaiveModel, SeasonalNaiveModel, MeanModel, R
188188
| AutoETS | AutoETS | General purpose, trend+seasonal |
189189
| AutoARIMA | AutoARIMA | Stationary with complex autocorrelation |
190190
| Theta | OptimizedTheta | Simple trend extrapolation |
191-
| DOT | DynamicOptimizedTheta | General purpose (M4 OWA 0.905) |
191+
| DOT | DynamicOptimizedTheta | General purpose (M4 OWA 0.877) |
192192
| AutoCES | AutoCES | General purpose (M4 OWA 0.927) |
193193
| AutoMSTL | AutoMSTL | Multiple seasonality (daily, weekly, yearly) |
194194
| AutoTBATS | AutoTBATS | Complex multi-seasonal |
@@ -524,7 +524,7 @@ print(TURBO_AVAILABLE) # True if Rust engine loaded (default for all pip instal
524524
```
525525

526526
### Performance benchmarks (M4 Competition)
527-
- DOT: OWA 0.905 (general purpose best)
527+
- DOT: OWA 0.877 (general purpose best)
528528
- AutoCES: OWA 0.927
529529
- 4Theta Yearly: OWA 0.879 (= M4 official #11)
530530
- VX-Ensemble Hourly: OWA 0.696 (winner-level)

llms.txt

Lines changed: 18 additions & 2 deletions
@@ -2,7 +2,7 @@
22

33
> Zero-config time series forecasting library for Python. 30+ statistical models, automatic selection, built-in Rust engine. NumPy/SciPy/Pandas with adaptive intelligence, regression, and business analytics.
44

5-
- Version: 0.0.8
5+
- Version: 0.0.12
66
- Python: 3.10+
77
- Dependencies: numpy, pandas, scipy (core only)
88
- Install: `pip install vectrix`
@@ -35,10 +35,26 @@ print(reg.summary())
3535
- [Installation](https://eddmpython.github.io/vectrix/docs/getting-started/installation/): Setup guide (Rust engine built-in)
3636
- [Quickstart](https://eddmpython.github.io/vectrix/docs/getting-started/quickstart/): 5-minute tutorial
3737

38+
## Benchmarks
39+
40+
M4 Competition 100,000 time series (DOT-Hybrid single model):
41+
42+
| Frequency | OWA |
43+
|-----------|-----|
44+
| Yearly | 0.797 |
45+
| Quarterly | 0.894 |
46+
| Monthly | 0.897 |
47+
| Weekly | 0.959 |
48+
| Daily | 0.996 |
49+
| Hourly | 0.722 |
50+
| **AVG** | **0.877** |
51+
52+
Beats M4 #18 Theta (0.897). Full results: [benchmarks](https://eddmpython.github.io/vectrix/docs/benchmarks/)
53+
3854
## API Reference
3955

4056
- [Forecasting Guide](https://eddmpython.github.io/vectrix/docs/guide/forecasting/): Detailed forecasting workflows
41-
- [Model Catalog](https://eddmpython.github.io/vectrix/docs/guide/models/): All 30+ models with parameters
57+
- [Analysis & DNA](https://eddmpython.github.io/vectrix/docs/guide/analysis/): Time series profiling, 65+ features
4258
- [Regression Guide](https://eddmpython.github.io/vectrix/docs/guide/regression/): OLS, Ridge, Lasso, Huber, Quantile
4359
- [Adaptive Intelligence](https://eddmpython.github.io/vectrix/docs/guide/adaptive/): Regime detection, self-healing, DNA
4460
- [Business Analytics](https://eddmpython.github.io/vectrix/docs/guide/business/): Anomaly, scenarios, backtesting
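
As a quick sanity check on the Benchmarks table added to llms.txt above, the AVG row appears to be the unweighted mean of the six per-frequency OWA values. That reading is an assumption: the diff does not state how AVG is computed, and an average weighted by series count would give a different number.

```python
# Per-frequency OWA values copied from the llms.txt Benchmarks table.
owa_by_frequency = {
    "Yearly": 0.797,
    "Quarterly": 0.894,
    "Monthly": 0.897,
    "Weekly": 0.959,
    "Daily": 0.996,
    "Hourly": 0.722,
}

# Assumption: AVG is a plain arithmetic mean, not weighted by series count.
avg_owa = sum(owa_by_frequency.values()) / len(owa_by_frequency)
print(f"unweighted mean OWA: {avg_owa:.4f}")
```

The mean of the rounded inputs is about 0.8775, which agrees with the reported 0.877 to within the rounding of the table entries.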

src/vectrix/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@
8282
)
8383
from .vectrix import Vectrix
8484

85-
__version__ = "0.0.11"
85+
__version__ = "0.0.12"
8686
__all__ = [
8787
"Vectrix",
8888
"ForecastResult",

src/vectrix/experiments/modelCreation/STATUS.md

Lines changed: 64 additions & 1 deletion
@@ -241,7 +241,7 @@
241241
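- **Conclusion**: The ensemble itself is worse than DOT-only (DOT is already optimized)
242242

243243
### E031-E040 Overall Conclusions
244-
1. **DOT-Hybrid (0.885) is the practical limit of pure statistical models**
244+
1. **DOT-Hybrid (0.877, after applying holdout) is the practical limit of pure statistical models**
245245
2. **Best meta-learning result = 0.873** (requires scikit-learn, not yet incorporated)
246246
3. **Ensembles are worse than DOT-only**: DOT is already sufficiently optimized
247247
4. **Reaching M4 #1 (0.821) requires a DL hybrid**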
@@ -299,8 +299,71 @@
299299
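- **Honest ranking**: roughly 14th-15th by the official M4 criteria (better than Theta at 0.897)
300301
- Note: based on an 11K sample; on the full 100K the result may differ slightly because Monthly (48K) carries more weight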
301301

302+
## 043~046: DOT Holdout Validation Experiments (2026-03-04)
303+
304+
### 043 DOT Auto Period Detection + Holdout Validation
305+
306+
| Variant | Yearly | Quarterly | Monthly | Weekly | Daily | Hourly | **AVG** |
307+
|------|--------|-----------|---------|--------|-------|--------|---------|
308+
| baseline | 0.7971 | 0.9053 | 0.9200 | 0.9587 | 0.9949 | 0.7223 | **0.8831** |
309+
| auto_period | 0.8019 | 0.9053 | 0.9200 | 0.9952 | 1.0220 | 0.7223 | **0.8944** |
310+
| **holdout_val** | 0.8064 | **0.8940** | **0.8965** | **0.9457** | 0.9918 | 0.7223 | **0.8761** |
311+
| combined | 0.8084 | 0.8940 | 0.8965 | 0.9831 | 1.0187 | 0.7223 | **0.8872** |
312+
313+
- **auto_period: rejected.** ACF detects spurious short periods (2, 3) in noise; Daily worsens by 2.7% and Weekly by 3.8%
314+
- **holdout_val: conditionally adopted.** Quarterly improves by 1.25% and Monthly by 2.55%; Yearly regresses by 1.2% (less training data)
315+
- **combined: rejected.** auto_period cancels out the holdout gains
316+
317+
### 044 Daily/Weekly Specialist
318+
319+
- **Weekly classic_only: adopted** (-2.18%). Classic DOT outperforms Hybrid at period=1
320+
- **Daily classic_only: rejected** (+0.98%)
321+
- **Core3 ensemble Daily/Weekly: rejected** (+21%/+8%). CES/4Theta are harmful at period=1
322+
323+
### 045 Integrated Improvement (holdout + Weekly classic)
324+
325+
- **AVG 0.8831→0.8748 (-0.94%)**: overall improvement
326+
- **Yearly regresses by 1.16%**: holdout shrinks the training data for short series
327+
328+
### 046 Final Integration (split by period)
329+
330+
- **period<=1 classic + period>1 holdout: rejected.** Critical Yearly regression of +11.26%!
331+
- **Key finding**: for Yearly (period=1), the 8-way Hybrid is better at trend search, so classic cannot be applied
332+
- **Final rule**: apply holdout validation only when period>1 (only Quarterly/Monthly improve)
333+
334+
### E043-E046 Overall Conclusions
335+
1. **Holdout validation helps only on seasonal data with period>1** (Quarterly -1.25%, Monthly -2.55%)
336+
2. **ACF auto period detection is harmful**: it detects spurious periods in noise
337+
3. **It is safest to leave period=1 data untouched**: Yearly/Daily/Weekly all keep the existing approach
338+
4. **The Core3 ensemble is harmful at period=1**: CES/4Theta are weak on non-seasonal data
339+
340+
### Changes applied to dot.py (v0.0.12)
341+
- `_fitHybrid()`: holdout validation only when `period > 1 and n >= period * 4`
342+
- Added the `_predictVariantSteps()` helper method
343+
- Refit on the full data after holdout
344+
- **DOT-Hybrid AVG OWA: 0.885 → 0.877** (only period>1 improves; the rest is unchanged)
345+
- Tests: 573 passed, 5 skipped
346+
347+
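## Completed Steps
348+
- [x] Modularized 3 models into engine/ (fit/predict/residuals interface)
349+
- [x] Registered model info in types.py
350+
- [x] Added the new models to _selectNativeModels in vectrix.py
351+
- [x] Confirmed all 573 existing tests pass
352+
- [x] Completed 012 M4 100K benchmark
353+
- [x] 013~015: experiments on 3 novel ensemble/forecasting principles (all rejected)
354+
- [x] Completed 016~018 DOT enhancement + SCUM experiments
355+
- [x] Integrated DOT-Hybrid into engine/dot.py (period<24: DOT++, period>=24: classic)
356+
- [x] Added Rust dot_hybrid_objective (26th function)
357+
- [x] Completed 019 integrated engine M4 100K verification (OWA 0.885)
358+
- [x] Completed 031~040: 10 experiments on FFORMA meta-learning + model selection optimization
359+
- [x] Removed auto_arima from the default pool
360+
- [x] 041 conditional ensemble verification → core3-first ensemble applied to the engine (AVG 0.885→0.879)
361+
- [x] 042 M4 official OWA verification → found a problem in the benchmark methodology
362+
- [x] 043~046 DOT holdout validation experiments → period>1 holdout applied to the engine (AVG 0.885→0.877)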
363+
302364
## Next Steps
303365
- [ ] Explore DL hybrids (NeuralForecast/TimesFM) → aim for M4 #1 (0.821)
366+
- [ ] Improve Daily OWA 0.996 (strategy for non-seasonal period=1 data)
304367
- [ ] Improve 4Theta seasonality handling (weak on Quarterly/Monthly/Weekly/Daily)
305368
- [ ] Improve DTSF performance on short series (weak when n<100)
306369
- [ ] Auto-tune ESN reservoir size (slow on long series)
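
The rule recorded in the STATUS.md notes above (holdout validation only when `period > 1 and n >= period * 4`, followed by a refit on the full data) can be sketched as follows. This is an illustrative outline, not the actual `_fitHybrid()` code: the helper names, the choice of one period as the holdout length, the MAE criterion, and the two toy candidate forecasters are all assumptions made for the example.

```python
import numpy as np


def should_use_holdout(n, period):
    """Gate from the STATUS.md note: seasonal series with at least 4 full periods."""
    return period > 1 and n >= period * 4


def select_variant_by_holdout(y, period, candidates):
    """Pick the candidate with the lowest MAE on a held-out tail.

    `candidates` maps a name to a callable (train, steps) -> forecast array.
    Returns None when the gate fails, so the caller keeps its existing
    selection logic (hypothetical contract for this sketch).
    """
    y = np.asarray(y, dtype=float)
    if not should_use_holdout(len(y), period):
        return None
    horizon = period  # assumption: hold out one full seasonal period
    train, valid = y[:-horizon], y[-horizon:]
    errors = {
        name: float(np.mean(np.abs(predict(train, horizon) - valid)))
        for name, predict in candidates.items()
    }
    return min(errors, key=errors.get)


# Toy usage with two stand-in forecasters (not Vectrix models).
period = 4
y = np.tile([1.0, 5.0, 2.0, 8.0], 6)
candidates = {
    "naive": lambda train, steps: np.full(steps, train[-1]),
    "seasonal_naive": lambda train, steps: np.resize(train[-period:], steps),
}
best = select_variant_by_holdout(y, period, candidates)
# Refit step: the winning variant is re-run on the full series.
forecast = candidates[best](y, period)
```

Returning `None` when the gate fails mirrors the note that period=1 series (Yearly/Daily/Weekly) are left on the existing path rather than forced through holdout selection.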

src/vectrix/vectrix.py

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ class Vectrix:
6969
Dependencies: numpy, pandas, scipy (required), numba (optional)
7070
"""
7171

72-
VERSION = "0.0.11"
72+
VERSION = "0.0.12"
7373

7474
NATIVE_MODELS = {
7575
'auto_ets': {
