20 commits
04bd263
Survey Phase 7: CS IPW/DR covariates, repeated cross-sections, Honest…
igerber Mar 28, 2026
4bf566d
Address CI review: RCS IF corrections, aggregation weights, replicate…
igerber Mar 29, 2026
6080f92
Fix DR RC normalizer mismatch, holistic RCS cohort-mass weighting, un…
igerber Mar 29, 2026
b623dee
Rewrite RC reg/DR to match DRDID::reg_did_rc and DRDID::drdid_rc form…
igerber Mar 29, 2026
3b405b7
Fix bootstrap RCS cohort-mass weighting, reset stale event-study VCV
igerber Mar 29, 2026
53cfd5d
Clear analytical event_study_vcov when bootstrap overwrites event-stu…
igerber Mar 29, 2026
9ff21a2
Fix RC IF normalization scaling: M1 uses n_all denominator, PS M2 use…
igerber Mar 29, 2026
c2f8fdc
Document RCS IF phi=psi/n convention, add analytical-vs-bootstrap SE …
igerber Mar 29, 2026
cb3f815
Refactor RC IFs to R's psi convention, fix HonestDiD VCV subsetting
igerber Mar 29, 2026
7e127fb
Resolve merge conflict, match R colMeans convention in panel IPW/DR M…
igerber Mar 29, 2026
eac680e
Match R's H/n, asy_rep/n, colMeans convention for panel PS correction…
igerber Mar 29, 2026
9893454
Fix VCV index alignment, add stationarity warning for panel=False
igerber Mar 29, 2026
4415034
Document panel DR control-augmentation normalization deviation from D…
igerber Mar 29, 2026
1c35440
Warn on non-universal base period in HonestDiD CS path, update tests
igerber Mar 29, 2026
867cd51
Fix panel M2 full-sample colMeans, add HonestDiD consecutive event-ti…
igerber Mar 29, 2026
9f3cab4
Fix HonestDiD grid validator for reference-period gap, defensive boot…
igerber Mar 29, 2026
c529053
HonestDiD: raise ValueError on non-consecutive event-time grid (was w…
igerber Mar 29, 2026
e9995ef
HonestDiD: validate full grid around reference period, not just withi…
igerber Mar 29, 2026
1f8a537
Fix HonestDiD: reference-aware pre/post split, replicate df=0 sentinel
igerber Mar 29, 2026
c5015c7
Fix _estimate_max_pre_violation to use reference-aware pre_periods
igerber Mar 29, 2026
33 changes: 25 additions & 8 deletions ROADMAP.md
@@ -8,19 +8,42 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).

## Current Status

diff-diff v2.6.0 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
diff-diff v2.7.5 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis, plus **unique survey support** — design-based variance estimation (Taylor linearization, replicate weights) integrated across all estimators. No R or Python package offers this combination:

- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022), Stacked DiD (Wing et al. 2024), Continuous DiD (Callaway, Goodman-Bacon & Sant'Anna 2024)
- **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
- **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
- **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
- **Study design**: Power analysis tools
- **Data utilities**: Real-world datasets (Card-Krueger, Castle Doctrine, Divorce Laws, MPDTA), DGP functions for all supported designs
- **Survey support**: Full `SurveyDesign` with strata, PSU, FPC, weight types, replicate weights (BRR/Fay/JK1/JKn), Taylor linearization, DEFF diagnostics, subpopulation analysis — integrated across all estimators (see [survey-roadmap.md](docs/survey-roadmap.md))
- **Performance**: Optional Rust backend for accelerated computation; faster than R at scale (see [CHANGELOG.md](CHANGELOG.md) for benchmarks)

---

## Near-Term Enhancements (v2.7)
## Near-Term Enhancements (v2.8)

### Survey Phase 7: Completing the Survey Story

Close the remaining gaps for practitioners using major population surveys
(ACS, CPS, BRFSS, MEPS). See [survey-roadmap.md](docs/survey-roadmap.md) for
full details.

- **CS Covariates + IPW/DR + Survey** *(High priority)*: Implement DRDID
nuisance IF corrections under survey weights. Currently the recommended DR
method raises `NotImplementedError` with covariates + survey. This is the
most commonly needed path in applied work (Medicaid expansion, minimum wage).
- **Repeated Cross-Sections** *(High priority)*: `panel=False` support for
CallawaySantAnna, enabling analysis of surveys that don't track units over
time (BRFSS, ACS annual, CPS monthly). Uses cross-sectional DRDID
(Sant'Anna & Zhao 2020, Section 4).
- **Survey-Aware DiD Tutorial** *(High priority)*: Jupyter notebook
demonstrating the full workflow with realistic survey data. diff-diff is
the only package (R or Python) with design-based variance for modern DiD
— this makes that capability discoverable.
- **HonestDiD + Survey Variance** *(Medium priority)*: Pass survey vcov
(TSL or replicate) into sensitivity analysis instead of cluster-robust vcov,
so sensitivity bounds respect the same variance structure as main estimates.
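
To make the "replicate" variance in the last item concrete, here is a minimal sketch of a JK1 replicate variance-covariance matrix for a vector of estimates (for example, event-study coefficients). It is an illustration only, not diff-diff's implementation: the function name `jk1_replicate_vcov` and its arguments are hypothetical, and it assumes deviations are centered on the full-sample estimate rather than on the replicate mean.

```python
import numpy as np


def jk1_replicate_vcov(theta_full, theta_reps):
    """JK1 replicate variance-covariance of a vector of estimates.

    theta_full : (k,) full-sample estimates.
    theta_reps : (R, k) the same estimates recomputed under each replicate weight.
    Assumption: deviations are centered on the full-sample estimate.
    """
    theta_full = np.asarray(theta_full, dtype=float)
    theta_reps = np.asarray(theta_reps, dtype=float)
    n_reps = theta_reps.shape[0]
    dev = theta_reps - theta_full   # (R, k) replicate deviations
    scale = (n_reps - 1) / n_reps   # JK1 scaling factor
    return scale * dev.T @ dev      # (k, k) covariance matrix


# Toy inputs: two coefficients, four replicates.
theta = np.array([1.0, 2.0])
reps = np.array([[1.1, 2.0], [0.9, 2.0], [1.0, 2.2], [1.0, 1.8]])
vcov = jk1_replicate_vcov(theta, reps)
```

A matrix of this shape is what a sensitivity analysis would take in place of a cluster-robust vcov. Other replicate schemes (BRR, Fay, JKn) differ in the scaling factor.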

### Staggered Triple Difference (DDD)

@@ -32,12 +55,6 @@ Extend the existing `TripleDifference` estimator to handle staggered adoption se

**Reference**: [Ortiz-Villavicencio & Sant'Anna (2025)](https://arxiv.org/abs/2505.09942). *Working Paper*. R package: `triplediff`.

### Enhanced Visualization

- Synthetic control weight visualization (bar chart of unit weights)
- Treatment adoption "staircase" plot for staggered designs
- Interactive plots with plotly backend option

---

## Medium-Term Enhancements
20 changes: 13 additions & 7 deletions TODO.md
@@ -54,11 +54,17 @@ Deferred items from PR reviews that were not addressed before merge.
|-------|----------|----|----------|
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
| CallawaySantAnna survey + covariates + IPW/DR: DRDID panel nuisance-estimation IF corrections not implemented. Currently gated with NotImplementedError. Regression method with covariates works. | `staggered.py` | #233 | Medium — tracked in Survey Phase 7a |
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. Needs design decision for `anticipation>0`. | `efficient_did.py` | #230 | Low |
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. | `prep_dgp.py`, `power.py` | #208 | Low |
| Survey design resolution/collapse patterns inconsistent across panel estimators — extract shared helpers for panel-to-unit collapse, post-filter re-resolution, metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` | — | Low |
| Replicate-weight survey df — **Resolved**. `df_survey = rank(replicate_weights) - 1` matching R's `survey::degf()`. For IF paths, `n_valid - 1` when dropped replicates reduce effective count. | `survey.py` | #238 | Resolved |
| CallawaySantAnna survey: strata/PSU/FPC — **Resolved**. Aggregated SEs (overall, event study, group) use `compute_survey_if_variance()`. Bootstrap uses PSU-level multiplier weights. | `staggered.py` | #237 | Resolved |
| CallawaySantAnna survey + covariates + IPW/DR — **Resolved**. DRDID panel nuisance IF corrections (PS + OR) implemented for both survey and non-survey DR paths (Phase 7a). IPW path unblocked. | `staggered.py` | #233 | Resolved |
| SyntheticDiD/TROP survey: strata/PSU/FPC — **Resolved**. Rao-Wu rescaled bootstrap implemented for both. TROP uses cross-classified pseudo-strata. Rust TROP remains pweight-only (Python fallback for full design). | `synthetic_did.py`, `trop.py` | — | Resolved |
| EfficientDiD hausman_pretest() clustered covariance stale `n_cl` — **Resolved**. Recompute `n_cl` and remap indices after `row_finite` filtering via `np.unique(return_inverse=True)`. | `efficient_did.py` | #230 | Resolved |
| EfficientDiD `control_group="last_cohort"` trims at `last_g - anticipation` but REGISTRY says `t >= last_g`. With `anticipation=0` (default) these are identical. With `anticipation>0`, code is arguably more conservative (excludes anticipation-contaminated periods). Either align REGISTRY with code or change code to `t < last_g` — needs design decision. | `efficient_did.py` | #230 | Low |
| TripleDifference power: `generate_ddd_data` is a fixed 2×2×2 cross-sectional DGP — no multi-period or unbalanced-group support. Add a `generate_ddd_panel_data` for panel DDD power analysis. | `prep_dgp.py`, `power.py` | #208 | Low |
| ContinuousDiD event-study aggregation anticipation filter — **Resolved**. `_aggregate_event_study()` now filters `e < -anticipation` when `anticipation > 0`, matching CallawaySantAnna behavior. Bootstrap paths also filtered. | `continuous_did.py` | #226 | Resolved |
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| Survey metadata formatting dedup — **Resolved**. Extracted `_format_survey_block()` helper in `results.py`, replaced 13 occurrences across 11 files. | `results.py` + 10 results files | — | Resolved |
| TROP: `fit()` and `_fit_global()` share ~150 lines of near-identical data setup (panel pivoting, absorbing-state validation, first-treatment detection, effective rank, NaN warnings). Both bootstrap methods also duplicate the stratified resampling loop. Extract shared helpers to eliminate cross-file sync risk. | `trop.py`, `trop_global.py`, `trop_local.py` | — | Low |
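
Two of the resolved rows above name small numerical idioms: `df_survey = rank(replicate_weights) - 1` and re-indexing clusters with `np.unique(return_inverse=True)` after a row filter. The sketch below shows both with plain NumPy. The helper names `replicate_df` and `remap_clusters` are hypothetical and the inputs are toy data; this is not the code in `survey.py` or `efficient_did.py`.

```python
import numpy as np


def replicate_df(replicate_weights):
    """Design df: rank of the (n_obs, n_replicates) weight matrix minus one."""
    weights = np.asarray(replicate_weights, dtype=float)
    return int(np.linalg.matrix_rank(weights)) - 1


def remap_clusters(cluster_ids, row_finite):
    """Drop non-finite rows, then recompute the cluster count and 0-based codes."""
    kept = np.asarray(cluster_ids)[np.asarray(row_finite, dtype=bool)]
    _, codes = np.unique(kept, return_inverse=True)
    n_cl = int(codes.max()) + 1 if codes.size else 0
    return n_cl, codes


# JK1-style toy weights: each replicate zeroes one unit and rescales the rest.
weights = (np.ones((4, 4)) - np.eye(4)) * 4 / 3
df_survey = replicate_df(weights)

# Cluster 20 loses its only row to the filter, so the codes must be rebuilt.
n_cl, codes = remap_clusters([10, 10, 20, 30, 30, 40],
                             [True, True, False, True, True, True])
```

Recomputing the codes after filtering keeps them contiguous (`0..n_cl-1`), which is the property a clustered covariance sum relies on when it indexes by cluster code.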

#### Performance

@@ -161,8 +167,8 @@ Features in R's `did` package that block porting additional tests:

| Feature | R tests blocked | Priority | Status |
|---------|----------------|----------|--------|
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | Planned Survey Phase 7b |
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | Mostly resolved (Phases 1-6); CS IPW/DR + covariates + survey in Phase 7a |
| Repeated cross-sections (`panel=FALSE`) | ~7 tests in test-att_gt.R + test-user_bug_fixes.R | High | **Resolved** — Phase 7b: `panel=False` on CallawaySantAnna |
| Sampling/population weights | 7 tests incl. all JEL replication | Medium | **Resolved** (Phases 1-6 + 7a: CS IPW/DR + covariates + survey) |
| Calendar time aggregation | 1 test in test-att_gt.R | Low | |

---