This is an R-based research project analyzing COVID-19 forecast accuracy across European models. The study examines how model structure and geographic specificity influence forecast performance using data from the European COVID-19 Forecast Hub.
How do model structure (mechanistic vs statistical) and geographic specificity (single-location vs multi-location models) affect forecast accuracy after adjusting for predictive difficulty?
See report/Research-narrative.md for additional project context.
- process-score.R: Computes forecast scores using the scoringutils package
  - Scores forecasts on both natural and log scales
  - Calculates weighted interval scores (WIS)
  - Outputs: data/scores-raw-{case|death}.csv
- process-data.R: Data preparation and integration
  - Combines scores with explanatory variables (model classification, variant phases, country targets)
  - Calls utility functions for metadata, variants, and location data
- analysis-model.R: Main statistical analysis using Generalized Additive Mixed Models (GAMM)
  - Models WIS adjusting for: trend, location, time, horizon, model-specific effects
  - Isolates impact of Method (model structure) and CountryTargets (geographic specificity)
  - Uses mgcv, gammit, and gratia packages
  - Outputs: output/results.rds
- analysis-descriptive.R: Descriptive statistics and summary tables
  - Bootstrap confidence intervals
  - Score distributions by model characteristics
- plot-model-results.R: Visualization of GAMM model effects
  - Adjusted vs unadjusted effects by model
  - Supports anonymized output for peer review
- plot-model-flow.R: Workflow and flowchart visualizations
- utils-data.R: Functions for accessing forecasts, observations, and population data
- utils-metadata.R: Model names, submissions, and metadata classification helpers
- utils-variants.R: COVID-19 variant phase classification
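
For readers unfamiliar with the weighted interval score (WIS) that process-score.R computes, the following is a minimal base-R sketch of the standard definition for a forecast given as a median plus central prediction intervals. It is not the scoringutils implementation and not the project's code: the function names `interval_score` and `wis`, the example forecast values, and the use of a log(x + 1) transform for the log-scale score are all assumptions made for illustration.

```r
# Interval score for central (1 - alpha) prediction intervals:
# interval width plus penalties when the observation falls outside.
interval_score <- function(y, lower, upper, alpha) {
  (upper - lower) +
    (2 / alpha) * pmax(lower - y, 0) +
    (2 / alpha) * pmax(y - upper, 0)
}

# Weighted interval score from a median and K central intervals.
# Weights are 1/2 for the median term and alpha_k / 2 for interval k.
wis <- function(y, median, lower, upper, alpha) {
  stopifnot(length(lower) == length(upper), length(lower) == length(alpha))
  k <- length(alpha)
  weighted <- (alpha / 2) * interval_score(y, lower, upper, alpha)
  (0.5 * abs(y - median) + sum(weighted)) / (k + 0.5)
}

# Hypothetical forecast: median 100 with 50% and 90% central intervals.
alpha <- c(0.5, 0.1)
lower <- c(90, 70)
upper <- c(115, 140)
observed <- 120

# Score on the natural scale.
wis_natural <- wis(observed, 100, lower, upper, alpha)

# Score on the log scale (assumed here to be a log(x + 1) transform of
# both the forecast quantiles and the observation).
wis_log <- wis(log1p(observed), log1p(100), log1p(lower), log1p(upper), alpha)
```

Lower values indicate better forecasts on either scale. The log-scale score measures relative rather than absolute error, so it is less dominated by locations and periods with high incidence.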
- covid19-forecast-hub-europe.parquet: Raw forecast submissions
- observed-{case|death}.csv: Observed incidence data
- model-classification.csv: Model categorization by structure and specificity
- populations.csv: Population data by location
- scores-raw-{case|death}.csv: Computed forecast scores (generated)
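
As a rough illustration of how the generated scores might be combined with the model classification, the sketch below joins two small toy data frames in base R. The column names (`model`, `location`, `horizon`, `wis`, `Method`, `CountryTargets`) and all values are hypothetical stand-ins; the real files and the join logic in process-data.R may differ.

```r
# Hypothetical scores, shaped loosely like data/scores-raw-{case|death}.csv
scores <- data.frame(
  model = c("model_a", "model_a", "model_b", "model_c"),
  location = c("DE", "FR", "DE", "IT"),
  horizon = c(1, 2, 1, 1),
  wis = c(12.5, 30.1, 9.8, 22.4)
)

# Hypothetical classification, shaped loosely like model-classification.csv
classification <- data.frame(
  model = c("model_a", "model_b", "model_c"),
  Method = c("Mechanistic", "Statistical", "Statistical"),
  CountryTargets = c("Multi-country", "Single-country", "Multi-country")
)

# Left join: keep every score row, even if a model has no classification
scored <- merge(scores, classification, by = "model", all.x = TRUE)

# Unadjusted mean WIS by model structure
mean_wis <- aggregate(wis ~ Method, data = scored, FUN = mean)
print(mean_wis)
```

A left join is used so that unclassified models surface as rows with missing `Method` rather than silently disappearing from the analysis.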
- report/Revision_manuscript.md: full manuscript text (title, abstract, background, methods, results, discussion). Edit this file for any writing changes.
- report/Research-narrative.md: overall narrative of the research, plus a paragraph-by-paragraph one-line summary of the manuscript text
- submission/reviewer-response-analysis.md: tracks reviewer suggestions and planned responses; X marks completion. Consult when making revision-related changes.
- report/results.qmd: active Quarto document; sources R scripts and renders figures/tables for the results section
- report/supplement/Supplement.Rmd: supplementary materials; sources the same R scripts
- report/results.Rmd: legacy RMarkdown copy of results (inactive; use .qmd)
- Pre-print: medRxiv 10.1101/2025.04.10.25325611
Note: manuscript prose and rendered analysis are separate. Revision_manuscript.md is not auto-generated, so changes to analysis code and changes to manuscript text must be coordinated manually.
# Install renv if needed
install.packages("renv")
# Restore package environment
renv::restore()

# Load here so that here() resolves paths from the project root
library(here)
# 1. Score forecasts on natural and log scales
source(here("R", "process-score.R"))
# 2. Prepare and integrate data
source(here("R", "process-data.R"))
# 3. Fit GAMM to weighted interval scores
source(here("R", "analysis-model.R"))
# 4. Generate reports
# Render report/results.qmd
# Knit report/supplement/Supplement.Rmd

| Task | Where to edit |
|---|---|
| Change manuscript prose (wording, framing, conclusions) | report/Revision_manuscript.md, plus the matching summary in report/Research-narrative.md |
| Change analysis, model, or figures | Relevant R/ script; outputs flow into results.qmd automatically |
| Respond to a reviewer comment | Check report/Revision_reviews-response.md, update the R/ script if needed, update report/Revision_manuscript.md, mark the comment as completed in report/Revision_reviews-response.md, and close the relevant GitHub issue with a note |
| Add or change a supplementary figure | Relevant R/ script + report/supplement/Supplement.Rmd |
| All changes | Update Plan.md |
Major R packages:
- mgcv - Generalized Additive Models
- gammit - GAMM utilities
- gratia - GAM plotting
- scoringutils - Forecast scoring
- arrow - Parquet file handling
- tidyverse ecosystem (dplyr, tidyr, ggplot2, readr, purrr)
- here - Path management
- lubridate - Date handling
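
To give a sense of how mgcv can fit the kind of model described for analysis-model.R, here is a small sketch on simulated data. It is not the project's model: the toy data, the variable names, the log-transformed response, and the specific smooth and random-effect terms are assumptions chosen for illustration. The actual formula lives in R/analysis-model.R.

```r
library(mgcv)

set.seed(1)

# Hypothetical toy data; names and values are illustrative only
n <- 400
model_ids <- paste0("model_", 1:8)
toy <- data.frame(
  Model = factor(sample(model_ids, n, replace = TRUE)),
  Horizon = sample(1:4, n, replace = TRUE),
  Location = factor(sample(c("DE", "FR", "IT", "PL"), n, replace = TRUE))
)

# Method is a model-level attribute: four models of each type (assumed)
method_lookup <- setNames(
  rep(c("Mechanistic", "Statistical"), each = 4),
  model_ids
)
toy$Method <- factor(method_lookup[as.character(toy$Model)])

# Simulated positive scores that grow with forecast horizon
toy$wis <- rgamma(n, shape = 2, rate = 2 / (1 + 0.3 * toy$Horizon))
toy$log_wis <- log(toy$wis)

# Fixed effect of Method, a smooth of horizon, and random intercepts
# for location and model (bs = "re" requests a random-effect basis)
fit <- gam(
  log_wis ~ Method + s(Horizon, k = 3) +
    s(Location, bs = "re") + s(Model, bs = "re"),
  data = toy,
  method = "REML"
)

summary(fit)
```

The `Method` coefficient here plays the role of an adjusted effect: it is estimated after horizon, location, and model-specific variation have been accounted for, which is the general idea behind comparing adjusted and unadjusted effects.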
- DOI: 10.5281/zenodo.14903161
- Pre-print: 10.1101/2025.04.10.25325611
- Slides: Google Slides