This project presents a probabilistic forecasting framework for daily maximum temperature in Los Angeles, aimed at supporting agricultural decision-making and market-based weather risk management. Several model families are trained and evaluated, and the best-performing configuration, which uses LightGBM regressors for both mean and quantile estimation, is selected. The final model builds a Gaussian predictive distribution from these estimates and uses it to estimate the likelihood of extreme temperature events. Forecasts are served through a RESTful API and integrated into a web application that lets users explore probability distributions, assess threshold exceedance risks, and compare model estimates with market-implied probabilities. The tool also implements a basic decision rule inspired by prediction markets.
The model uses daily atmospheric variables from the Meteostat LAX station (tavg, tmin, tmax, prcp, wdir, wspd, pres). The target is tmax one step (one day) ahead, and lagged values of tmax (1–7 days) capture autoregressive dynamics. The feature set is further enriched with soil temperature and soil moisture (0–7 cm and 7–28 cm), sea surface temperature (SST) anomalies, and an interaction term between shallow soil temperature and atmospheric pressure, to model land–atmosphere and ocean–atmosphere coupling effects.
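The lag and interaction features described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the function name `add_features`, the column name `tsoil_0_7` for shallow soil temperature, the output column names, and the choice of a simple product for the interaction term are all assumptions.

```python
import pandas as pd


def add_features(df: pd.DataFrame, n_lags: int = 7) -> pd.DataFrame:
    """Add lagged tmax, a soil-pressure interaction, and the next-day target.

    Assumes df is sorted by date with one row per day. The column name
    'tsoil_0_7' (shallow soil temperature) is hypothetical.
    """
    out = df.copy()
    # Lagged values of tmax (1..n_lags days back) for autoregressive structure.
    for k in range(1, n_lags + 1):
        out[f"tmax_lag{k}"] = out["tmax"].shift(k)
    # Interaction term, assumed here to be a simple product.
    out["tsoil_x_pres"] = out["tsoil_0_7"] * out["pres"]
    # One-step-ahead target: the next day's tmax.
    out["target"] = out["tmax"].shift(-1)
    # Drop rows left incomplete by the shifts (start and end of the series).
    return out.dropna().reset_index(drop=True)


# Tiny synthetic frame purely for illustration (not real station data).
demo = pd.DataFrame({
    "tmax": [20.0 + i for i in range(10)],
    "tsoil_0_7": [15.0] * 10,
    "pres": [1013.0] * 10,
})
features = add_features(demo)
```

Shifting drops the first `n_lags` rows and the last row, so a series needs more than `n_lags + 1` days before any training rows remain.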
- Prophet – Used as an interpretable statistical baseline to model trend and seasonality in daily maximum temperature
- LSTM – Applied to capture nonlinear temporal dependencies in multivariate weather sequences
- Deep Ensemble – Implemented to obtain probabilistic forecasts and quantify predictive uncertainty
- Quantile Regression Forests – Used to estimate upper-tail quantiles and construct prediction intervals without distributional assumptions
- XGBoost – Employed for robust nonlinear modeling on structured meteorological data
- LightGBM – Selected as the final core model for accurate mean prediction and 90th percentile estimation in extreme temperature events
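The text above says the final model combines a mean estimate and a 90th-percentile estimate into a Gaussian predictive distribution, but it does not state how the standard deviation is obtained. One plausible approach, shown here as an assumption rather than the project's confirmed method, is to pick the standard deviation so that the Gaussian's 90th percentile matches the quantile forecast. The function names and the example numbers are hypothetical.

```python
from statistics import NormalDist

# z-score of the 90th percentile of a standard normal (about 1.2816).
Z90 = NormalDist().inv_cdf(0.90)


def gaussian_from_mean_and_q90(mean: float, q90: float) -> NormalDist:
    """Build N(mean, sigma^2) with sigma chosen so its 90th percentile is q90.

    Assumption: sigma = (q90 - mean) / z_0.90. This is one way to combine
    the two forecasts, not necessarily the one used in this project.
    """
    if q90 <= mean:
        raise ValueError("q90 must exceed the mean to give a positive sigma")
    sigma = (q90 - mean) / Z90
    return NormalDist(mu=mean, sigma=sigma)


def prob_exceeds(dist: NormalDist, threshold: float) -> float:
    """P(tmax > threshold) under the predictive distribution."""
    return 1.0 - dist.cdf(threshold)


# Illustrative numbers only, not model output.
dist = gaussian_from_mean_and_q90(mean=30.0, q90=33.0)
p_hot = prob_exceeds(dist, threshold=33.0)
```

By construction, the exceedance probability at the 90th-percentile forecast itself is 0.10, which makes a convenient sanity check before querying other thresholds.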
Temperature-Prediction/
├── api/ # FastAPI backend
│ ├── main.py # API endpoints
│ └── model.py # Model loading and prediction
├── web/ # Frontend application
│ └── web_app.py # Streamlit web interface
├── data/ # Data files
│ └── LA_dataset.csv # Training dataset
├── models/ # Trained models
│ └── model.lgb # LightGBM model
├── scripts/ # Utility scripts
│ ├── create_test_data.py # Test data generation
│ └── test_api.py # API testing
└── requirements.txt # Python dependencies
- Install dependencies:
  pip install -r requirements.txt
- Start the API server:
  python -m uvicorn api.main:app --reload
- Launch the web application:
  streamlit run web/web_app.py
- Select a date from the available range
- View the predicted maximum temperature and uncertainty
- Analyze probability thresholds
- Compare with market probabilities
- Explore historical trends
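The comparison with market probabilities can be illustrated with a simple edge-based rule. The project's actual decision rule is not described here, so the sketch below is hypothetical: the function name `decide`, the `margin` parameter, its default of 0.05, and the returned labels are all assumptions.

```python
def decide(p_model: float, p_market: float, margin: float = 0.05) -> str:
    """Compare the model probability of an event with the market-implied one.

    Hypothetical rule: act only when the two probabilities differ by more
    than `margin`; otherwise do nothing.
    """
    for name, p in (("p_model", p_model), ("p_market", p_market)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {p}")
    edge = p_model - p_market
    if edge > margin:
        return "buy YES"   # model thinks the event is underpriced
    if edge < -margin:
        return "buy NO"    # model thinks the event is overpriced
    return "no trade"      # difference is within the margin


# Illustrative values only.
signal = decide(p_model=0.30, p_market=0.20)
```

The margin acts as a buffer against model error and transaction costs; a wider margin yields fewer but more confident signals.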
- Python 3.8+
- FastAPI
- Streamlit
- LightGBM
- Pandas
- NumPy
- Plotly