pesout/offline-weather-forecast


Offline Weather Forecast

Weather forecast models (precipitation, wind) based on input data that can be estimated or measured by anyone with a smartphone – even without special equipment and internet connection.

Deployed version – weather.pesout.net

Implementation summary

I trained weather forecast models on a long-term dataset, keeping only inputs that can be collected without specialized instruments. The resulting Python ML models (XGBoost, i.e. gradient-boosted trees) were tested and exported to JavaScript with m2cgen so they can run in any browser.

Basic usage (quick start)

The fastest way to get started is the prepared HTML page, which already imports the pre-trained models.

1. Start a local web server from the project root:

python3 -m http.server 8000

2. Open http://localhost:8000 in your browser.

Advanced usage

Model training (and JS export)

Requirements:

  • Python 3.8 or higher
  • Weather dataset (CSV format)

The original dataset is published in a separate GitHub repository; see weather-dataset.

1. Install dependencies:

pip install -e .

2. Train and export models:

python3 train_exportable.py --input weather.csv

See train_exportable.py for all possible advanced options.

3. Check exported models:

  • ./public/precipitation_model.js - precipitation occurrence classifier
  • ./public/precipitation_amt_model.js - precipitation amount regressor
  • ./public/wind_model.js - wind speed regressor

Example usage in JavaScript

import { predictWeather } from './predict.js';

const forecast = predictWeather({
   latitude: 50.0755,
   longitude: 14.4378,
   altitude: 200,
   airPressure: 1013.25,
   temperature: 15,
   cloudCover: 0.5,
   windDirection: 'NW',
   windCategory: 'light',
   hour: 14,
   dayOfYear: 180
});

console.log(forecast);
// { precipitationProb: 0.23, precipitationAmount: 0.45, windSpeed: 3.2 }

Model input and output

Input requirements:

  • Basic location data (latitude, longitude, altitude)
  • Date and time
  • Simple observations: temperature, air pressure, cloud cover, current wind (speed and direction)

Since exact wind speed is hard to measure without any tools, I use categories based on the Beaufort scale.
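As an illustration of how a wind speed can be reduced to a coarse category, here is a minimal sketch. The Beaufort thresholds are the commonly quoted approximate values in m/s. The function names and the grouping into "calm", "light", "moderate" and "strong" are assumptions made for this example; only 'light' appears in this README, so the project's real categories may differ.

```python
from bisect import bisect_right

# Approximate lower bounds (m/s) of Beaufort forces 1 to 12.
BEAUFORT_LOWER_BOUNDS = [0.5, 1.6, 3.4, 5.5, 8.0, 10.8,
                         13.9, 17.2, 20.8, 24.5, 28.5, 32.7]


def beaufort_force(speed_ms: float) -> int:
    """Return the Beaufort force (0 to 12) for a wind speed in m/s."""
    return bisect_right(BEAUFORT_LOWER_BOUNDS, speed_ms)


def wind_category(speed_ms: float) -> str:
    """Group Beaufort forces into coarse labels (hypothetical grouping)."""
    force = beaufort_force(speed_ms)
    if force == 0:
        return "calm"
    if force <= 3:
        return "light"
    if force <= 5:
        return "moderate"
    return "strong"


print(beaufort_force(3.2), wind_category(3.2))
```

A coarse label like this is something an observer can estimate from visible effects such as moving leaves or branches, which is why it stands in for an exact reading.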

Output (6-hour forecast):

  • Precipitation occurrence - probability of rain/snow in the next 6 hours
  • Precipitation amount - expected rainfall in millimeters (if precipitation occurs)
  • Wind speed - wind speed in meters per second

Dataset

At the end of 2021, I set up a cron job that downloads the current weather and forecasts from Locationforecast by MET Norway. It saves data for five locations (CZ, SK) twice a day (7 AM and 2 PM). As of December 2025, the dataset holds about 14,600 records.
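The record count follows from the collection schedule. A quick check, assuming roughly four full years of uninterrupted collection (the exact start date and any gaps are not stated here):

```python
locations = 5          # five locations in CZ and SK
downloads_per_day = 2  # 7 AM and 2 PM
days = 4 * 365         # assumption: about four years, leap day ignored

expected_records = locations * downloads_per_day * days
print(expected_records)  # 14600
```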

The original dataset had 27 variables (see weather.example.csv), but I excluded or modified most of them. Besides variables that are unnecessary for training the ML model, I had to drop inputs that cannot be obtained without equipment (e.g., air humidity). I also excluded the data on changes after 1 hour, because I wanted predictions to rely on immediate observations only.

Variables/columns in the final dataset:

  • date - Date string (e.g., "2025-01-03")
  • time - Time string (e.g., "14:00:00")
  • latitude - Latitude in decimal degrees
  • longitude - Longitude in decimal degrees
  • altitude - Altitude in meters above sea level
  • air_pressure - Air pressure in hPa
  • air_temperature - Temperature in Celsius
  • cloud_area_fraction - Cloud cover in percentage (0-100)
  • wind_from_direction - Wind direction in degrees (0-360)
  • wind_speed - Current wind speed in m/s (converted to a category before training)
  • precipitation_amount_next_6h - Precipitation amount in next 6h (mm) - target variable
  • wind_speed_next_6h - Wind speed in next 6h (m/s) - target variable
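The JavaScript example earlier passes hour, dayOfYear and a compass direction such as 'NW', while the dataset stores date and time strings and a direction in degrees. The sketch below shows one plausible way to derive those values. The function names and the 8-point compass are assumptions for illustration, not taken from train_exportable.py.

```python
from datetime import datetime

COMPASS_POINTS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def hour_and_day_of_year(date_str: str, time_str: str) -> tuple:
    """Parse "YYYY-MM-DD" and "HH:MM:SS" into (hour, day of year)."""
    moment = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    return moment.hour, moment.timetuple().tm_yday


def compass_point(degrees: float) -> str:
    """Map a direction in degrees to the nearest of 8 compass points."""
    sector = int((degrees % 360) / 45 + 0.5) % 8
    return COMPASS_POINTS[sector]


print(hour_and_day_of_year("2025-01-03", "14:00:00"))
print(compass_point(315))
```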

Metrics and performance

Model performance should be judged with the project's constraints in mind: many variables that are otherwise important for weather forecasting had to be excluded. XGBoost is also not the most suitable algorithm for this problem; it was chosen because its models can be exported to JavaScript. Finally, the nature of the project ruled out processing the data as a time series.

The reference model was trained with n_estimators=250, and performance was evaluated on a 20% holdout taken from the most recent data.
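A holdout taken from the most recent data means the split is chronological rather than random. A minimal sketch of such a split, assuming rows are dictionaries with the date and time columns described above (the function name is hypothetical and this is not the code from train_exportable.py):

```python
def chronological_holdout(rows, holdout_fraction=0.2):
    """Split rows into (train, test), with the newest rows in test."""
    # ISO date and time strings sort correctly as plain text.
    ordered = sorted(rows, key=lambda r: (r["date"], r["time"]))
    n_test = round(len(ordered) * holdout_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Toy rows, deliberately out of order.
rows = [{"date": f"2025-01-{d:02d}", "time": "07:00:00"}
        for d in range(10, 0, -1)]
train, test = chronological_holdout(rows)
print(len(train), len(test))
```

Splitting by time keeps the evaluation closer to real use, where the model always predicts for dates later than anything it was trained on.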

Precipitation Occurrence

  • ROC-AUC: 0.7959 - Good discrimination between rain/no-rain
  • PR-AUC: 0.7502 - Decent precision-recall balance for imbalanced precipitation events
  • Brier Score: 0.1844 - Low calibration error

A ROC-AUC of about 0.80 means that, given one rainy and one dry case picked at random, the model assigns the higher probability to the rainy case about 80% of the time.
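For readers unfamiliar with these metrics, the sketch below computes the Brier score and ROC-AUC from their definitions in plain Python. The toy numbers are made up for illustration and are unrelated to the reported results; the project itself may well use a library implementation instead.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def roc_auc(probs, outcomes):
    """Share of (rainy, dry) pairs where the rainy case scores higher.

    Ties count as half a win. Quadratic in the number of cases, so this
    is only suitable for small examples.
    """
    rainy = [p for p, y in zip(probs, outcomes) if y == 1]
    dry = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if r > d else 0.5 if r == d else 0.0
               for r in rainy for d in dry)
    return wins / (len(rainy) * len(dry))


# Toy data: predicted rain probabilities and what actually happened.
toy_probs = [0.9, 0.2, 0.6, 0.4]
toy_outcomes = [1, 0, 0, 1]
print(brier_score(toy_probs, toy_outcomes), roc_auc(toy_probs, toy_outcomes))
```

For reference, always predicting a probability of 0.5 gives a Brier score of 0.25, so lower values indicate more useful probabilities.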

Precipitation Amount (mm, 6-hour forecast)

  • MAE (conditional on precipitation): 1.1370 mm - Mean absolute error for cases when rain actually occurs
  • MAE (general): 0.6443 mm - Mean absolute error when combining occurrence probability × amount

Baseline comparison:

  • Always predicting zero: 0.6329 mm MAE
  • Always predicting mean precipitation: 1.3750 mm MAE

The model slightly underperforms the naive "always dry" baseline (0.6443 mm vs. 0.6329 mm) but clearly beats predicting the average rainfall (1.3750 mm). This is expected when precipitation is rare: predicting "no rain" is then correct most of the time, which makes its absolute error hard to beat.
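The general MAE combines the two precipitation models by multiplying the occurrence probability by the predicted amount. The sketch below shows that combination and both baselines on made-up, mostly dry toy data. The names and numbers are illustrative assumptions; the mean baseline here uses the mean of the toy targets, which may not match how the reported baseline was computed.

```python
def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy 6-hour totals in mm: seven dry windows and one wet window.
actual = [0.0] * 7 + [1.0]
rain_probs = [0.1] * 7 + [0.6]       # classifier output (made up)
amounts_if_rain = [1.0] * 8          # regressor output (made up)

# Expected amount = probability of rain x amount if it rains.
combined = [p * a for p, a in zip(rain_probs, amounts_if_rain)]

mean_rain = sum(actual) / len(actual)
mae_model = mean_absolute_error(combined, actual)
mae_zero = mean_absolute_error([0.0] * len(actual), actual)
mae_mean = mean_absolute_error([mean_rain] * len(actual), actual)
print(mae_zero, mae_model, mae_mean)
```

On this toy data the ordering is the same as in the reported results: always predicting zero is slightly ahead, the combined model comes second, and always predicting the mean is last. The reason is that MAE is minimized by the median, and the median is zero whenever more than half of the windows are dry.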

Wind Speed (m/s, 6-hour forecast)

  • MAE: 0.8128 m/s
  • RMSE: 1.0281 m/s

Baseline comparison:

  • Always predicting the same value as input: 1.2124 m/s MAE

The model's mean absolute error is about 0.81 m/s, roughly 33% lower than the persistence baseline (1.2124 m/s), so it adds real skill over simply assuming the wind stays the same.
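The 33% figure is the relative reduction in MAE against the persistence baseline, which can be checked directly from the two reported numbers (the function name is just for illustration):

```python
def relative_improvement(model_mae: float, baseline_mae: float) -> float:
    """Fraction by which the model's MAE is lower than the baseline's."""
    return 1 - model_mae / baseline_mae


wind_gain = relative_improvement(model_mae=0.8128, baseline_mae=1.2124)
print(f"{wind_gain:.1%}")  # 33.0%
```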

Limitations

The training data comes from only five locations, which I picked more or less at random. Prediction accuracy can therefore be expected to drop in locations far from the coordinates in the dataset.

Furthermore, the target values are not observed weather conditions but the meteorological institute's forecast for the next 6 hours. Whether this actually hurts accuracy is an open question; I assume this approach filters unexpected situations out of the data.
