@joto joto commented Jan 29, 2026

The number of tiles to be expired can be quite large if the input geometries are large or if there are many geometries. Numbers of tiles in the billions can crash osm2pgsql because it runs out of memory. Such large numbers can also overwhelm any kind of re-rendering mechanism run after osm2pgsql to bring tiles up to date. In day-to-day processing this should not happen, but it can happen due to vandalism or misconfiguration.

To protect against this problem, this change introduces limits on the number of tiles that can be affected by a single geometry and the overall number of tiles that an expire output will generate for each run of osm2pgsql.

  • If a single geometry would result in the expiry of more than max_tiles_geometry tiles, this geometry will be ignored for the purposes of expiry. Note that the geometry will still be written to the database, but no tiles will be added to the expire output.
  • If the number of tiles generated during a single run of osm2pgsql for an expire output grows beyond max_tiles_overall, no further tiles will be written to this output.
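
The two limits can be sketched roughly as follows. This is only an illustration of the behaviour described above, not the code in this change: expire_output_sketch and add_geometry_tiles are hypothetical names, tiles are reduced to plain integer ids, and stopping exactly when the list holds max_tiles_overall entries is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one expire output (not the real expire_output_t).
struct expire_output_sketch {
    std::size_t max_tiles_geometry;
    std::size_t max_tiles_overall;
    std::unordered_set<std::uint64_t> tiles; // unique tile ids for this run
    bool overall_limit_reached = false;

    expire_output_sketch(std::size_t per_geometry, std::size_t overall)
    : max_tiles_geometry(per_geometry), max_tiles_overall(overall) {
        // Sketch-only sanity check, not a documented constraint.
        assert(per_geometry > 0 && overall > 0);
    }

    // Adds the tiles of one geometry; returns how many new tiles were stored.
    std::size_t add_geometry_tiles(std::vector<std::uint64_t> const &geometry_tiles) {
        // Per-geometry limit: skip expiry for this geometry entirely.
        if (geometry_tiles.size() > max_tiles_geometry) {
            return 0;
        }
        std::size_t added = 0;
        for (auto const id : geometry_tiles) {
            // Overall limit (assumed behaviour): stop storing tiles once full.
            if (tiles.size() >= max_tiles_overall) {
                overall_limit_reached = true;
                break;
            }
            if (tiles.insert(id).second) {
                ++added;
            }
        }
        return added;
    }
};
```

In this sketch a geometry that is over the per-geometry limit leaves the output untouched, while the overall limit only stops further tiles from being stored; tiles already in the list stay there.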

Limits are per expire output, of which you can have several. The limits can be set in the flex expire output configuration, but sensible defaults are provided. For the (legacy) expire output configured on the command line with the -e and -o options, the settings cannot be changed; you will always get the default values.

To choose the default values for these settings I looked at real-world values as follows:

  • Russia has one of the largest boundaries on the planet. Boundary-only expiry affects 94,144 tiles on zoom level 14, 190,168 tiles on z15, and 383,465 tiles on z16. For typical raster tiles using 8x8 meta tiles, expiry on z16 is equivalent to showing z19 tiles. So 500,000 tiles seems to be a useful limit for max_tiles_geometry when only boundaries are expired.
  • For expiring the area, I looked at the Greenland ice sheet, which needs more than 8 million tiles on z14. At least for vector tiles this is good enough; raster tiles might need more, though.
  • For max_tiles_overall: Paul Norman analyzed the number of tiles expired by typical minutely updates in https://www.openstreetmap.org/user/pnorman/diary/403266. For zoom level 14 the most he got was 119,801 tiles. The same analysis also shows that for longer time frames (2 minutes and 5 minutes were checked, but the same should be true for larger intervals) the number of tiles doesn't go up, because these huge numbers only happen very rarely.
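
The z16/z19 equivalence for 8x8 meta tiles is a plain zoom shift: one tile on zoom z covers the same area as an n x n block of tiles on zoom z + log2(n). A small sketch of that arithmetic, with hypothetical helper names that are not part of osm2pgsql:

```cpp
#include <cassert>

// Number of zoom levels spanned by a square metatile of size x size tiles.
// Sketch only: assumes the size is a power of two (1, 2, 4, 8, ...).
inline unsigned metatile_zoom_shift(unsigned size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned shift = 0;
    while (size > 1) {
        size >>= 1;
        ++shift;
    }
    return shift;
}

// Zoom level of the tiles contained in a metatile addressed at expire_zoom.
inline unsigned displayed_zoom(unsigned expire_zoom, unsigned metatile_size) {
    return expire_zoom + metatile_zoom_shift(metatile_size);
}
```

With these helpers, displayed_zoom(16, 8) gives 19, which matches the statement in the first bullet.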

Rounding these numbers and adding a safety factor, default values of 10,000,000 for max_tiles_geometry and 50,000,000 for max_tiles_overall seem reasonable. Memory use in osm2pgsql is about 32 bytes per tile, so an expire output that reaches the overall limit will need at most about 1.6 GB, which should be no problem at all.
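
The memory estimate is simple multiplication. The sketch below spells it out; expire_list_bytes is a hypothetical helper, and the 32 bytes per tile is the approximate figure quoted above, not a measured constant.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Approximate per-tile memory use quoted in the description above.
constexpr std::uint64_t bytes_per_tile = 32;

// Rough upper bound on memory for one expire output holding `tiles` entries.
inline std::uint64_t expire_list_bytes(std::uint64_t tiles) {
    // Guard against overflow of the multiplication below.
    assert(tiles <= std::numeric_limits<std::uint64_t>::max() / bytes_per_tile);
    return tiles * bytes_per_tile;
}

// 50,000,000 tiles * 32 bytes = 1,600,000,000 bytes = 1.6 GB (decimal).
static_assert(50000000ULL * bytes_per_tile == 1600000000ULL,
              "overall default corresponds to 1.6 GB");
```

Because limits are per expire output, a configuration with several outputs that all reach the overall limit would need this amount once per output.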

The numbers are chosen so that they will practically never be triggered, which means users upgrading from existing versions of osm2pgsql will not be suddenly affected. It is recommended that users tune their settings according to their own needs. Once we have some more operational experience with this, we can adjust the defaults.

I considered using different default max values for different zoom levels, but this would make configuration more complicated.

Change file processing in osm2pgsql runs in parallel threads. The old code stored the to-be-expired tiles in one list per thread and merged them later. This has two problems:

a) Because the lists might contain some of the same tiles, all lists together can use much more memory than a single list would take.
b) We cannot easily check the number of tiles in those lists against the configured maximum.
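
Problem a) can be illustrated with a toy example. The code below is not taken from osm2pgsql; tile_set and both functions are hypothetical, and tiles are again plain integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

using tile_set = std::unordered_set<std::uint64_t>;

// Entries held in memory while the per-thread lists exist side by side.
inline std::size_t entries_in_per_thread_lists(std::vector<tile_set> const &lists) {
    std::size_t total = 0;
    for (auto const &list : lists) {
        total += list.size();
    }
    return total;
}

// Entries that remain once all lists are merged into a single one.
inline std::size_t entries_after_merge(std::vector<tile_set> const &lists) {
    tile_set merged;
    for (auto const &list : lists) {
        merged.insert(list.begin(), list.end());
    }
    return merged.size();
}

// Three threads that each saw partly the same tiles.
inline void overlap_example() {
    std::vector<tile_set> const lists{{1, 2, 3}, {2, 3, 4}, {3, 4, 5}};
    assert(entries_in_per_thread_lists(lists) == 9); // held before merging
    assert(entries_after_merge(lists) == 5);         // unique tiles only
}
```

The per-thread total is also the only number that is cheap to obtain before merging, and it overstates the real tile count, which is why it cannot easily be compared against a configured maximum.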

So this commit changes the way the list is kept: We only keep a single list in the expire_output_t and use a mutex to control access to this list. (There might still be overlapping lists if you have more than one expire output, but that's by design.)

Objects of the expire_tiles_t class now only keep a temporary list for each geometry added. Once all tiles affected by a single geometry are identified, this list is added to the overall list in expire_output_t and the temporary list is cleared.
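
A minimal sketch of this design, assuming plain integer tile ids. The classes shared_expire_list and geometry_expirer are hypothetical stand-ins for expire_output_t and expire_tiles_t; the real classes have different interfaces, and limit checking is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_set>
#include <vector>

// Stand-in for expire_output_t: one shared tile list guarded by a mutex.
class shared_expire_list {
public:
    // Merge the tiles of one geometry while holding the lock.
    void merge(std::vector<std::uint64_t> const &tiles) {
        std::lock_guard<std::mutex> const guard{m_mutex};
        m_tiles.insert(tiles.begin(), tiles.end());
    }

    std::size_t size() {
        std::lock_guard<std::mutex> const guard{m_mutex};
        return m_tiles.size();
    }

private:
    std::mutex m_mutex;
    std::unordered_set<std::uint64_t> m_tiles;
};

// Stand-in for expire_tiles_t: one per thread, holds tiles of one geometry.
class geometry_expirer {
public:
    explicit geometry_expirer(shared_expire_list *output) : m_output(output) {
        assert(m_output != nullptr);
    }

    void add_tile(std::uint64_t id) { m_temp.push_back(id); }

    // Called once all tiles of the current geometry are known.
    void commit_geometry() {
        m_output->merge(m_temp);
        m_temp.clear();
    }

    std::size_t pending() const { return m_temp.size(); }

private:
    shared_expire_list *m_output;
    std::vector<std::uint64_t> m_temp;
};

// Two worker threads feeding overlapping tiles into the same output.
// Ids 0..99 and 50..149 overlap, so the shared list ends with 150 entries.
inline std::size_t run_two_workers() {
    shared_expire_list output;
    auto work = [&output](std::uint64_t first) {
        geometry_expirer expirer{&output};
        for (std::uint64_t id = first; id < first + 100; ++id) {
            expirer.add_tile(id);
        }
        expirer.commit_geometry();
    };
    std::thread a{work, std::uint64_t{0}};
    std::thread b{work, std::uint64_t{50}};
    a.join();
    b.join();
    return output.size();
}
```

The lock is taken once per geometry rather than once per tile, so threads only contend when they hand over a finished geometry. The shared list also always reflects the current tile count, which is what a check against a configured maximum needs.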

Fixes #2190

@joto force-pushed the limit-expire-tiles branch from 0552bda to af89bc8 on January 29, 2026 22:13

Development

Successfully merging this pull request may close these issues.

osm2pgsql using too much memory for expire lists
