feat: auto-scaling voxel grid mapper by spomichter · Pull Request #1271 · dimensionalOS/dimos

spomichter · 2026-02-16T08:18:34Z

Summary

Closes #1270 — Auto-scaling global voxel grid mapper.

What

Adds adaptive frame skipping to VoxelGridMapper so it self-regulates on weaker machines instead of falling behind on lidar ingest.

How

When autoscale=True (default), _on_frame() skips frames if they arrive faster than the last add_frame() processing duration. The system self-regulates: fast machines process every frame, slow machines find their natural ceiling.

autoscale_min_frequency (default 1.0 Hz) prevents the map from going completely stale — even if processing takes 2s, we'll still ingest at least 1 frame per second.

Config

@dataclass
class Config(ModuleConfig):
    autoscale: bool = True
    autoscale_min_frequency: float = 1.0  # never drop below this Hz

Telemetry (Rerun)

Logs both ingest and publish timings separately to identify the real bottleneck:

voxel_mapper/ingest_time_ms — add_frame() wall-clock cost
voxel_mapper/publish_time_ms — get_global_pointcloud2() + publish cost
voxel_mapper/map_size — total voxel count (correlates with publish cost growth)
voxel_mapper/frames_skipped — cumulative frames dropped by autoscaler

Testing

4 new tests:

test_autoscale_skips_when_saturated — simulates slow machine (fakes 500ms processing), verifies frames are skipped
test_autoscale_disabled_processes_all — verifies autoscale=False processes every frame
test_autoscale_min_frequency_respected — verifies high min_frequency prevents excessive skipping
test_autoscale_rerun_logging — verifies rerun telemetry is logged correctly

All existing tests unchanged and passing.

How to Test

pytest -sv dimos/mapping/test_voxels.py -k autoscale

To observe autoscaling live with rerun on a real robot:

dimos run go2-nav  # or any blueprint using voxel_mapper
# Open Rerun viewer → check voxel_mapper/* time series

Add adaptive frame skipping to VoxelGridMapper when processing can't keep up with incoming lidar rate. Self-regulates by using last add_frame() duration as the throttle interval, with a configurable minimum frequency floor. Config: autoscale: bool = True autoscale_min_frequency: float = 1.0 # Hz floor Telemetry via rerun: voxel_mapper/ingest_time_ms - add_frame() cost per frame voxel_mapper/publish_time_ms - get_global_pointcloud2() cost voxel_mapper/map_size - total voxel count voxel_mapper/frames_skipped - cumulative skipped frames Closes #1270

greptile-apps · 2026-02-16T08:24:36Z

Greptile Summary

Adds adaptive frame skipping to VoxelGridMapper so it self-regulates on weaker machines, along with Rerun telemetry for ingest time, publish time, map size, and frames skipped. Two new config fields (autoscale, autoscale_min_frequency) control the behavior.

_on_frame() now measures add_frame() wall-clock cost and skips incoming frames that arrive faster than the last processing duration, capped by a minimum frequency.
Rerun telemetry (rr.log) is added to both _on_frame and publish_global_map for observability.
A defensive hasattr(self, "_publish_trigger") guard was added to support calling _on_frame before start() (used in tests).
Issue: 1.0 / autoscale_min_frequency will raise ZeroDivisionError if a user sets this config value to 0.
Issue: _last_ingest_time is set to the time before add_frame() executes. Since all transports (LCM, SharedMemory, ROS) deliver callbacks synchronously on a single thread, elapsed_since_last will always be >= _last_ingest_duration, making the skip condition unreachable in production. The autoscale only works in the test because _last_ingest_duration is artificially inflated.

Confidence Score: 3/5

The PR introduces a useful feature but has a runtime crash risk (ZeroDivisionError) and the core autoscale logic is ineffective with the current synchronous transport model.
Score of 3 reflects two issues: (1) an unguarded division by zero on a user-configurable field that will crash at runtime, and (2) the autoscale skip logic is effectively dead code in production because _last_ingest_time is set before processing starts, and all transport callbacks are synchronous. The tests pass by faking internal state rather than exercising the real production code path, masking the timing issue.
Pay close attention to dimos/mapping/voxels.py — the _on_frame method's timing logic and the ZeroDivisionError risk in the autoscale configuration.

Important Files Changed

Filename	Overview
dimos/mapping/voxels.py	Adds autoscale frame-skipping and Rerun telemetry to VoxelGridMapper. Has a ZeroDivisionError risk with autoscale_min_frequency=0, and the autoscale skip logic is ineffective with synchronous transport delivery due to _last_ingest_time being set before processing.
dimos/mapping/test_voxels.py	Adds 4 tests for autoscale behavior. Tests are well-structured but the saturation test relies on faking internal state (_last_ingest_duration) rather than testing the actual production code path, masking the timing issue in the implementation.

Sequence Diagram

sequenceDiagram
    participant LCM as LCM Transport Thread
    participant OF as _on_frame()
    participant AF as add_frame()
    participant RR as Rerun Telemetry
    participant PT as _publish_trigger
    participant PGM as publish_global_map()

    LCM->>OF: frame arrives
    alt autoscale ON & frames_processed > 0
        OF->>OF: elapsed = now - _last_ingest_time
        OF->>OF: throttle = min(last_duration, 1/min_freq)
        alt elapsed < throttle
            OF-->>LCM: skip (return early)
            Note right of OF: _frames_skipped++
        end
    end
    OF->>AF: add_frame(frame)
    AF-->>OF: done (ingest_duration measured)
    OF->>OF: update _last_ingest_time, _last_ingest_duration
    OF->>RR: log ingest_time_ms, map_size, frames_skipped
    alt publish_interval == 0
        OF->>PT: on_next(None)
        PT->>PGM: publish_global_map()
        PGM->>RR: log publish_time_ms
        PGM->>PGM: global_map.publish(pc)
    end

_{Last reviewed commit: f2ec462}

greptile-apps

_{2 files reviewed, 2 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-02-16T08:24:39Z

dimos/mapping/voxels.py

ZeroDivisionError if autoscale_min_frequency is 0

1.0 / self.config.autoscale_min_frequency will raise ZeroDivisionError at runtime if a user passes autoscale_min_frequency=0. There's no validation preventing this value. Consider adding a guard or a config validation:

Suggested change

max_interval = 1.0 / self.config.autoscale_min_frequency

max_interval = (1.0 / self.config.autoscale_min_frequency) if self.config.autoscale_min_frequency > 0 else float("inf")

greptile-apps · 2026-02-16T08:24:40Z

dimos/mapping/voxels.py

+        t0 = time.monotonic()
        self.add_frame(frame)
-        if self.config.publish_interval == 0:
+        ingest_duration = time.monotonic() - t0
+
+        self._last_ingest_time = t0
+        self._last_ingest_duration = ingest_duration


_last_ingest_time = t0 limits autoscale effectiveness

_last_ingest_time is set to t0 (the time before add_frame() runs). Since all transports (LCM, SharedMemory, ROS) deliver callbacks synchronously on a single thread, the next _on_frame call can only begin after the current one returns. This means elapsed_since_last = now - t0 will always be >= _last_ingest_duration, so the skip condition elapsed < throttle_interval is effectively never true in production.

The autoscale will only skip frames when _on_frame is called directly from a separate thread (not through the transport), or if _last_ingest_duration is artificially set higher than the actual processing time (as done in the test).

If you want skipping to work with synchronous transport delivery, consider setting _last_ingest_time to time.monotonic() (after processing completes), so the elapsed-time check measures the gap between frames exclusive of processing time:

Suggested change

t0 = time.monotonic()

self.add_frame(frame)

if self.config.publish_interval == 0:

ingest_duration = time.monotonic() - t0

self._last_ingest_time = t0

self._last_ingest_duration = ingest_duration

t0 = time.monotonic()

self.add_frame(frame)

ingest_duration = time.monotonic() - t0

self._last_ingest_time = time.monotonic()

self._last_ingest_duration = ingest_duration

With this change, frames that arrive back-to-back (queued in LCM during processing) would be correctly identified as arriving faster than the processing rate.

rr.log() was silently no-op because VoxelGridMapper runs in a separate Dask worker process from the rerun bridge. Now: - _ensure_rr() calls rr.init('dimos', spawn=False) in the mapper process - Uses rr.set_time() with sequence + timestamp for proper time series - Graphs will appear in Rerun viewer under voxel_mapper/*

Two bugs flagged by Greptile: 1. ZeroDivisionError: autoscale_min_frequency=0 would crash. Fix: guard with > 0 check, treat 0 as 'no frequency floor'. 2. Skip logic was dead code: _last_ingest_time was set before add_frame(), so elapsed_since_last always >= _last_ingest_duration on synchronous transport threads — frames never got skipped. Fix: set _last_ingest_time after add_frame() completes, so it measures idle time between frames. If idle time < last processing duration, the machine is falling behind and the frame is skipped. Also adds test_autoscale_min_frequency_zero_no_crash to cover the ZeroDivisionError case, and updates the saturation test to correctly set both _last_ingest_time and _last_ingest_duration.

spomichter · 2026-02-17T10:36:55Z

Fixed both issues from Greptile review (3639cb6):

1. ZeroDivisionError on autoscale_min_frequency=0
Added a guard: if min_frequency <= 0, fall back to using processing time alone as the throttle interval (no frequency floor).

2. Skip logic was dead code
Greptile was correct — _last_ingest_time was set to t0 (before add_frame()), so on synchronous transport threads elapsed_since_last always >= _last_ingest_duration. Frames never actually got skipped in production.

Fix: set _last_ingest_time = time.monotonic() after add_frame() completes. Now elapsed_since_last measures idle time between frames. If idle < last processing duration, the machine is genuinely behind and we skip.

Also added test_autoscale_min_frequency_zero_no_crash and updated the saturation test to correctly fake both _last_ingest_time (set to now) and _last_ingest_duration to simulate a machine that just finished a slow frame.

spomichter · 2026-02-17T16:58:56Z

@leshy this looks like slop to me. def concerned if the agent cant even do this

greptile-apps bot reviewed Feb 16, 2026

View reviewed changes

spomichter added 3 commits February 16, 2026 08:29

feat: gate rerun telemetry behind enable_telemetry config flag

0500b1e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: auto-scaling voxel grid mapper#1271

feat: auto-scaling voxel grid mapper#1271
spomichter wants to merge 4 commits intodevfrom
feat/1270-autoscale-voxel-mapper

spomichter commented Feb 16, 2026

Uh oh!

greptile-apps bot commented Feb 16, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Feb 16, 2026

Uh oh!

greptile-apps bot Feb 16, 2026

Uh oh!

spomichter commented Feb 17, 2026

Uh oh!

spomichter commented Feb 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

	max_interval = 1.0 / self.config.autoscale_min_frequency
	max_interval = (1.0 / self.config.autoscale_min_frequency) if self.config.autoscale_min_frequency > 0 else float("inf")

Conversation

spomichter commented Feb 16, 2026

Summary

What

How

Config

Telemetry (Rerun)

Testing

How to Test

Uh oh!

greptile-apps bot commented Feb 16, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

spomichter commented Feb 17, 2026

Uh oh!

spomichter commented Feb 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments