WIP perf(vm): optimize loop iteration with scope pool #908

thevilledev · 2026-01-19T20:56:54Z

Motivation

Iteration over slices previously used reflection to access elements, which was slow and allocated unnecessarily.

Changes

This change adds type-specialized fast paths for common slice types ([]int, []float64, []string, []any) that bypass reflection entirely. Scope objects are now pooled and reused across loop iterations. The current scope pointer is cached to avoid repeated slice lookups.
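To make the three ideas above concrete, here is a minimal, self-contained sketch. It is not the code from this PR: `scope`, `scopePool`, `sliceKind` and `countIf` are hypothetical names invented for illustration, and the real Scope struct and VM loop are assumed to look different. It only shows the general shape of a type switch that avoids reflection for common slice types, a free list that reuses scope objects, and a local variable that caches the current scope pointer.

```go
package main

import (
	"fmt"
	"reflect"
)

// sliceKind records which fast path applies (hypothetical, for illustration).
type sliceKind uint8

const (
	kindGeneric sliceKind = iota // reflection fallback
	kindInt
	kindFloat
	kindString
	kindAny
)

// scope is a hypothetical stand-in for per-loop VM state.
type scope struct {
	kind    sliceKind
	ints    []int
	floats  []float64
	strs    []string
	anys    []any
	generic reflect.Value
	n       int
}

// init picks the fast path once per loop instead of once per element.
// The fallback assumes arr is a slice or array; reflect's Len panics otherwise.
func (s *scope) init(arr any) {
	switch a := arr.(type) {
	case []int:
		s.kind, s.ints, s.n = kindInt, a, len(a)
	case []float64:
		s.kind, s.floats, s.n = kindFloat, a, len(a)
	case []string:
		s.kind, s.strs, s.n = kindString, a, len(a)
	case []any:
		s.kind, s.anys, s.n = kindAny, a, len(a)
	default:
		s.kind, s.generic = kindGeneric, reflect.ValueOf(arr)
		s.n = s.generic.Len()
	}
}

// at returns element i, using reflection only on the fallback path.
func (s *scope) at(i int) any {
	switch s.kind {
	case kindInt:
		return s.ints[i]
	case kindFloat:
		return s.floats[i]
	case kindString:
		return s.strs[i]
	case kindAny:
		return s.anys[i]
	default:
		return s.generic.Index(i).Interface()
	}
}

// scopePool is a simple free list of reusable scopes.
type scopePool struct{ free []*scope }

func (p *scopePool) get(arr any) *scope {
	var s *scope
	if n := len(p.free); n > 0 {
		s, p.free = p.free[n-1], p.free[:n-1]
	} else {
		s = new(scope)
	}
	s.init(arr)
	return s
}

// put clears the scope so it holds no stale slice references, then stores it.
func (p *scopePool) put(s *scope) {
	*s = scope{}
	p.free = append(p.free, s)
}

// countIf counts matching elements; cur caches the scope pointer for the loop.
func countIf(p *scopePool, arr any, pred func(any) bool) int {
	cur := p.get(arr)
	defer p.put(cur)
	count := 0
	for i := 0; i < cur.n; i++ {
		if pred(cur.at(i)) {
			count++
		}
	}
	return count
}

func main() {
	p := &scopePool{}
	isEven := func(v any) bool { return v.(int)%2 == 0 }
	fmt.Println(countIf(p, []int{1, 2, 3, 4}, isEven))
	fmt.Println(len(p.free)) // the scope was returned to the pool
}
```

In this sketch the type switch runs once when the loop starts, so each element access is a plain slice index on the fast paths. The trade-off is that the scope carries one field per specialized slice type, which makes the struct larger than a reflection-only version would be.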

Bench run:

go test -bench='Benchmark_filter' -benchmem -count=10 -run=^$ ./...

Results:

cpu: Apple M1 Pro
               │   old.txt    │               new.txt               │
               │    sec/op    │   sec/op     vs base                │
_filter-8        48.99µ ±  4%   40.79µ ± 6%  -16.73% (p=0.000 n=10)
_filterLen-8     44.64µ ± 14%   37.64µ ± 6%  -15.67% (p=0.000 n=10)
_filterFirst-8   430.4n ±  4%   362.0n ± 2%  -15.88% (p=0.000 n=10)
_filterLast-8    798.2n ± 12%   766.6n ± 3%   -3.96% (p=0.009 n=10)
_filterMap-8     5.231µ ±  1%   3.742µ ± 4%  -28.47% (p=0.000 n=10)
geomean          5.234µ         4.370µ       -16.51%
               │   old.txt    │               new.txt                │
               │     B/op     │     B/op      vs base                │
_filter-8        20.38Ki ± 0%   18.20Ki ± 0%  -10.69% (p=0.000 n=10)
_filterLen-8     7.938Ki ± 0%   6.039Ki ± 0%  -23.92% (p=0.000 n=10)
_filterFirst-8     192.0 ± 0%     224.0 ± 0%  +16.67% (p=0.000 n=10)
_filterLast-8      712.0 ± 0%     808.0 ± 0%  +13.48% (p=0.000 n=10)
_filterMap-8      1736.0 ± 0%     920.0 ± 0%  -47.00% (p=0.000 n=10)
geomean          2.045Ki        1.763Ki       -13.77%
               │   old.txt    │               new.txt                │
               │  allocs/op   │ allocs/op   vs base                  │
_filter-8         1155.0 ± 0%   864.0 ± 0%  -25.19% (p=0.000 n=10)
_filterLen-8      1004.0 ± 0%   749.0 ± 0%  -25.40% (p=0.000 n=10)
_filterFirst-8    12.000 ± 0%   4.000 ± 0%  -66.67% (p=0.000 n=10)
_filterLast-8      24.00 ± 0%   24.00 ± 0%        ~ (p=1.000 n=10) ¹
_filterMap-8     123.000 ± 0%   9.000 ± 0%  -92.68% (p=0.000 n=10)
geomean            132.7        56.17       -57.66%
¹ all samples are equal

Further comments

Because the Scope struct is now larger, B/op goes up for filterFirst (+16.67%) and filterLast (+13.48%). Both are still faster in sec/op.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
