MemLimiter


memlimiter helps a Go service avoid OOM by combining adaptive GC tuning and request throttling under memory pressure.

It observes process memory (RSS) and Go heap pressure (runtime.MemStats.NextGC) and turns that into:

  • dynamic debug.SetGCPercent tuning,
  • optional debug.SetMemoryLimit application on service start,
  • request shedding / backpressure via middleware.

By default, stats come from the Go runtime (runtime.MemStats) and the process RSS reported by the OS.

For cgo/external-memory workloads, applications should provide their own stats.ServiceStatsSubscription and report non-Go allocations through ConsumptionReport.Cgo.

The repo also includes:

  • gRPC middleware for admission control,
  • an allocator demo under test/allocator,
  • integration tests and plotting scripts.

Is this still needed on Go 1.26.1?

For pure-Go services, usually not as a first step: start with GOMEMLIMIT / runtime/debug.SetMemoryLimit and standard admission control (see SetMemoryLimit and Go 1.19 runtime notes).

For cgo-heavy or mixed-memory services, it can still be useful because the Go memory limit does not account for external C allocations. In that setup, memlimiter can reduce the Go-side budget as external memory grows and apply backpressure.

When memlimiter fits best in 2026

  • You need explicit accounting of external/cgo memory.
  • You want dynamic Go-side budget reduction.
  • You need request shedding under pressure.

Go memory references

Working principles

MemLimiter is a memory-budget automated control system that combines:

  1. Garbage collection intensification. The more often the GC runs, the more garbage is collected, leaving more of the memory budget available for business-logic allocations.
  2. Request throttling. By suppressing part of incoming requests, middleware applies backpressure and reduces allocation pressure.

Memory budget utilization

The core of MemLimiter is a controller quite similar to a P-controller, but with certain specifics (more on that below). The memory budget utilization acts as the controller's input signal. We define $Utilization$ as follows:

$$ Utilization = \frac {NextGC} {RSS_{limit} - CGO} $$

where:

  • $NextGC$ (the runtime.MemStats.NextGC field) is the target heap size at which the Go runtime will trigger the next GC cycle;
  • $RSS_{limit}$ is a hard limit on the service's physical memory (RSS) consumption (exceeding it will very likely result in OOM);
  • $CGO$ is the total size of heap allocations made across the Cgo boundary (within C/C++ and other non-Go libraries).

A few notes about the $CGO$ component. Allocations made outside the Go allocator are, of course, not controlled by the Go runtime in any way, yet the memory consumption limit is shared by the Go and non-Go allocators alike. Therefore, if non-Go allocations grow, all we can do is shrink the memory budget for Go allocations (which is why $CGO$ is subtracted from the denominator above). If your service uses Cgo, you need to figure out how much memory is allocated "on the other side"; otherwise MemLimiter won't be able to save your service from OOM.

When the reported $CGO \ge RSS_{limit}$, MemLimiter treats the Go budget as exhausted and immediately switches to its most conservative control mode.

If the service doesn't use Cgo, the $Utilization$ formula is simplified to: $$Utilization = \frac {NextGC} {RSS_{limit}}$$
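The formula above can be sketched as a small Go function; this is an illustrative reimplementation, not the library's actual code:

```go
package main

import "fmt"

// utilization computes Utilization = NextGC / (RSSLimit - CGO).
// All arguments are in bytes. When cgo >= rssLimit the Go budget is
// exhausted, so we report full utilization (the emergency case).
func utilization(nextGC, rssLimit, cgo uint64) float64 {
	if cgo >= rssLimit {
		return 1
	}
	return float64(nextGC) / float64(rssLimit-cgo)
}

func main() {
	// 1 GiB RSS limit, 200 MiB external (Cgo) allocations,
	// 600 MiB NextGC target: 600 / (1024 - 200) MiB.
	u := utilization(600<<20, 1<<30, 200<<20)
	fmt.Printf("utilization: %.3f\n", u) // utilization: 0.728
}
```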

Control function

The controller converts the input signal into the control signal according to the following formula:

$$ K_{p} = C_{p} \cdot \frac {1} {1 - Utilization} $$

This is not the ordinary definition of the proportional component of a PID controller, but direct proportionality is preserved: the closer $Utilization$ gets to 1 (100%), the higher the control signal. The controller's main purpose is to prevent the next GC run from being scheduled at a point where memory consumption already exceeds the hard limit (which would cause OOM).

You can adjust the strength of the proportional control signal with the coefficient $C_{p}$. In addition, optional exponential averaging of the control signal helps smooth out high-frequency fluctuations (though it hardly eliminates self-oscillations).

The control signal is always saturated to avoid extreme values:

$$ Output = \begin{cases} 99 & K_{p} \gt 99 \\ 0 & K_{p} \lt 0 \\ K_{p} & \text{otherwise} \end{cases} $$

Finally, we convert the dimensionless quantity $Output$ into concrete $GOGC$ (for subsequent use in debug.SetGCPercent) and $Throttling$ (the percentage of suppressed requests) values, but only when $Utilization$ exceeds the configured thresholds:

$$ GOGC = \begin{cases} \max(MinGOGC,\ 100 - round(Output)) & Utilization \ge DangerZoneGOGC \\ 100 & \text{otherwise} \end{cases} $$

$$ Throttling = \begin{cases} round(Output) & Utilization \ge DangerZoneThrottling \\ 0 & \text{otherwise} \end{cases} $$

Implementation note: internal Utilization telemetry is a ratio (1.0 == 100%), while danger_zone_* settings are configured in percentage points ((0, 100]).

Architecture

The MemLimiter comprises two main parts:

  1. Core implementing the memory budget controller and backpressure subsystems. Core relies on actual statistics provided by stats.ServiceStatsSubscription.
  2. Middleware providing the request throttling feature for various web frameworks. Every time the server receives a request, the middleware asks MemLimiter's core for permission to process it. Currently, only gRPC is supported, but Middleware is an easily extensible interface, and PRs are welcome.

(Architecture diagram)

Quick start guide

For command workflows and expected outputs, see make-workflows.md.

Services without Cgo

Refer to the example service.

Services with Cgo

Refer to the example service.

You must also provide your own stats.ServiceStatsSubscription and stats.ServiceStats implementations. The latter must return non-nil stats.ConsumptionReport instances if you want MemLimiter to account for allocations made outside the Go runtime allocator and estimate memory utilization correctly.
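A rough sketch of the external-memory accounting side is shown below. The report struct and the "native-lib" key are invented for illustration; the real stats.ConsumptionReport type and the exact interface methods live in the memlimiter stats package and may be shaped differently:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cgoBytes tracks allocations made by C libraries. Wrappers around
// C.malloc/C.free (or the library's own stats API) must keep it updated.
var cgoBytes atomic.Int64

// report mimics the role of stats.ConsumptionReport: named non-Go
// consumers mapped to their current allocation sizes in bytes.
// (Field layout is assumed here, not taken from the library.)
type report struct {
	Cgo map[string]uint64
}

// snapshot is what a custom stats.ServiceStats implementation would
// return, so MemLimiter can subtract external memory from the Go budget.
func snapshot() *report {
	return &report{Cgo: map[string]uint64{"native-lib": uint64(cgoBytes.Load())}}
}

func main() {
	cgoBytes.Add(64 << 20) // pretend a C library allocated 64 MiB
	r := snapshot()
	fmt.Println(r.Cgo["native-lib"]>>20, "MiB reported as Cgo memory")
}
```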

Tuning

There are several key settings in MemLimiter configuration (see top-level config and controller config):

| Setting name | Type | Allowed range | Default | Description |
|---|---|---|---|---|
| go_memory_limit | bytes string ("800M", "1G", "0") | "0" (disabled) or (0, MaxInt64] bytes | 0 (disabled) | Optional Go runtime soft memory limit applied via debug.SetMemoryLimit during the service lifecycle. |
| controller_nextgc.rss_limit | bytes string | (0, +inf) bytes | none (required) | Hard process RSS budget used by the controller. |
| controller_nextgc.danger_zone_gogc | unsigned integer | (0, 100] | none (required) | Utilization threshold (in percent) that enables GC tightening; 100 makes it an emergency-only trigger (near-full budget). |
| controller_nextgc.danger_zone_throttling | unsigned integer | (0, 100] | none (required) | Utilization threshold (in percent) that enables request throttling; 100 makes it an emergency-only trigger (near-full budget). |
| controller_nextgc.min_gogc | integer | 0 (auto-default) or [1, 100] | 10 (when set to 0) | Lower bound for the computed GOGC in the red zone. |
| controller_nextgc.period | duration string ("100ms", "1s") | (0, +inf) | none (required) | Controller loop period for control recomputation. |
| controller_nextgc.component_proportional.coefficient ($C_p$) | float | any non-zero value | none (required) | Proportional component strength (higher means more aggressive reaction near the limit). |
| controller_nextgc.component_proportional.window_size | unsigned integer | [0, +inf) | 0 | EMA smoothing window size for the controller output (0 disables smoothing). |

Recommendation: keep danger_zone_throttling >= danger_zone_gogc so GC intensification starts before request shedding.
Implementation detail: current NextGC controller clamps output to 99, so maximum throttling emitted by this controller is 99%.

Example:

```json
{
  "go_memory_limit": "800M",
  "controller_nextgc": {
    "rss_limit": "1G",
    "danger_zone_gogc": 50,
    "danger_zone_throttling": 90,
    "min_gogc": 10,
    "period": "100ms",
    "component_proportional": {
      "coefficient": 1,
      "window_size": 20
    }
  }
}
```

You have to pick these values empirically for your service. The settings must match the characteristics of the service's business logic and the expected workload.

We ran a series of performance tests with Allocator, an example service that does nothing but make allocations which reside in memory for some time. We applied the same load with different settings and tracked runtime behavior.

Current make allocator-analyze scenario matrix:

  • One unlimited baseline (memlimiter disabled).
  • One limited baseline without Go soft limit (go_memory_limit = 0).
  • Several limited cases with go_memory_limit = 800MiB, including a stricter safety floor (min_gogc = 30) case.

Common settings in this matrix:

  • $RSS_{limit} = 1G$
  • $DangerZoneGOGC = 50\%$
  • $DangerZoneThrottling = 90\%$
  • $Period = 100ms$
  • $WindowSize = 20$

Scenario-specific values:

  • $go\_memory\_limit \in \{0, 800MiB\}$
  • $MinGOGC \in \{10, 30\}$
  • $C_{p} \in \{0.5, 5, 10, 50\}$

Load profile (same for all scenarios):

  • $RPS = 120$
  • $AllocationSize = 1MiB$
  • $PauseDuration = 6s$
  • $RequestTimeout = 1m$
  • $LoadDuration = 60s$

Current analyzer run outputs are generated under /tmp/allocator/allocator_<HHMMSS>/ (images below are curated examples from docs/):

Control params

And the summary RSS plot across tested scenarios:

RSS

Observed OOM behavior in this run:

  • Without MemLimiter (unlimited=true), the process terminates at around 16s under the 1GiB container limit.
  • With MemLimiter enabled, all limited scenarios sustain the full 60s load window.

Additional plots for new controls (go_memory_limit and min_gogc) are generated by make allocator-analyze in the same run directory. Curated examples are stored under docs/:

gogc_floor_hits.png:

GOGC floor hits

What it means:

  • It shows, per scenario, the share of samples where GOGC is clamped by min_gogc.
  • Higher values mean the safety floor is actively protecting the process from dropping to overly aggressive GC values.
  • In this run, the strict case (C_p=50, min_gogc=30) hits the floor for ~78% of samples.

memory_limits_overlay.png:

Memory limits overlay

What it means:

  • It shows RSS and Go runtime memory (tracked as MemStats.Sys - MemStats.HeapReleased) with configured limits over time.
  • go_memory_limit is a soft limit, so short-term overshoot is possible under bursty/high-allocation load.
  • If overshoot is large and persistent, allocation pressure is stronger than GC control for this workload.
  • If RSS stays high while Go runtime memory is low, pressure likely comes from non-Go allocations (Cgo/external memory), so better external accounting and/or stronger throttling is needed.

General observations from these experiments:

  • In the latest stress run, disabling MemLimiter (unlimited baseline) terminates around 16s under the 1GiB container limit, while limited scenarios complete the full 60s load.
  • go_memory_limit=800MiB adds extra GC pressure as a soft target; in this stress test it is not a hard ceiling for Go runtime memory.
  • min_gogc protects against extreme GC aggressiveness by clamping controller output in red-zone periods.
  • A stricter floor (min_gogc=30) with aggressive C_p=50 shifts control toward stronger throttling (up to 99%) instead of further GC tightening.

Runtime settings changed by MemLimiter are restored on Service.Quit():

  • GOGC (debug.SetGCPercent)
  • go_memory_limit (if configured via debug.SetMemoryLimit)

TODO

  • Extend middleware.Middleware to support more frameworks.
  • Support popular Cgo allocators such as jemalloc or TCMalloc, parsing their stats to provide information about Cgo memory consumption.

Your PRs are welcome!

Publications

  • Isaev V. A. Go runtime high memory consumption (in Russian). Evrone Go meetup. 2022.