
docs: Local vs Remote evaluation decision guide #7206

@Holmus

Description


Problem

The current documentation explains local evaluation in one sentence: "the SDK periodically fetches an environment document and evaluates flags locally without making per-request calls to the Flagsmith API." This is insufficient for users deciding between local and remote evaluation, and leads to support tickets and engineering escalations when users encounter unexpected behavior after switching modes.

Despite relatively low ticket volume, local evaluation issues are disproportionately expensive to resolve - they frequently require engineering involvement, deep debugging sessions, and sometimes live debugging calls to untangle. The questions are almost always the same small set of misunderstandings that better documentation would prevent.

Real support patterns (anonymized)

We reviewed 16 support tickets related to local/remote evaluation from the last 5 months. These are the recurring questions and confusion points, quoted or paraphrased from real tickets:

1. API call volume doesn't drop as expected

"We recently enabled local evaluation and we're seeing impressive latency improvements (50-100ms down to <5ms) but our API call volume is still high. What calls are still being made?"

This is the #1 question. Users expect local eval = zero API calls. They don't know about analytics/telemetry calls, identity sync calls, or environment document polling.
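The gap between expectation and reality can be shown with a rough traffic model. Everything here is an illustrative assumption (call names, intervals, the function itself), not actual Flagsmith SDK internals - the point is only that background calls persist even when evaluations themselves go local:

```python
# Rough model of API traffic that remains after enabling local evaluation.
# All intervals and call categories are illustrative assumptions,
# not actual Flagsmith SDK internals.

def api_calls_per_hour(poll_interval_s: int, analytics_flush_s: int,
                       identity_syncs: int = 0) -> int:
    """Estimate hourly background API calls in local-evaluation mode."""
    polls = 3600 // poll_interval_s        # environment document polling
    flushes = 3600 // analytics_flush_s    # periodic analytics/telemetry flush
    return polls + flushes + identity_syncs

# Flag evaluations cost zero API calls, yet with a hypothetical 60s poll
# and 10s analytics flush, 420 background calls/hour remain.
print(api_calls_per_hour(60, 10))
```

A docs section that names each remaining call category (polling, analytics flush, identity sync) and its knob would answer this ticket class directly.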

2. Identity overrides don't work in local eval

"Are identity overrides supposed to work with local eval mode? The UI does not allow me to apply identity overrides for identities which have been synced with local eval."

Users discover this limitation by trial and error. No docs explain it upfront.

3. Flag analytics show no data

"We plan to enable local evaluation for server/backend use cases. I'm unable to get a clear answer from docs about flag analytics. Will we still see flag usage data?"

"We're seeing discrepancies in flag usage data on the dashboard. For a lot of flags there seems to be no usage even though they're being evaluated thousands of times."

Users don't understand how analytics work differently in local mode.

4. Local eval doesn't work but remote does

"I'm debugging an issue with the Node.js library. Local eval doesn't work, remote eval works. Context is present. Can we debug together?"

These often require live debugging sessions. A troubleshooting checklist (correct key type, SDK version, environment document size) would save significant time.
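A checklist like that could even ship as a snippet users run before opening a ticket. This sketch is hypothetical: the "ser." server-side key prefix convention, the size threshold, and the minimum SDK version are all assumptions for illustration, not documented Flagsmith values:

```python
# Hypothetical preflight checklist for "local eval fails, remote works".
# The "ser." prefix, size limit, and version floor are illustrative
# assumptions - verify the real values against your dashboard and SDK.

def local_eval_preflight(api_key: str, env_document_bytes: int,
                         sdk_version: tuple) -> list:
    problems = []
    if not api_key.startswith("ser."):
        problems.append("key looks client-side; local eval needs a server-side key")
    if env_document_bytes > 5_000_000:
        problems.append("environment document unusually large; audit segments/overrides")
    if sdk_version < (3, 0, 0):
        problems.append("SDK version may predate local evaluation support")
    return problems

# A client-side key on an old SDK trips two checks.
print(local_eval_preflight("client-key", 1_000, (2, 5, 0)))
```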

5. Confusion about caching and freshness

"Can you speak about the 'cache' option in Swift, Kotlin, and JS SDKs?"

"Does realtime mean subsequent sync for environment/identity document is guaranteed to have latest data?"

Users conflate SDK-level caching with evaluation mode and real-time updates.

6. Percentage splits behave differently

"We activated a feature split at 50% with no overrides but the results don't match our analytics."

"Boolean flag value misreported with % split in segment" (turned out to be a local eval edge case)

Server-side evaluation with percentage splits behaves differently from what users coming from remote evaluation expect.

7. Accidental high load from misconfiguration

"We're investigating pod restarts. A service was calling GET /api/v1/environment-document every request instead of using the SDK's built-in polling."

Users sometimes implement their own polling instead of using the SDK's local eval mode, causing self-inflicted load.

What's missing

Decision guide: When to use which mode

  • Remote: simpler setup, always up-to-date, suitable for low-traffic client-side use
  • Local: low latency, high throughput, suitable for server-side/backend services
  • Trade-offs: freshness vs. performance, simplicity vs. control

How local evaluation actually works

  • The SDK polls for an environment document on an interval (what interval? configurable?)
  • Flags are evaluated against the local copy of the environment document
  • What happens between polls (stale data window)
  • How the environment document is structured (helps users understand what's being cached)
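The polling model above could be illustrated in the docs with a minimal sketch. The class name, document shape, and default interval below are assumptions for illustration, not the SDK's actual implementation - the useful part is making the stale-data window visible:

```python
# Minimal sketch of the polling model behind local evaluation.
# Names, document shape, and the 60s default are illustrative
# assumptions, not Flagsmith SDK internals.
import time

class LocalEvalClient:
    def __init__(self, fetch_document, refresh_interval_s=60):
        self.fetch_document = fetch_document      # the only network call
        self.refresh_interval_s = refresh_interval_s
        self.document = fetch_document()          # initial fetch at startup
        self.last_refresh = time.monotonic()

    def get_flag(self, name, now=None):
        now = time.monotonic() if now is None else now
        # Refresh only when the interval has elapsed; between polls every
        # evaluation reads the cached document (the stale-data window).
        if now - self.last_refresh >= self.refresh_interval_s:
            self.document = self.fetch_document()
            self.last_refresh = now
        return self.document["flags"].get(name)
```

Spelling this out would also preempt gotcha #5 below ("my flag changes aren't reflected"): a change made between polls is invisible until the next refresh.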

Common gotchas (directly from the patterns above)

  • "Why am I still making API calls in local mode?" - Explain exactly which calls remain (analytics, identity sync, polling) and why.
  • "Identity overrides don't work" - Document this limitation explicitly, upfront, not as a surprise.
  • "Analytics show no data / incorrect data" - How analytics work differently in local mode. What gets tracked, what doesn't.
  • "Percentage splits give unexpected results" - Any differences in bucketing/hashing behavior between modes.
  • "My flag changes aren't reflected" - Polling interval, cache TTL, how to force a refresh.

Migration path

  • How to switch from remote to local evaluation
  • What to test after switching
  • How to verify it's working correctly (expected API call pattern)
  • Rollback plan if issues arise
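The rollback point could be reinforced with a pattern the guide might recommend: gate the evaluation mode behind configuration so reverting is a config change, not a redeploy. The variable name and config shape here are hypothetical:

```python
# Hypothetical pattern for a reversible migration: select evaluation
# mode via an environment variable. The variable name and config keys
# are assumptions, not Flagsmith's actual API.
import os

def build_client_config() -> dict:
    local = os.getenv("USE_LOCAL_EVALUATION", "false").lower() == "true"
    return {
        "enable_local_evaluation": local,
        # Only meaningful in local mode; unused otherwise.
        "environment_refresh_interval_seconds": 60 if local else None,
    }

print(build_client_config())
```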

Suggestion

Create a dedicated guide page (e.g., docs.flagsmith.com/guides/local-vs-remote-evaluation) rather than burying this in the SDK overview. Link it from:

  • Each server-side SDK's documentation
  • The SDK overview page (where the one-sentence explanation currently lives)
  • The support page

Include a decision table:

| Factor             | Remote             | Local                |
|-------------------|--------------------|--------------------|
| Latency           | Network round trip | Sub-millisecond    |
| Freshness         | Always current     | Polling interval   |
| API calls         | Per evaluation     | Periodic poll only |
| Analytics         | Automatic          | [explain behavior] |
| Identity overrides| Full support       | [explain behavior] |
| Best for          | Client-side, low traffic | Server-side, high traffic |

Why this matters

Users who adopt local evaluation are typically scaling their Flagsmith usage - moving from "trying it out" to "running it in production at scale." This is a critical adoption moment. Getting stuck here, or encountering unexpected behavior, risks users reverting to remote evaluation (which may not meet their performance needs) or looking for alternatives. Clear documentation at this stage directly supports retention of high-value users.

These tickets also disproportionately consume engineering time (CTO-level debugging sessions, same-day SDK patches). Better documentation here directly reduces engineering support load.

Metadata

Labels: docs (Documentation updates)