feat(hosted key): Add hosted key, exa as shared key #3221
TheodoreSpeaks wants to merge 28 commits into staging
Conversation
Greptile Summary
This PR introduces hosted API key support for Exa search tools, allowing Sim to provide its own Exa API keys on the hosted platform so users don't need to supply their own. It includes a token-bucket rate limiter per billing actor (workspace), least-loaded key selection across a configurable key pool, exponential backoff retry on external rate limits, and custom pricing calculation from Exa's cost response.
Confidence Score: 3/5
Sequence Diagram

sequenceDiagram
participant User as User/Workflow
participant ET as executeTool()
participant BYOK as getBYOKKey()
participant RL as HostedKeyRateLimiter
participant DB as DB Token Bucket
participant API as External API (Exa)
User->>ET: executeTool(toolId, params, ctx)
ET->>ET: injectHostedKeyIfNeeded()
alt Tool has hosting config AND isHosted=true
ET->>BYOK: check workspace BYOK key
alt BYOK key found
BYOK-->>ET: user key (no billing)
else No BYOK key
ET->>RL: acquireKey(provider, prefix, config, workspaceId)
RL->>DB: consumeTokens(actor bucket, 1 token)
DB-->>RL: allowed or denied
alt Actor rate limited
RL-->>ET: billingActorRateLimited=true
ET-->>User: Error 429 (rate limited)
else No hosted keys available
RL-->>ET: success=false
ET-->>User: Error 503 (no keys)
else Key acquired
RL-->>ET: hosted key value
ET->>ET: inject key into params
end
end
end
ET->>API: executeWithRetry(executeToolRequest)
loop Retry up to 3x on 429 or 503
API-->>ET: 429 Too Many Requests
ET->>ET: exponential backoff delay
end
API-->>ET: 200 OK with response data
alt isUsingHostedKey AND success
ET->>RL: reportUsage() for custom dimensions
ET->>ET: calculateToolCost() via pricing config
ET->>DB: logFixedUsage() for audit trail
ET->>ET: attach cost to output
end
ET->>ET: stripInternalFields() remove underscore fields
ET-->>User: ToolResult (cost attached if hosted key used)
PR Summary (High Risk)
Overview: Extends BYOK support to the new hosted-key flow. Refactors cost shaping: removes generic-handler output cost extraction and instead restructures cost/token/model fields in the knowledge tool transformation.
Written by Cursor Bugbot for commit 34cffdc.
throw error
}

const output = result.output
Moved this to the knowledge block transformation, so the generic handler doesn't need to handle special cases.
)::numeric
) - ${requestedTokens}::numeric
ELSE ${rateLimitBucket.tokens}::numeric
ELSE -1
DB rate limiting didn't work since tokens was always 0. Set the else case to -1 so rate limiting applies successfully.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for 2 of the 3 issues found in the latest run.
- ✅ Fixed: Key request counters grow unbounded, never decrement
  - Added key release/decrement logic in HostedKeyRateLimiter and wired executeTool to always release acquired hosted keys in a finally block so counters track in-flight load.
- ✅ Fixed: Hosted key injection ignores user-provided API key
  - injectHostedKeyIfNeeded now returns early when the configured API-key parameter is already present, preserving user-supplied keys in hosted mode.
Or push these changes by commenting:
@cursor push 80f7b70ba9
Preview (80f7b70ba9)
diff --git a/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.test.ts b/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.test.ts
--- a/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.test.ts
+++ b/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.test.ts
@@ -155,6 +155,41 @@
expect(r4.keyIndex).toBe(0) // Wraps back
})
+ it('should rebalance key selection after releasing an in-flight key', async () => {
+ const allowedResult: ConsumeResult = {
+ allowed: true,
+ tokensRemaining: 9,
+ resetAt: new Date(Date.now() + 60000),
+ }
+ mockAdapter.consumeTokens.mockResolvedValue(allowedResult)
+
+ const r1 = await rateLimiter.acquireKey(
+ testProvider,
+ envKeyPrefix,
+ perRequestRateLimit,
+ 'workspace-1'
+ )
+ const r2 = await rateLimiter.acquireKey(
+ testProvider,
+ envKeyPrefix,
+ perRequestRateLimit,
+ 'workspace-2'
+ )
+
+ expect(r1.keyIndex).toBe(0)
+ expect(r2.keyIndex).toBe(1)
+
+ rateLimiter.releaseKey(testProvider, 0)
+
+ const r3 = await rateLimiter.acquireKey(
+ testProvider,
+ envKeyPrefix,
+ perRequestRateLimit,
+ 'workspace-3'
+ )
+ expect(r3.keyIndex).toBe(0)
+ })
+
it('should handle partial key availability', async () => {
const allowedResult: ConsumeResult = {
allowed: true,
diff --git a/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.ts b/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.ts
--- a/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.ts
+++ b/apps/sim/lib/core/rate-limiter/hosted-key/hosted-key-rate-limiter.ts
@@ -52,7 +52,7 @@
*/
export class HostedKeyRateLimiter {
private storage: RateLimitStorageAdapter
- /** In-memory request counters per key: "provider:keyIndex" -> count */
+ /** In-memory in-flight request counters per key: "provider:keyIndex" -> count */
private keyRequestCounts = new Map<string, number>()
constructor(storage?: RateLimitStorageAdapter) {
@@ -346,6 +346,21 @@
const key = `${provider}:${keyIndex}`
this.keyRequestCounts.set(key, (this.keyRequestCounts.get(key) ?? 0) + 1)
}
+
+ private decrementKeyCount(provider: string, keyIndex: number): void {
+ const key = `${provider}:${keyIndex}`
+ const current = this.keyRequestCounts.get(key)
+ if (!current) return
+ if (current <= 1) {
+ this.keyRequestCounts.delete(key)
+ return
+ }
+ this.keyRequestCounts.set(key, current - 1)
+ }
+
+ releaseKey(provider: string, keyIndex: number): void {
+ this.decrementKeyCount(provider, keyIndex)
+ }
}
let cachedInstance: HostedKeyRateLimiter | null = null
diff --git a/apps/sim/tools/index.test.ts b/apps/sim/tools/index.test.ts
--- a/apps/sim/tools/index.test.ts
+++ b/apps/sim/tools/index.test.ts
@@ -26,6 +26,7 @@
acquireKey: vi.fn(),
preConsumeCapacity: vi.fn(),
consumeCapacity: vi.fn(),
+ releaseKey: vi.fn(),
},
})
)
@@ -1338,6 +1339,73 @@
Object.assign(tools, originalTools)
})
+ it('should not overwrite user-provided API key in hosted mode', async () => {
+ mockIsHosted.value = true
+ const mockTool = {
+ id: 'test_user_key_hosted',
+ name: 'Test User Key Hosted',
+ description: 'A test tool with hosting config',
+ version: '1.0.0',
+ params: {
+ apiKey: { type: 'string', required: true, visibility: 'user-only' as const },
+ },
+ hosting: {
+ envKeyPrefix: 'TEST_API',
+ apiKeyParam: 'apiKey',
+ byokProviderId: 'exa',
+ pricing: {
+ type: 'per_request' as const,
+ cost: 0.005,
+ },
+ rateLimit: {
+ mode: 'per_request' as const,
+ requestsPerMinute: 100,
+ },
+ },
+ request: {
+ url: '/api/test/endpoint',
+ method: 'POST' as const,
+ headers: (params: any) => ({
+ 'Content-Type': 'application/json',
+ 'x-api-key': params.apiKey,
+ }),
+ },
+ transformResponse: vi.fn().mockResolvedValue({
+ success: true,
+ output: { result: 'success' },
+ }),
+ }
+
+ const originalTools = { ...tools }
+ ;(tools as any).test_user_key_hosted = mockTool
+
+ global.fetch = Object.assign(
+ vi.fn().mockImplementation(async () => ({
+ ok: true,
+ status: 200,
+ headers: new Headers(),
+ json: () => Promise.resolve({ success: true }),
+ })),
+ { preconnect: vi.fn() }
+ ) as typeof fetch
+
+ const mockContext = createToolExecutionContext()
+ const result = await executeTool(
+ 'test_user_key_hosted',
+ { apiKey: 'user-provided-key' },
+ false,
+ mockContext
+ )
+
+ expect(result.success).toBe(true)
+ expect(mockGetBYOKKey).not.toHaveBeenCalled()
+ expect(mockRateLimiterFns.acquireKey).not.toHaveBeenCalled()
+ expect(mockRateLimiterFns.releaseKey).not.toHaveBeenCalled()
+
+ Object.assign(tools, originalTools)
+ mockIsHosted.value = false
+ })
+
it('should use per_request pricing model correctly', async () => {
const mockTool = {
id: 'test_per_request_pricing',
diff --git a/apps/sim/tools/index.ts b/apps/sim/tools/index.ts
--- a/apps/sim/tools/index.ts
+++ b/apps/sim/tools/index.ts
@@ -40,6 +40,8 @@
interface HostedKeyInjectionResult {
isUsingHostedKey: boolean
envVarName?: string
+ provider?: string
+ keyIndex?: number
}
/**
@@ -57,6 +59,10 @@
if (!isHosted) return { isUsingHostedKey: false }
const { envKeyPrefix, apiKeyParam, byokProviderId, rateLimit } = tool.hosting
+ const existingApiKey = params[apiKeyParam]
+ if (existingApiKey !== undefined && existingApiKey !== null && existingApiKey !== '') {
+ return { isUsingHostedKey: false }
+ }
// Check BYOK workspace key first
if (byokProviderId && executionContext?.workspaceId) {
@@ -132,6 +138,8 @@
return {
isUsingHostedKey: true,
envVarName: acquireResult.envVarName,
+ provider,
+ keyIndex: acquireResult.keyIndex,
}
}
@@ -547,6 +555,7 @@
const startTime = new Date()
const startTimeISO = startTime.toISOString()
const requestId = generateRequestId()
+ let hostedKeyInfo: HostedKeyInjectionResult = { isUsingHostedKey: false }
try {
let tool: ToolConfig | undefined
@@ -615,7 +624,7 @@
}
// Inject hosted API key if tool supports it and user didn't provide one
- const hostedKeyInfo = await injectHostedKeyIfNeeded(
+ hostedKeyInfo = await injectHostedKeyIfNeeded(
tool,
contextParams,
executionContext,
@@ -930,6 +939,18 @@
duration,
},
}
+ } finally {
+ if (
+ hostedKeyInfo.isUsingHostedKey &&
+ hostedKeyInfo.provider &&
+ typeof hostedKeyInfo.keyIndex === 'number'
+ ) {
+ try {
+ getHostedKeyRateLimiter().releaseKey(hostedKeyInfo.provider, hostedKeyInfo.keyIndex)
+ } catch (error) {
+ logger.warn(`[${requestId}] Failed to release hosted key for ${toolId}:`, error)
+ }
+ }
}
}

contextParams,
executionContext,
requestId
)
Required param validation runs before hosted key injection
High Severity
validateRequiredParametersAfterMerge is called on line 610 before injectHostedKeyIfNeeded on line 618. In hosted mode, the UI hides the apiKey subblock via hideWhenHosted: true and the serializer skips it via isSubBlockHiddenByHostedKey, so contextParams won't contain apiKey. But all Exa tools declare apiKey as required: true in their params. Validation would reject the missing required param before the hosted key ever gets a chance to be injected. This likely breaks all Exa hosted-key tool executions.
Additional Locations (1)
The API key is passed via the hosted key. Confirmed the hosted key passes validation and executes locally. This is not an issue.
Bugbot Autofix determined this is a false positive.
validateRequiredParametersAfterMerge only enforces required parameters with visibility: 'user-or-llm', so Exa's required apiKey (visibility: 'user-only') is not rejected before hosted-key injection.
requestId: string
): Promise<HostedKeyInjectionResult> {
  if (!tool.hosting) return { isUsingHostedKey: false }
  if (!isHosted) return { isUsingHostedKey: false }
Hosted key injection ignores user-provided API key
Low Severity
injectHostedKeyIfNeeded is documented as injecting a key only "if user didn't provide one," but it never checks whether params[apiKeyParam] is already populated. In hosted mode, if a user provides their own API key (e.g., via direct API call), the function unconditionally overwrites it with either a BYOK key or a hosted key, and the user gets billed for hosted key usage despite supplying their own.
An API key cannot be provided in the UI when Sim is hosted, so this condition will never occur.
throw new Error(`Tool not found: ${toolId}`)
}

// Inject hosted API key if tool supports it and user didn't provide one
const hostedKeyInfo = await injectHostedKeyIfNeeded(
  tool,
  contextParams,
stripInternalFields is applied unconditionally to all tools
stripInternalFields removes all output keys starting with _ and is applied to every tool execution regardless of whether a hosted key was used. This is a behavior change that could silently break any existing tool whose output contains underscore-prefixed fields.
For example, if any other tool in the registry returns a field like _metadata, _debug, or _raw in its output, those fields would now be silently dropped without any warning or error.
The stripping should only occur when a hosted key was actually used (or at minimum scoped to Exa tools), since the underscore-prefix convention for internal fields is only established in this PR:
// Strip internal fields only when using hosted key
if (hostedKeyInfo.isUsingHostedKey) {
  const strippedOutput = stripInternalFields(finalResult.output || {})
  finalResult = { ...finalResult, output: strippedOutput }
}

maxRetries: 0,
delayMs: acquireResult.retryAfterMs ?? 0,
userId: executionContext?.userId,
workspaceId: executionContext?.workspaceId,
workflowId: executionContext?.workflowId,
503 used as a rate-limit retry signal — may cause excessive retries on genuine service failures
isRateLimitError treats HTTP 503 (Service Unavailable) as a retriable rate-limit error. This means if the Exa API is genuinely down (and returns 503), executeWithRetry will retry 3 times with exponential backoff before surfacing the error to the user. Each retry could incur up to an 8-second wait (1s + 2s + 4s), plus the actual API timeout, causing a noticeably slow failure.
HTTP 503 is sometimes used for rate limiting, but it can also indicate a real outage. Exa's actual rate-limit response is 429. Consider restricting retries to 429 only, unless there's a confirmed reason to expect 503 from Exa as a throttle signal:
Suggested change:

function isRateLimitError(error: unknown): boolean {
  if (error && typeof error === 'object') {
    const status = (error as { status?: number }).status
    if (status === 429) return true
  }
  return false
}
import { createLogger } from '@sim/logger'
import {
  createStorageAdapter,
  type RateLimitStorageAdapter,
  type TokenBucketConfig,
} from '@/lib/core/rate-limiter/storage'
import {
  type AcquireKeyResult,
  type CustomRateLimit,
  DEFAULT_BURST_MULTIPLIER,
In-memory keyRequestCounts singleton won't distribute load across serverless instances
The keyRequestCounts Map and cachedInstance singleton are in-process memory. In a serverless/edge deployment (e.g., Vercel), each function invocation may run in an isolated container, so the least-loaded key selection only tracks requests seen by that specific instance. Different instances will independently increment their counters starting from 0, giving no guarantee of even distribution across the key pool.
The PR description says keys are distributed "via round robin", but the implementation is "least loaded in memory." In practice, each cold-started instance will always pick key index 0 first, which concentrates traffic on the first key until that instance warms up.
Consider replacing the in-memory counter with a distributed counter (e.g., using the existing RateLimitStorageAdapter) so load is balanced globally, or document this limitation explicitly.
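A storage-backed counter is one way to make selection instance-agnostic. The sketch below is a minimal illustration under assumed names — `CounterStore`, `MapCounterStore`, and `acquireLeastLoadedKey` are hypothetical stand-ins, not the PR's actual `RateLimitStorageAdapter`:

```typescript
// Stand-in for a shared counter store. A Map works for illustration;
// a real deployment would back this with Redis INCR/DECR (or the
// existing storage adapter) so all instances see the same counts.
interface CounterStore {
  get(key: string): Promise<number>
  increment(key: string): Promise<number>
  decrement(key: string): Promise<number>
}

class MapCounterStore implements CounterStore {
  private counts = new Map<string, number>()
  async get(key: string): Promise<number> {
    return this.counts.get(key) ?? 0
  }
  async increment(key: string): Promise<number> {
    const next = (await this.get(key)) + 1
    this.counts.set(key, next)
    return next
  }
  async decrement(key: string): Promise<number> {
    const next = Math.max(0, (await this.get(key)) - 1)
    this.counts.set(key, next)
    return next
  }
}

// Least-loaded selection reads in-flight counts from the shared store,
// so a freshly cold-started instance no longer defaults to key index 0.
async function acquireLeastLoadedKey(
  store: CounterStore,
  provider: string,
  keyCount: number
): Promise<number> {
  let best = 0
  let bestCount = Number.POSITIVE_INFINITY
  for (let i = 0; i < keyCount; i++) {
    const count = await store.get(`${provider}:${i}`)
    if (count < bestCount) {
      best = i
      bestCount = count
    }
  }
  await store.increment(`${provider}:${best}`)
  return best
}
```

The read-then-increment above is not atomic; a real implementation would use an atomic increment-and-compare in the store, but the sketch shows the shape of the change.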
})

const error = new Error(acquireResult.error || `Rate limit exceeded for ${tool.id}`)
;(error as any).status = 429
;(error as any).retryAfterMs = acquireResult.retryAfterMs
throw error
}

// Handle no keys configured (503)
if (!acquireResult.success) {
  logger.error(`[${requestId}] No hosted keys configured for ${tool.id}: ${acquireResult.error}`)
  const error = new Error(acquireResult.error || `No hosted keys configured for ${tool.id}`)
  ;(error as any).status = 503
  throw error
}

params[apiKeyParam] = acquireResult.key
logger.info(`[${requestId}] Using hosted key for ${tool.id} (${acquireResult.envVarName})`, {
  keyIndex: acquireResult.keyIndex,
  provider,
})

return {
  isUsingHostedKey: true,
  envVarName: acquireResult.envVarName,
}
}

/**
 * Check if an error is a rate limit (throttling) error
 */
function isRateLimitError(error: unknown): boolean {
  if (error && typeof error === 'object') {
    const status = (error as { status?: number }).status
    // 429 = Too Many Requests, 503 = Service Unavailable (sometimes used for rate limiting)
    if (status === 429 || status === 503) return true
  }
  return false
}

/** Context for retry with rate limit tracking */
interface RetryContext {
  requestId: string
  toolId: string
  envVarName: string
Post-execution cost & usage block is duplicated across two code paths
The block that calls reportCustomDimensionUsage, processHostedKeyCost, and applies the cost to finalResult.output appears twice — once in the internal-route code path and once in the external-route code path. This adds ~40 lines of identical logic that must be kept in sync manually.
Consider extracting this into a shared helper (e.g., applyHostedKeyCostToResult) that both paths call after finalResult is available.
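A sketch of what that extraction could look like, with hypothetical parameter types standing in for the PR's actual reporting and pricing functions:

```typescript
// Hypothetical consolidation of the duplicated post-execution block:
// both the internal-route and external-route paths would call this once
// finalResult is available, instead of repeating the logic inline.
interface ToolRunResult {
  success: boolean
  output: Record<string, unknown>
}

async function applyHostedKeyCostToResult(
  result: ToolRunResult,
  reportUsage: (output: Record<string, unknown>) => Promise<void>, // e.g. custom-dimension usage reporting
  computeCost: (output: Record<string, unknown>) => number // e.g. pricing-config cost calculation
): Promise<ToolRunResult> {
  // Only bill successful hosted-key executions
  if (!result.success) return result
  await reportUsage(result.output)
  const cost = computeCost(result.output)
  // Attach cost to the output so callers see billing info on success
  return { ...result, output: { ...result.output, cost } }
}
```

Both call sites then reduce to a single awaited call, and any future change to the billing sequence happens in one place.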
)::numeric
) - ${requestedTokens}::numeric
ELSE ${rateLimitBucket.tokens}::numeric
ELSE -1
Token bucket denial discards actual remaining token balance
Low Severity
When a consume request is denied (insufficient tokens), the ELSE branch unconditionally sets stored tokens to -1, discarding the actual remaining balance. For example, if a bucket has 3 tokens and a request for 5 is denied, the balance drops to -1 instead of remaining at 3. On the next refill, available tokens become refillRate - 1 instead of 3 + refillRate, causing a permanent loss of those 3 tokens. The denial case ideally preserves the refilled balance without consuming.
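The preserving behavior can be sketched in plain TypeScript — a simplified in-memory model of the SQL bucket, not the PR's actual query:

```typescript
interface Bucket {
  tokens: number // last stored balance
  lastRefillMs: number
}

// Refill-then-consume: on denial, persist the refilled balance
// instead of a sentinel like -1, so unconsumed tokens are not lost.
function consumeTokens(
  bucket: Bucket,
  requested: number,
  capacity: number,
  refillPerMs: number,
  nowMs: number
): { allowed: boolean; tokensRemaining: number } {
  const refilled = Math.min(
    capacity,
    bucket.tokens + (nowMs - bucket.lastRefillMs) * refillPerMs
  )
  bucket.lastRefillMs = nowMs
  if (refilled >= requested) {
    bucket.tokens = refilled - requested
    return { allowed: true, tokensRemaining: bucket.tokens }
  }
  bucket.tokens = refilled // preserve balance on denial
  return { allowed: false, tokensRemaining: refilled }
}
```

With the `-1` sentinel, a bucket holding 3 tokens that denies a 5-token request loses those 3 tokens; with the sketch above, they remain available for a smaller follow-up request.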
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
const costDollars = output.__costDollars as { total?: number } | undefined
if (costDollars?.total) {
  return { cost: costDollars.total, metadata: { costDollars } }
}
Falsy check on cost causes incorrect fallback billing
Medium Severity
The getCost functions in all Exa tools use a truthiness check if (costDollars?.total) to decide whether to use the API-reported cost or a fallback estimate. Since 0 is falsy in JavaScript, a legitimate zero-cost response from Exa (e.g., cached results) would incorrectly fall through to the fallback pricing, billing the user a non-zero amount (e.g., $0.005) instead of $0. The check needs to be != null or !== undefined instead of a truthiness test.
Additional Locations (2)
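A minimal sketch of the corrected check (the helper name `resolveReportedCost` is illustrative, not the PR's code):

```typescript
interface CostDollars {
  total?: number
}

// Use a null/undefined check, not truthiness, so a reported $0 cost
// (e.g. a cached Exa response) is honored instead of falling back to
// the fallback estimate.
function resolveReportedCost(
  costDollars: CostDollars | undefined,
  fallbackCost: number
): number {
  if (costDollars?.total != null) return costDollars.total
  return fallbackCost
}
```

`!= null` catches both `null` and `undefined` while letting `0` through, which is exactly the distinction the truthiness check loses.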
if (status === 429 || status === 503) return true
}
return false
}
Retry treats 503 as rate-limit causing unnecessary retries
Low Severity
isRateLimitError classifies HTTP 503 (Service Unavailable) as a rate-limit error, causing executeWithRetry to retry with exponential backoff on genuine server outages. This delays error reporting by up to ~7 seconds (1s + 2s + 4s) for non-throttle 503s, and could mask persistent outages since the error message returned to the user would still be opaque. Only 429 reliably indicates rate limiting.



Summary
Added hosted API key integration for Exa search.
For hosted Sim, the API key sub-block no longer appears for Exa search. Users can provide their own key through the BYOK modal if needed. Non-hosted Sim behaves unchanged.
Added throttling per billing actor. Requests are throttled proactively, failing fast if they exceed limits (users can switch to BYOK if they need higher limits). The limit can be either requests-per-minute based or a custom limit derived from the API response. If custom, rate limiting is optimistic and only applies after execution completes.
Moved cost modification behavior to knowledge block directly instead of transforming each block separately.
Hosted key calls retry 3 times with exponential backoff to handle throttles from increased load, with alarms.
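The retry loop described above can be sketched as follows. This is an illustration of the approach, not the exact implementation; it follows the reviewer's suggestion of retrying only on 429, and the function and parameter names are assumptions:

```typescript
// Retry a request up to maxRetries times on HTTP 429, doubling the
// delay each attempt (1s, 2s, 4s by default). Other errors propagate
// immediately rather than being retried.
async function executeWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      const status = (error as { status?: number }).status
      if (status !== 429 || attempt >= maxRetries) throw error
      const delay = baseDelayMs * 2 ** attempt // exponential backoff
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}
```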
Important: API keys need to be stored in prod and staging before merging:
- EXA_API_KEY_COUNT: the number of keys
- EXA_API_KEY_{number}: the API keys (traffic spread evenly via round robin)

Type of Change
Testing
Checklist
Screenshots/Videos