Skip to content

feat: add weight-based rate limiting support to HttpRequestRegexMatcher#966

Merged
Daryna Ishchenko (darynaishchenko) merged 9 commits intomainfrom
devin/1774609130-cost-based-rate-limiting
Mar 30, 2026
Merged

feat: add weight-based rate limiting support to HttpRequestRegexMatcher#966
Daryna Ishchenko (darynaishchenko) merged 9 commits intomainfrom
devin/1774609130-cost-based-rate-limiting

Conversation

@darynaishchenko
Copy link
Copy Markdown
Contributor

@darynaishchenko Daryna Ishchenko (darynaishchenko) commented Mar 27, 2026

Summary

Adds an optional weight field to HttpRequestRegexMatcher that allows specifying per-request weight in rate limiting policies. This enables weight-based rate limiting where different endpoints consume different amounts from a shared budget.

Motivation: APIs like Amplitude's Dashboard REST API use a cost model where different endpoints have different costs (e.g., /2/sessions/average = 60 cost/req, /2/users = 120 cost/req) sharing a single budget (108,000 cost/hour). Previously, APIBudget._do_acquire() hardcoded weight=1 even though the underlying pyrate_limiter already supports custom weights via try_acquire(weight=...). This change wires up the existing weight infrastructure.

Changes across 5 files (~170 lines):

  • YAML schema & Python model: New optional weight field (int or interpolated string) on HttpRequestRegexMatcher, defaults to None
  • call_rate.py: Stores weight on matcher with validation (weight >= 1), adds BaseCallRatePolicy.get_weight() to look up weight from first matching matcher (short-circuiting consistently with matches()), updates APIBudget._do_acquire() to pass dynamic weight to try_acquire(), adds weight guard to MovingWindowCallRatePolicy (rejects weights exceeding the lowest configured rate limit, matching FixedWindowCallRatePolicy's existing guard)
  • model_to_component_factory.py: Wires up weight field with InterpolatedString support for config-driven values; validates weight >= 1 after interpolation/casting
  • Tests: 11 new tests for weight-based rate limiting (matcher storage/validation, policy weight resolution, budget consumption, shared-budget interactions, moving-window weight guard rejection) + updates to existing mock-based test

Fully backward compatible — weight defaults to None (treated as weight=1).

Review & Testing Checklist for Human

  • get_weight() re-evaluates matchers in _do_acquire — After policy.matches(request) already confirmed a match, get_weight() re-iterates matchers. It short-circuits at the first match (consistent with matches()), but verify the double-evaluation doesn't cause issues with stateful matchers or edge cases where the two methods could disagree on which matcher matched.
  • get_weight() only checks HttpRequestRegexMatcher — The isinstance check means HttpRequestMatcher (non-regex variant) and custom RequestMatcher implementations will always get weight=1. Verify this is the intended scope.
  • InterpolatedString uses parameters={} — The factory passes an empty parameters dict, so {{ parameters.X }} won't resolve. Verify {{ config.X }} is sufficient for expected use cases.
  • No factory/interpolation test — There are no tests covering the create_http_request_matcher path with a string weight value (e.g., "{{ config['weight'] }}"). Consider whether this gap matters for confidence.
  • Recommended test plan: Run the full CDK test suite. Then manually verify with the source-amplitude connector (PR airbytehq/airbyte#75406) that weight values are correctly applied during a sync.

Notes

  • Related to airbytehq/airbyte#75406 (source-amplitude rate limit PR) which will use this feature.
  • The weight field accepts strings to support Jinja interpolation from connector config, though the primary use case is static integer values.
  • Field is named weight (not cost) for consistency with the existing try_acquire(weight=...) parameter in the CDK's rate limiter API.
  • Weight validation (>= 1) is enforced in both HttpRequestRegexMatcher.__init__ and create_http_request_matcher factory (catches invalid manifest config after interpolation/casting).
  • Both FixedWindowCallRatePolicy and MovingWindowCallRatePolicy now reject weights that exceed their configured limits with a deterministic ValueError, preventing infinite retry loops.

Summary by CodeRabbit

  • New Features
    • Matchers can specify an optional per-matcher "weight" (integer or templated string) used as the consumption cost against rate limits; omitted weights default to 1.
    • Rate limiting now uses matcher weights when acquiring capacity, affecting how calls consume shared budgets and enforcing that a request's weight cannot exceed configured limits.
  • Tests
    • Added tests for weight semantics, validation (weight >= 1), fallback/default behavior, weighted budget consumption, and shared-budget interactions.

Link to Devin session: https://app.devin.ai/sessions/642ec5f275684572aa83aa9c7da87444
Requested by: Daryna Ishchenko (@darynaishchenko)

Add optional 'cost' field to HttpRequestRegexMatcher that allows specifying
per-request cost/weight in rate limiting policies. This enables cost-based
rate limiting where different endpoints consume different amounts from a
shared budget (e.g., Amplitude's Dashboard REST API).

Changes:
- YAML schema: add 'cost' field (int or string) to HttpRequestRegexMatcher
- Python model: add 'cost' field to HttpRequestRegexMatcher model
- call_rate.py: store cost on matcher, add get_cost() to BaseCallRatePolicy,
  update APIBudget._do_acquire() to use cost as weight instead of hardcoded 1
- model_to_component_factory.py: wire up cost field with interpolation support
- Tests: 8 new tests for cost-based rate limiting behavior

Backward compatible: cost defaults to None (treated as 1).

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link
Copy Markdown

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1774609130-cost-based-rate-limiting#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1774609130-cost-based-rate-limiting

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment
📚 Show Repo Guidance

Helpful Resources

📝 Edit this welcome message.

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Added an optional weight property to HttpRequestRegexMatcher (schema + model), resolved and forwarded during matcher creation, and used at runtime to compute per-request acquisition weight for rate-limiting instead of always using 1.

Changes

Cohort / File(s) Summary
Schema
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Added optional weight property to definitions.HttpRequestRegexMatcher (type: integer or string) with description documenting per-matcher request cost weight (defaults to 1 when unset).
Model
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
Added weight: Optional[Union[int, str]] field to HttpRequestRegexMatcher Pydantic model with title/description metadata.
Factory / Parser
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
create_http_request_matcher resolves model.weight (evaluates interpolated strings against config, casts to int) and forwards computed weight to matcher constructor; validates weight >= 1.
Runtime / Rate Limiting
airbyte_cdk/sources/streams/call_rate.py
HttpRequestRegexMatcher accepts/stores weight and exposes it; validates weight >= 1; BaseCallRatePolicy.get_weight(request) returns the first matched matcher’s weight or 1; APIBudget._do_acquire computes request weight via policy and passes it to policy.try_acquire(...); __str__ updated to include weight.
Tests
unit_tests/sources/streams/test_call_rate.py
Updated mocks for weight-aware acquisitions; added TestWeightBasedRateLimiting covering matcher weight validation, policy weight resolution, budget consumption based on weight, fallback to default 1, shared-budget interactions, and policy-level rejection when weight exceeds configured limits.

Sequence Diagram

sequenceDiagram
    participant Client
    participant APIBudget
    participant Policy as BaseCallRatePolicy
    participant Matcher as HttpRequestRegexMatcher

    Client->>APIBudget: _do_acquire(request)
    APIBudget->>Policy: get_weight(request)
    Policy->>Matcher: matches(request)?
    alt Matcher found with weight
        Matcher-->>Policy: returns weight (int)
        Policy-->>APIBudget: weight
    else No match or weight unset
        Policy-->>APIBudget: 1 (default)
    end
    APIBudget->>Policy: try_acquire(weight)
    Policy-->>APIBudget: acquired (true/false)
    APIBudget-->>Client: permit or reject
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Would you like me to rename remaining "cost" occurrences to "weight" for consistency, wdyt?

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main feature: adding weight-based rate limiting to HttpRequestRegexMatcher, which aligns perfectly with the substantial changes across the codebase.
Docstring Coverage ✅ Passed Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch devin/1774609130-cost-based-rate-limiting

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

PyTest Results (Fast)

3 945 tests  +11   3 934 ✅ +11   7m 47s ⏱️ +48s
    1 suites ± 0      11 💤 ± 0 
    1 files   ± 0       0 ❌ ± 0 

Results for commit bf65b0d. ± Comparison against base commit acafc75.

♻️ This comment has been updated with latest results.

…quire(weight=...)

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
airbyte_cdk/sources/streams/call_rate.py (1)

627-628: Could you hoist weight out of the retry loop, wdyt?

policy.get_cost(request) is invariant for this call, but right now every retry re-runs matcher selection after get_matching_policy() has already matched the request. Computing it once before the loop trims an extra match/prepare pass from the hot path.

Small cleanup
         last_exception = None
         endpoint = self._extract_endpoint(request)
+        weight = policy.get_cost(request) if isinstance(policy, BaseCallRatePolicy) else 1
         # sometimes we spend all budget before a second attempt, so we have a few more attempts
         for attempt in range(1, self._maximum_attempts_to_acquire):
             try:
-                weight = policy.get_cost(request) if isinstance(policy, BaseCallRatePolicy) else 1
                 policy.try_acquire(request, weight=weight)
                 return
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/streams/call_rate.py` around lines 627 - 628, The call to
policy.get_cost(request) should be hoisted out of the retry loop because the
selected policy from get_matching_policy(request) is invariant for this request;
compute weight once after you obtain policy (using BaseCallRatePolicy check) and
store it in a local variable, then use that weight in every subsequent
policy.try_acquire(request, weight=weight) call inside the retry loop so matcher
selection/prepare isn't repeated on each retry.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)

1835-1844: Could you declare the interpolation context for cost, wdyt?

The runtime path now supports templated string costs, but this schema still exposes cost as a bare string. Adding the same interpolation context here keeps Builder/docs/schema-driven validation aligned with the new config-driven behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/declarative/declarative_component_schema.yaml` around
lines 1835 - 1844, The schema for the "cost" property needs an interpolation
context so templated string costs validate the same way at build/docs time as at
runtime; update the "cost" entry in declarative_component_schema.yaml (the
property named cost under the matcher block shown in the diff) to include the
same interpolation context metadata used by other templated string fields in
this schema (copy the interpolation context key/value pair used elsewhere for
templated strings and attach it to cost), ensuring both "type: string" and the
interpolation-aware validation are present.
unit_tests/sources/streams/test_call_rate.py (1)

366-455: Could you add one parser-level string cost test as well, wdyt?

These cases only construct HttpRequestRegexMatcher(..., cost=<int>) directly. The PR also adds config/interpolated cost support, so the factory/parser path can regress without any of these tests failing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unit_tests/sources/streams/test_call_rate.py` around lines 366 - 455, Add a
parser/factory-level test (e.g., test_parser_cost_string) that verifies a string
cost in config is parsed into an integer cost for HttpRequestRegexMatcher: use
the project’s config/parser/factory to create a MovingWindowCallRatePolicy (or
APIBudget) with a matcher whose cost is provided as a string (e.g., "60"), then
assert policy.get_cost(Request(...)) == 60 and/or that APIBudget.acquire_call
uses that weight (raise CallRateLimitHit when exceeded). Reference
HttpRequestRegexMatcher, MovingWindowCallRatePolicy and APIBudget in the test so
the parser path (not direct constructor) is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py`:
- Around line 4390-4395: The cost resolution currently uses
InterpolatedString.create(cost, parameters={}) and doesn't enforce a minimum;
update the string interpolation to pass through the matcher parameters (use the
existing parameters variable instead of {} when calling
InterpolatedString.create) and after resolving/coercing to int (both for str and
non-str branches) clamp the value to at least 1 (e.g., cost = max(1,
int(resolved_cost))). Locate the `cost` handling block around the `cost =
model.cost` assignment and update the `InterpolatedString.create` call plus the
final assignment so `cost` is always an integer >= 1; keep the `eval(config)`
call for evaluation.

In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 169-181: Reject non-positive matcher costs by validating the
incoming cost before assigning self._cost: in the constructor where self._cost
is set, check if cost is not None and if cost <= 0 (or not an int/float if you
want stricter typing) and raise a ValueError with a clear message; this prevents
zero or negative weights from flowing into FixedWindowCallRatePolicy and
incorrectly making requests free or reducing _calls_num.

---

Nitpick comments:
In `@airbyte_cdk/sources/declarative/declarative_component_schema.yaml`:
- Around line 1835-1844: The schema for the "cost" property needs an
interpolation context so templated string costs validate the same way at
build/docs time as at runtime; update the "cost" entry in
declarative_component_schema.yaml (the property named cost under the matcher
block shown in the diff) to include the same interpolation context metadata used
by other templated string fields in this schema (copy the interpolation context
key/value pair used elsewhere for templated strings and attach it to cost),
ensuring both "type: string" and the interpolation-aware validation are present.

In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 627-628: The call to policy.get_cost(request) should be hoisted
out of the retry loop because the selected policy from
get_matching_policy(request) is invariant for this request; compute weight once
after you obtain policy (using BaseCallRatePolicy check) and store it in a local
variable, then use that weight in every subsequent policy.try_acquire(request,
weight=weight) call inside the retry loop so matcher selection/prepare isn't
repeated on each retry.

In `@unit_tests/sources/streams/test_call_rate.py`:
- Around line 366-455: Add a parser/factory-level test (e.g.,
test_parser_cost_string) that verifies a string cost in config is parsed into an
integer cost for HttpRequestRegexMatcher: use the project’s
config/parser/factory to create a MovingWindowCallRatePolicy (or APIBudget) with
a matcher whose cost is provided as a string (e.g., "60"), then assert
policy.get_cost(Request(...)) == 60 and/or that APIBudget.acquire_call uses that
weight (raise CallRateLimitHit when exceeded). Reference
HttpRequestRegexMatcher, MovingWindowCallRatePolicy and APIBudget in the test so
the parser path (not direct constructor) is exercised.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1b2adfec-6da5-4d49-875e-6582e514b3af

📥 Commits

Reviewing files that changed from the base of the PR and between acafc75 and 1b8450d.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
  • airbyte_cdk/sources/streams/call_rate.py
  • unit_tests/sources/streams/test_call_rate.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

4390-4395: ⚠️ Potential issue | 🟠 Major

Could we resolve weight with matcher parameters here, wdyt?

Using parameters={} means {{ parameters[...] }} weights cannot resolve, even though the rest of this factory usually threads matcher parameters into interpolation.

💡 Suggested change
     def create_http_request_matcher(
         self, model: HttpRequestRegexMatcherModel, config: Config, **kwargs: Any
     ) -> HttpRequestRegexMatcher:
         weight = model.weight
         if weight is not None:
+            interpolation_parameters = kwargs.get("parameters", model.parameters or {}) or {}
             if isinstance(weight, str):
-                weight = int(InterpolatedString.create(weight, parameters={}).eval(config))
+                weight = int(
+                    InterpolatedString.create(
+                        weight, parameters=interpolation_parameters
+                    ).eval(config=config)
+                )
             else:
                 weight = int(weight)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` around
lines 4390 - 4395, The code resolves model.weight with
InterpolatedString.create(..., parameters={}) which prevents matcher parameters
from being interpolated; update the call that creates/evaluates weight (the
block handling weight from model.weight using InterpolatedString.create and
.eval) to pass the real matcher parameters object instead of an empty dict
(e.g., parameters=matcher_parameters or the existing parameters variable used
elsewhere in this factory) so {{ parameters[...] }} templates are resolved.
airbyte_cdk/sources/streams/call_rate.py (1)

169-182: ⚠️ Potential issue | 🟠 Major

Could we reject weight <= 0 in the matcher ctor too, wdyt?

Direct Python usage bypasses the declarative factory, and FixedWindowCallRatePolicy currently treats 0 as free and negatives as budget-restoring.

💡 Suggested change
     def __init__(
         self,
         method: Optional[str] = None,
         url_base: Optional[str] = None,
         url_path_pattern: Optional[str] = None,
         params: Optional[Mapping[str, Any]] = None,
         headers: Optional[Mapping[str, Any]] = None,
         weight: Optional[int] = None,
     ):
@@
-        self._weight = weight
+        if weight is not None and weight < 1:
+            raise ValueError(f"weight must be >= 1, got {weight}")
+        self._weight = weight
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/streams/call_rate.py` around lines 169 - 182, The matcher
constructor currently accepts weight values <= 0 which leads to unexpected
semantics (0 treated as free, negatives as budget-restoring); in the __init__
where the weight parameter is assigned to self._weight validate that if weight
is not None it must be > 0 and raise a ValueError otherwise, updating the error
message to mention the matcher (weight) argument; ensure the same validation
logic is applied where RequestMatcher (or the specific matcher class in
call_rate.py) is instantiated so direct Python usage is rejected consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 279-295: get_weight currently scans all _matchers looking for any
weighted matcher, allowing a later matcher to override an earlier match; change
get_weight to stop at the first matcher that matches the request (same
short-circuit behavior as matches): in get_weight, iterate self._matchers and
when matcher(request) is True immediately return matcher.weight if the matcher
is an HttpRequestRegexMatcher with a non-None weight, otherwise return 1 for
that first matching matcher; this ensures get_weight respects the first-match
policy used by matches.

---

Duplicate comments:
In `@airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py`:
- Around line 4390-4395: The code resolves model.weight with
InterpolatedString.create(..., parameters={}) which prevents matcher parameters
from being interpolated; update the call that creates/evaluates weight (the
block handling weight from model.weight using InterpolatedString.create and
.eval) to pass the real matcher parameters object instead of an empty dict
(e.g., parameters=matcher_parameters or the existing parameters variable used
elsewhere in this factory) so {{ parameters[...] }} templates are resolved.

In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 169-182: The matcher constructor currently accepts weight values
<= 0 which leads to unexpected semantics (0 treated as free, negatives as
budget-restoring); in the __init__ where the weight parameter is assigned to
self._weight validate that if weight is not None it must be > 0 and raise a
ValueError otherwise, updating the error message to mention the matcher (weight)
argument; ensure the same validation logic is applied where RequestMatcher (or
the specific matcher class in call_rate.py) is instantiated so direct Python
usage is rejected consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3e015976-ee05-4fb2-a685-e9072639b661

📥 Commits

Reviewing files that changed from the base of the PR and between 1b8450d and 07c11b0.

📒 Files selected for processing (5)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
  • airbyte_cdk/sources/streams/call_rate.py
  • unit_tests/sources/streams/test_call_rate.py
✅ Files skipped from review due to trivial changes (1)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 27, 2026

PyTest Results (Full)

3 948 tests  +11   3 936 ✅ +11   11m 15s ⏱️ +26s
    1 suites ± 0      12 💤 ± 0 
    1 files   ± 0       0 ❌ ± 0 

Results for commit bf65b0d. ± Comparison against base commit acafc75.

♻️ This comment has been updated with latest results.

@devin-ai-integration devin-ai-integration bot changed the title feat: add cost-based rate limiting support to HttpRequestRegexMatcher feat: add weight-based rate limiting support to HttpRequestRegexMatcher Mar 27, 2026
devin-ai-integration bot and others added 3 commits March 27, 2026 12:27
Address CodeRabbit review: get_weight() now stops at the first matcher
that matches the request, consistent with matches() behavior. If the
first matching matcher has a weight, use it; otherwise return 1.

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Reject non-positive weight values in the constructor to prevent
weight=0 (free requests) or negative weights (budget restoration)
from bypassing rate limits. Adds two tests for validation.

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Validates weight >= 1 after interpolation/casting in the factory,
catching invalid weights from manifest config before they reach
the HttpRequestRegexMatcher constructor.

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
airbyte_cdk/sources/streams/call_rate.py (1)

281-295: Could we make weight lookup go through the matcher abstraction here, wdyt?

This branch only honors HttpRequestRegexMatcher, so the convenience HttpRequestMatcher in this module still can't participate in weighted charging even though it delegates to a regex matcher underneath. The declarative path already forwards weight into HttpRequestRegexMatcher in airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:4388-4407; exposing weight on RequestMatcher/HttpRequestMatcher would keep both entry points aligned.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/streams/call_rate.py` around lines 281 - 295, The
get_weight function currently special-cases HttpRequestRegexMatcher; instead,
ask matchers for their weight via the matcher abstraction: update get_weight to,
for each matcher in self._matchers, call matcher(request) and then obtain the
weight by first trying a dedicated method/property on the matcher (e.g., call
matcher.get_weight(request) if present or read matcher.weight via getattr),
falling back to 1 when no weight is provided; reference the _matchers list,
get_weight, and the RequestMatcher/HttpRequestMatcher/HttpRequestRegexMatcher
types so the implementation aligns with both declarative and programmatic
matchers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 628-629: Current loop blindly forwards computed weight into every
policy; add a pre-check to reject weights that can never be satisfied by
moving-window policies before sleeping/retrying. Specifically, when computing
weight via get_weight(request), if the policy is a MovingWindowCallRatePolicy
call a guard (e.g., policy.can_fit(weight) or policy.validate_weight(weight)) or
inspect the moving-window config (max tokens/limit) and immediately raise a
deterministic exception (ValueError or a dedicated ConfigError) when weight >
policy's maximum capacity instead of calling try_acquire; keep behavior for
FixedWindowCallRatePolicy unchanged (it already fails fast).

---

Nitpick comments:
In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 281-295: The get_weight function currently special-cases
HttpRequestRegexMatcher; instead, ask matchers for their weight via the matcher
abstraction: update get_weight to, for each matcher in self._matchers, call
matcher(request) and then obtain the weight by first trying a dedicated
method/property on the matcher (e.g., call matcher.get_weight(request) if
present or read matcher.weight via getattr), falling back to 1 when no weight is
provided; reference the _matchers list, get_weight, and the
RequestMatcher/HttpRequestMatcher/HttpRequestRegexMatcher types so the
implementation aligns with both declarative and programmatic matchers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9af6f1e9-f670-4292-b0e2-c3f4710aa295

📥 Commits

Reviewing files that changed from the base of the PR and between 9bdf9db and fea3ca1.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/streams/call_rate.py
  • unit_tests/sources/streams/test_call_rate.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/streams/test_call_rate.py

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)

4390-4397: ⚠️ Potential issue | 🟠 Major

Could we pass the matcher interpolation parameters into weight here, wdyt?

Line 4393 still builds InterpolatedString with parameters={}, so weights like {{ parameters.request_weight }} won't resolve even though string weights are supported. That leaves the new interpolation path broken for matcher/parent-parameter use cases.

💡 Possible patch
         weight = model.weight
         if weight is not None:
+            interpolation_parameters = model.parameters or kwargs.get("parameters", {}) or {}
             if isinstance(weight, str):
-                weight = int(InterpolatedString.create(weight, parameters={}).eval(config))
+                weight = int(
+                    InterpolatedString.create(
+                        weight,
+                        parameters=interpolation_parameters,
+                    ).eval(config=config)
+                )
             else:
                 weight = int(weight)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py` around
lines 4390 - 4397, The string-weight branch currently builds InterpolatedString
with parameters={} so expressions like "{{ parameters.request_weight }}" won't
resolve; change the InterpolatedString.create call for model.weight to pass the
actual interpolation parameters (e.g., parameters) instead of an empty dict and
then eval against config as before — update the InterpolatedString.create(...)
call that constructs weight to use parameters=parameters (or the correct
surrounding parameter variable) so matcher/parent-parameter weights interpolate
correctly.
🧹 Nitpick comments (1)
airbyte_cdk/sources/streams/call_rate.py (1)

633-634: Could we resolve weight once before the retry loop, wdyt?

Line 633 recomputes the matcher-derived weight on every retry. Since the request and selected policy are fixed for a single _do_acquire() call, moving this lookup above the loop would avoid repeated matcher scans on a hot path.

💡 Possible patch
         last_exception = None
         endpoint = self._extract_endpoint(request)
+        weight = policy.get_weight(request) if isinstance(policy, BaseCallRatePolicy) else 1
         # sometimes we spend all budget before a second attempt, so we have a few more attempts
         for attempt in range(1, self._maximum_attempts_to_acquire):
             try:
-                weight = policy.get_weight(request) if isinstance(policy, BaseCallRatePolicy) else 1
                 policy.try_acquire(request, weight=weight)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@airbyte_cdk/sources/streams/call_rate.py` around lines 633 - 634, In
_do_acquire (call_rate.py) compute the weight once before entering the retry
loop instead of inside the loop: determine weight = policy.get_weight(request)
if isinstance(policy, BaseCallRatePolicy) else 1 prior to retrying, then call
policy.try_acquire(request, weight=weight) on each attempt; this avoids repeated
matcher scans while keeping the same policy and request semantics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py`:
- Around line 4390-4397: The string-weight branch currently builds
InterpolatedString with parameters={} so expressions like "{{
parameters.request_weight }}" won't resolve; change the
InterpolatedString.create call for model.weight to pass the actual interpolation
parameters (e.g., parameters) instead of an empty dict and then eval against
config as before — update the InterpolatedString.create(...) call that
constructs weight to use parameters=parameters (or the correct surrounding
parameter variable) so matcher/parent-parameter weights interpolate correctly.

---

Nitpick comments:
In `@airbyte_cdk/sources/streams/call_rate.py`:
- Around line 633-634: In _do_acquire (call_rate.py) compute the weight once
before entering the retry loop instead of inside the loop: determine weight =
policy.get_weight(request) if isinstance(policy, BaseCallRatePolicy) else 1
prior to retrying, then call policy.try_acquire(request, weight=weight) on each
attempt; this avoids repeated matcher scans while keeping the same policy and
request semantics.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8aa79a64-0c99-4294-941e-d749bbce9ff6

📥 Commits

Reviewing files that changed from the base of the PR and between fea3ca1 and d8686b1.

📒 Files selected for processing (3)
  • airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
  • airbyte_cdk/sources/streams/call_rate.py
  • unit_tests/sources/streams/test_call_rate.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • unit_tests/sources/streams/test_call_rate.py

…Policy

Co-Authored-By: Daryna Ishchenko <darina.ishchenko17@gmail.com>
@darynaishchenko Daryna Ishchenko (darynaishchenko) merged commit 007066b into main Mar 30, 2026
27 of 28 checks passed
@darynaishchenko Daryna Ishchenko (darynaishchenko) deleted the devin/1774609130-cost-based-rate-limiting branch March 30, 2026 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants