Commit c2d732b

Ubuntu committed
Add changes from usestrix/strix PR usestrix#334
1 parent 5102b64 commit c2d732b

4 files changed

Lines changed: 744 additions & 0 deletions

Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
---
name: edge-cases
description: Edge case testing for caching races, partial failures, boundary conditions, and eventual consistency exploitation
category: vulnerabilities
tags: [caching, race-condition, partial-failure, edge-case]
cwe: 362
---

# Edge Cases

Edge case vulnerabilities arise at system boundaries: cache coherence gaps, partial failure states, retry storms, and consistency windows between distributed components. These bugs rarely appear in unit tests and require adversarial timing, ordering, and failure injection to surface.

## Attack Surface

**Caching Layers**
- CDN / reverse proxy (Cloudflare, Fastly, Varnish, nginx)
- Application cache (Redis, Memcached, in-process)
- Database query cache
- DNS cache and TTL manipulation
- Browser and service worker caches

**Distributed State**
- Eventual consistency windows between replicas
- Cross-region replication lag
- Message queue delivery guarantees (at-least-once, at-most-once)
- Saga/compensation patterns in microservices

**Failure Boundaries**
- Partial success in multi-step operations
- Timeout and retry behavior
- Circuit breaker states (closed, open, half-open)
- Graceful degradation and fallback paths

**Boundary Conditions**
- Integer overflow/underflow at limits
- Pagination cursors at collection boundaries
- Time zone transitions, DST, leap seconds
- Unicode normalization and encoding edge cases

## High-Value Targets

- Authenticated CDN content served from shared cache without identity keys
- Payment flows with partial capture/refund states
- Inventory systems with reservation and release logic
- Session stores with replication lag between regions
- Rate limiters using distributed counters
- Background job queues with retry and dead-letter handling
- Search indexes with delayed consistency from primary stores

## Reconnaissance

### Cache Behavior Mapping

- Identify caching headers: Cache-Control, Vary, ETag, Age, X-Cache, X-Cache-Hit, CF-Cache-Status
- Determine cache key composition: what headers, cookies, and query parameters are included
- Test Vary header completeness: does it include Authorization, Cookie, Accept-Language?
- Check for cache partitioning: do authenticated and unauthenticated requests share cache entries?
- Map TTL values and revalidation behavior (stale-while-revalidate, stale-if-error)
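
The header checks above can be roughed out as a small triage helper. This is a heuristic sketch only, run against headers you have already captured: the function names (`parse_cache_control`, `cache_isolation_findings`) and the finding strings are hypothetical, and real CDNs may key or store responses differently from what their headers suggest.

```python
from __future__ import annotations


def parse_cache_control(value: str) -> dict[str, str | None]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: dict[str, str | None] = {}
    for part in value.split(","):
        name, _, arg = part.strip().lower().partition("=")
        if name:
            directives[name.strip()] = arg.strip().strip('"') or None
    return directives


def _seconds(value: str | None) -> int:
    """Parse a delta-seconds argument; missing or malformed values count as 0."""
    return int(value) if value and value.isdigit() else 0


def cache_isolation_findings(headers: dict[str, str], authenticated: bool) -> list[str]:
    """Return human-readable findings for one observed response (heuristic)."""
    h = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(h.get("cache-control", ""))
    vary = {v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()}
    findings: list[str] = []
    if not authenticated:
        return findings

    shared_cacheable = (
        "no-store" not in cc
        and "private" not in cc
        and ("public" in cc or _seconds(cc.get("s-maxage")) > 0 or _seconds(cc.get("max-age")) > 0)
    )
    if shared_cacheable:
        findings.append("authenticated response is storable by shared caches")
        if "*" not in vary and not vary & {"authorization", "cookie"}:
            findings.append("Vary omits Authorization and Cookie")

    # Hit indicators vary by vendor; only two common ones are checked here.
    hit = h.get("cf-cache-status", "").upper() == "HIT" or "hit" in h.get("x-cache", "").lower()
    if hit:
        findings.append("authenticated request was answered from cache")
    return findings
```

A finding from this helper is a lead to verify manually (for example by replaying the request with a second identity), not proof of cross-user serving.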

### Consistency Model Discovery

- Identify which data stores use eventual consistency vs strong consistency
- Map replication topology: primary/replica, multi-region, active-active
- Determine read-after-write guarantees per endpoint
- Check if reads are pinned to the write region or load-balanced across replicas
- Look for consistency-related headers or query parameters (consistency=strong, read_preference)

### Failure Mode Enumeration

- Identify multi-step operations and their atomicity guarantees
- Map retry policies: fixed, exponential backoff, jitter, max attempts
- Check for idempotency key support and scope
- Identify circuit breaker implementations and their state thresholds
- Look for graceful degradation paths that weaken security controls

## Key Vulnerabilities

### Cache Poisoning Races

- **TOCTOU on CDN**: Inject a poisoned response (e.g., admin=true, elevated role) into the cache during the window between authentication check and response caching; subsequent users receive the poisoned cached response
- **Cache key confusion**: Exploit differences in how the cache and origin parse URLs, headers, or query parameters to serve one user's cached response to another
- **Vary header omission**: Origin returns user-specific content but Vary header does not include Authorization or Cookie; CDN caches and serves across identities
- **Web cache deception**: Trick caching layer into storing authenticated response at a public path (e.g., /account/profile.css) by appending cacheable extensions
- **Cache parameter cloaking**: Exploit parsing differences so the cache leaves a parameter out of its key while the origin still acts on it; more generally, use unkeyed query parameters, headers, or cookies to influence response content while the cache key remains identical
- **Host header poisoning**: Inject alternate Host header values to generate cached responses with attacker-controlled links or redirects
- **Response splitting**: Inject headers that cause the cache to store a crafted response for a different URL
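
To make the web cache deception and key-confusion items concrete, here is a toy in-memory model. It is a sketch under stated assumptions, not any real CDN's behavior: `ToyOrigin`, `ToyCache`, and the `key_includes_identity` switch are hypothetical names, the origin is assumed to route anything under `/account` to the profile handler, and the cache is assumed to store any path with a static-looking extension.

```python
from __future__ import annotations


class ToyOrigin:
    """Origin with lenient routing: anything under /account renders the caller's profile."""

    def handle(self, path: str, user: str) -> str:
        if path.startswith("/account"):
            return f"profile of {user}"
        return "public page"


class ToyCache:
    """Shared cache that treats a static-looking extension as safe to store."""

    STATIC_EXTENSIONS = (".css", ".js", ".png")

    def __init__(self, origin: ToyOrigin, key_includes_identity: bool = False) -> None:
        self.origin = origin
        self.key_includes_identity = key_includes_identity
        self.store: dict[tuple[str, ...], str] = {}

    def get(self, path: str, user: str) -> tuple[str, str]:
        # The cache key is the path alone unless identity is explicitly keyed.
        key = (path, user) if self.key_includes_identity else (path,)
        if key in self.store:
            return self.store[key], "HIT"
        body = self.origin.handle(path, user)
        if path.endswith(self.STATIC_EXTENSIONS):
            self.store[key] = body
        return body, "MISS"


cache = ToyCache(ToyOrigin())
victim = cache.get("/account/profile.css", "alice")      # victim follows the crafted link
attacker = cache.get("/account/profile.css", "mallory")  # attacker fetches the same path
```

In this model the attacker's request is answered from the entry stored for the victim because the cache and origin disagree about whether the path is static; constructing the cache with `key_includes_identity=True` removes the cross-user hit.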

### Partial Failure Exploitation

- **Half-committed transactions**: In multi-service workflows (payment + inventory + notification), one service commits while another fails; exploit the inconsistent state before compensation runs
- **Orphaned resources**: Failed creation leaves allocated resources (IDs, reservations, storage objects) that can be claimed or referenced
- **Retry amplification**: Trigger timeouts to force retries that cause duplicate side effects (double charges, double credits, duplicate emails)
- **Compensation race**: Execute the compensation/rollback path before the original operation completes, leaving the system in a state that allows both the original action and its reversal to succeed
- **Dead letter exploitation**: Messages in dead-letter queues may be reprocessed with stale context, outdated authorization, or bypassed validation
- **Partial batch results**: Batch operations returning mixed success/failure per item; exploit items that succeeded before the batch was rolled back
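
Retry amplification is easiest to reason about with a simulation in which the side effect commits but the response is lost. The sketch below is illustrative only: `FlakyPaymentService`, `charge_with_retries`, and the idempotency-key handling are hypothetical stand-ins, not any payment provider's API.

```python
from __future__ import annotations


class FlakyPaymentService:
    """Commits the charge, then loses the response for the first N calls."""

    def __init__(self, lost_responses: int) -> None:
        self.charges: list[tuple[str, int]] = []
        self._by_key: dict[str, int] = {}
        self._lost_responses = lost_responses

    def charge(self, account: str, amount: int, idempotency_key: str | None = None) -> int:
        # A replayed key returns the earlier result without a new side effect.
        if idempotency_key is not None and idempotency_key in self._by_key:
            return self._by_key[idempotency_key]
        self.charges.append((account, amount))
        charge_id = len(self.charges)
        if idempotency_key is not None:
            self._by_key[idempotency_key] = charge_id
        if self._lost_responses > 0:
            self._lost_responses -= 1
            raise TimeoutError("response lost after commit")
        return charge_id


def charge_with_retries(service: FlakyPaymentService, account: str, amount: int,
                        attempts: int = 3, idempotency_key: str | None = None) -> int:
    """Naive client: retry on timeout, reusing the same key if one was given."""
    for _ in range(attempts):
        try:
            return service.charge(account, amount, idempotency_key)
        except TimeoutError:
            continue
    raise RuntimeError("gave up after retries")


unsafe = FlakyPaymentService(lost_responses=1)
charge_with_retries(unsafe, "acct-1", 500)
safe = FlakyPaymentService(lost_responses=1)
charge_with_retries(safe, "acct-1", 500, idempotency_key="order-42")
```

Here `unsafe.charges` ends up with two entries for a single intended payment, while `safe.charges` holds one. When testing a real target, also check the key's scope: a key honored per endpoint but not per account, or one that expires quickly, can leave the same gap.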

### Eventual Consistency Windows

- **Read-your-writes violation**: Write to primary (e.g., revoke permission), immediately read from replica that has not replicated yet; stale read allows continued access
- **Cross-region stale reads**: In multi-region deployments, act in a lagging region before a security-critical write propagates (role revocation, account disable, password change)
- **Search index lag**: Item deleted or access revoked but still discoverable and accessible via search or listing endpoints backed by a delayed index
- **Counter drift**: Distributed rate limit counters or quota trackers that diverge across nodes; burst requests across multiple nodes before counters converge
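
The read-your-writes case can be modeled with a primary that keeps a write log and a replica that applies it only when told to. This is a minimal sketch with hypothetical names (`Primary`, `Replica`, `catch_up`); real replication is asynchronous and continuous rather than an explicit call.

```python
from __future__ import annotations


class Primary:
    """Accepts writes and records them in an ordered log."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.log: list[tuple[str, str]] = []

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Serves reads from its own copy, which lags until catch_up() runs."""

    def __init__(self, primary: Primary) -> None:
        self.primary = primary
        self.data: dict[str, str] = {}
        self.applied = 0  # number of log entries already applied

    def catch_up(self) -> None:
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

    def read(self, key: str) -> str | None:
        return self.data.get(key)


primary = Primary()
replica = Replica(primary)
primary.write("alice:role", "admin")
replica.catch_up()

primary.write("alice:role", "revoked")       # security-critical write
stale = replica.read("alice:role")           # read served before replication
replica.catch_up()
fresh = replica.read("alice:role")
```

Between the revocation and the next `catch_up()`, an authorization check backed by the replica still sees the old role. Against a live system, the equivalent measurement is the time between the write being acknowledged and the first read that reflects it on each read path.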

### Boundary Condition Abuse

- **Integer boundaries**: Quantity, price, or balance fields at INT_MAX/INT_MIN; overflow to negative or zero
- **Pagination edge cases**: Cursor-based pagination allowing access to items beyond authorization scope when cursor encodes raw IDs; off-by-one at page boundaries exposing extra records
- **Time boundary exploitation**: Exploit midnight UTC rollovers, DST transitions, or month-end boundaries where time-based access controls, quotas, or rate limits reset
- **Encoding differentials**: Unicode normalization (NFC vs NFD), case folding, and homoglyph abuse causing different systems to interpret the same identifier differently (e.g., user lookup vs permission check)
- **Floating point boundaries**: Currency calculations at precision limits producing rounding errors that accumulate across transactions
- **Empty and null states**: Empty arrays, null values, missing fields, and zero-length strings bypassing validation that only checks for presence
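
Three of the boundary conditions above can be reproduced with the Python standard library. The helper names (`wrap_int32`, `canonical_username`) are hypothetical, and the 32-bit width is an assumption about the backend type, which may differ on a given target.

```python
import unicodedata
from decimal import Decimal

INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Emulate signed 32-bit two's-complement wraparound (Python ints do not overflow)."""
    return (value + 2**31) % 2**32 - 2**31


def canonical_username(name: str) -> str:
    """One possible canonical form: compatibility-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", name).casefold()


# Integer boundary: one past INT32_MAX wraps to the most negative value.
overflowed = wrap_int32(INT32_MAX + 1)

# Encoding differential: the same visible name in composed and decomposed form.
composed = "Jos\u00e9"      # precomposed e-acute
decomposed = "Jose\u0301"   # 'e' followed by a combining acute accent
raw_equal = composed == decomposed
canonical_equal = canonical_username(composed) == canonical_username(decomposed)

# Floating point boundary: binary floats vs exact decimal arithmetic.
float_exact = (0.1 + 0.2) == 0.3
decimal_exact = (Decimal("0.10") + Decimal("0.20")) == Decimal("0.30")
```

Here `overflowed` is -2147483648, `raw_equal` is False while `canonical_equal` is True, and `float_exact` is False while `decimal_exact` is True. The security-relevant case is when two components disagree, for example a lookup that canonicalizes identifiers and a permission check that compares raw strings.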

### Graceful Degradation Weaknesses

- **Fallback path bypass**: When a dependency (auth service, rate limiter, WAF) is unavailable, the fallback allows requests through without full validation
- **Circuit breaker open state**: While the circuit breaker is open, requests may be routed to a degraded path that skips authorization or logging
- **Feature flag defaults**: Feature flags defaulting to enabled when the flag service is unreachable, exposing gated functionality
- **Cache stampede**: Force cache expiry on a hot key; the thundering herd of requests to the origin may overwhelm the backend and trigger degraded responses
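
The difference between a fail-open and a fail-closed fallback fits in a few lines. The sketch below uses hypothetical names (`DependencyUnavailable`, `guarded`, `healthy_check`, `broken_check`) and stands in for whatever wrapper a real service places around its authorization dependency.

```python
from typing import Callable


class DependencyUnavailable(Exception):
    """Raised when the authorization dependency cannot be reached."""


def guarded(check: Callable[[str, str], bool], *, fail_open: bool) -> Callable[[str, str], bool]:
    """Wrap an authorization check with an explicit policy for dependency failure."""

    def decide(user: str, action: str) -> bool:
        try:
            return check(user, action)
        except DependencyUnavailable:
            # fail_open=True lets the request through; False denies it.
            return fail_open

    return decide


def healthy_check(user: str, action: str) -> bool:
    return user == "admin"


def broken_check(user: str, action: str) -> bool:
    raise DependencyUnavailable("auth service timed out")


fail_open_result = guarded(broken_check, fail_open=True)("guest", "delete")
fail_closed_result = guarded(broken_check, fail_open=False)("guest", "delete")
```

With the dependency down, the fail-open wrapper authorizes a request that the healthy path would deny, which is the behavior to look for when forcing degradation. The fail-closed wrapper denies it and corresponds to the explicit fail-closed design that should not be reported as a finding.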

### Stale State and Revocation Gaps

- **Token revocation lag**: Access tokens remain valid until expiry after a revocation event, especially long-lived tokens that are validated without a revocation check
- **Permission cache staleness**: Role or permission changes not reflected until cache TTL expires; act within the stale window
- **DNS rebinding**: Serve a short-TTL DNS record for a hostname that passes validation, then re-point it to an internal IP after the initial security check
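
The stale window left by a permission cache can be measured deterministically with an injected clock. This is a sketch under assumptions: `FakeClock`, `CachedPermissions`, and `stale_window` are hypothetical names, and the 30-second TTL is an arbitrary example value.

```python
from __future__ import annotations


class FakeClock:
    """Manually advanced clock so the example does not depend on wall time."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class CachedPermissions:
    """Caches allow/deny decisions from a source of truth for ttl_seconds."""

    def __init__(self, source: dict[str, bool], ttl_seconds: float, clock: FakeClock) -> None:
        self.source = source
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: dict[str, tuple[bool, float]] = {}

    def is_allowed(self, user: str) -> bool:
        now = self.clock()
        cached = self._entries.get(user)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]  # possibly stale decision
        allowed = self.source.get(user, False)
        self._entries[user] = (allowed, now)
        return allowed


def stale_window(perms: CachedPermissions, clock: FakeClock, user: str,
                 step: float = 1.0, limit: float = 3600.0) -> float | None:
    """Advance the clock until access is denied; return elapsed seconds or None."""
    elapsed = 0.0
    while elapsed <= limit:
        if not perms.is_allowed(user):
            return elapsed
        clock.advance(step)
        elapsed += step
    return None


source = {"alice": True}
clock = FakeClock()
perms = CachedPermissions(source, ttl_seconds=30, clock=clock)
perms.is_allowed("alice")   # warms the cache while access is still granted
source["alice"] = False     # revocation at the source of truth
window = stale_window(perms, clock, "alice")
```

In this model `window` equals the TTL (30.0 seconds) because the entry was cached immediately before revocation. Against a real system the same loop would poll an endpoint with real time, and the result should be reported per layer (token, permission cache, CDN).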

## Bypass Techniques

- Timing manipulation: slow down requests (large payloads, keep-alive abuse) to widen race windows
- Regional routing: target specific regions or replicas known to lag behind the primary
- Header injection to influence cache behavior (X-Forwarded-Host, X-Original-URL)
- Trigger dependency failures (connection exhaustion, timeout injection) to force degraded paths
- Replay stale pagination cursors or continuation tokens after access revocation

## Testing Methodology

1. **Map caching layers** - Identify all caches (CDN, app, DB), their key composition, TTLs, and Vary headers
2. **Test cache isolation** - Verify authenticated content is not served cross-user; strip cookies, swap tokens, check ETags
3. **Probe consistency** - Write then immediately read from different paths/regions; measure replication lag
4. **Inject failures** - Simulate partial failures in multi-step operations; check for orphaned or inconsistent state
5. **Test boundaries** - Exercise integer limits, pagination edges, time boundaries, and encoding variants
6. **Force degradation** - Exhaust dependencies to trigger fallback paths; verify security controls remain enforced
7. **Measure revocation** - Change permissions/roles and measure how long stale access persists across all layers

## Validation

1. Show cross-user cache serving: two different authenticated users receiving each other's cached responses
2. Demonstrate partial failure leaving exploitable state (e.g., payment captured but order not created, allowing re-order)
3. Prove stale read after security-critical write (permission revocation still allowing access via replica)
4. Show boundary condition causing invariant violation (integer overflow, pagination leak, time-boundary quota reset)
5. Demonstrate degraded path bypassing security control that is enforced in the normal path
6. All findings must show durable state change or information disclosure, not just transient anomalies

## False Positives

- Intentional stale-while-revalidate behavior documented in architecture with acceptable staleness window
- Eventual consistency windows within documented SLA that do not affect security-critical state
- Cache serving public content (truly non-personalized) to multiple users as designed
- Graceful degradation with explicit fail-closed behavior on security-critical paths
- Pagination showing slightly stale counts due to known replica lag without access control implications

## Impact

- Cross-user data exposure via cache poisoning or confusion
- Financial loss from partial failure exploitation (double-spend, orphaned charges)
- Unauthorized access during consistency windows after revocation events
- Denial of service via cache stampede or retry storm amplification
- Policy bypass when security controls degrade under failure conditions

## Pro Tips

1. Cache bugs are most impactful on CDNs; start by mapping cache key composition and Vary headers
2. For consistency bugs, identify the replication topology first; then target the lagging component
3. Partial failures are easiest to trigger in payment and inventory flows; these have the highest business impact
4. Test revocation effectiveness by measuring the actual window between revocation and enforcement
5. Degrade one dependency at a time and check if security controls still hold
6. Integer boundary bugs are often in quantity, price, and balance fields; try INT_MAX, INT_MIN, 0, -1, and values that overflow the field's type
7. Time-boundary bugs cluster around midnight UTC, month-end, and DST transitions
8. Cache deception works best when the origin and CDN disagree on what constitutes a static resource
9. Use correlation IDs and timestamps in all test requests to prove ordering and causality
10. Document the exact timing window required to reproduce; edge cases must be repeatable to be actionable

## Summary

Edge cases exploit the gaps between components that each work correctly in isolation but fail under adversarial timing, ordering, or partial failure. Security must hold at every cache boundary, consistency window, failure mode, and numeric limit in the system.
