Fix/manual captcha bootstrap#33

Open
truvvor wants to merge 120 commits into
nullcstring:main from
truvvor:fix/manual-captcha-bootstrap

Conversation

@truvvor truvvor commented Jun 9, 2026

No description provided.

claude and others added 30 commits May 9, 2026 20:36
Builds TurnBridge on the self-hosted macOS/ARM64 runner with automatic
signing driven by an App Store Connect API key, exports an app-store IPA,
and uploads it to TestFlight via xcrun altool.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
The previous filter caused empty re-trigger commits to be skipped because
no path matched the filter at all.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Pipes archive/export output through tee so we can dump the tail of the
log inline on failure and always publish the full log as an artifact for
post-mortem on failed runs.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
…nbridge

Switches to a unique bundle identifier so the Apple Developer Team can
register it. With -allowProvisioningUpdates the archive step now lets
Xcode auto-create the bundle IDs, App Group (group.com.truvvor.turnbridge)
and provisioning profiles in the developer portal.

Updates:
- PRODUCT_BUNDLE_IDENTIFIER for both targets in pbxproj
- App Group entry in both .entitlements files
- Hard-coded fallbacks/log subsystems in Swift
- env defaults in the TestFlight workflow

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Self-hosted runner hits errSecInternalComponent during CodeSign because
the login keychain is locked / the API-issued cert lacks codesign in its
partition list. Add a step that, given MAC_KEYCHAIN_PASSWORD, unlocks the
keychain and grants apple-tool/apple/codesign access to private keys.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
- Unlock the keychain before any other security op so a locked state can
  no longer fail the whole step.
- Treat set-key-partition-list and set-keychain-settings as non-fatal so
  benign warnings on a fresh keychain don't abort the build.
- Append the login keychain to the user search list instead of replacing
  it (System.keychain etc. stay in the list this way).
- Surface per-command rc and a clearer error message when the password
  truly does not match the keychain.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
…xport signing

Cloud Managed App Distribution isn't available on the team's API key, so
xcodebuild -exportArchive with automatic signing fails to create an App
Store profile. Switch to manual signing for export and create the
underlying assets via plain ASC API calls (which work with App Manager
role) before each build:

  * script/ci_setup_signing.rb
      - reuses or creates an iOS Distribution cert via /v1/certificates,
        importing the resulting p12 into the login keychain;
      - reuses or creates IOS_APP_STORE provisioning profiles for the
        main app and extension via /v1/profiles, dropping their
        .mobileprovision into ~/Library/MobileDevice/Provisioning Profiles;
      - publishes profile names to GITHUB_ENV.

  * workflow
      - new step "Ensure Distribution cert and App Store profiles" runs
        the script before resolving SPM deps;
      - exportOptions.plist now uses signingStyle=manual with the profile
        names from the script and signingCertificate=Apple Distribution;
      - Export IPA no longer needs -allowProvisioningUpdates / API key
        flags since signing assets are pre-provisioned.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
System Ruby on the runner is 2.6 and 'gem install fastlane' fails (its
domain_name dep requires Ruby >= 2.7), so the script is now pure Ruby
stdlib + OpenSSL: hand-built ES256 JWT, Net::HTTP calls to
api.appstoreconnect.apple.com. No external gems, ~30 seconds instead of
10+ minutes.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
…xport

Two issues caused the export step to pick the wrong signing cert:
1. The user already had a working "Apple Distribution: ..." cert in
   keychain. The previous script ignored it and created an extra cert,
   leaving keychain with two same-named certs and the App Store profile
   bound only to the new one.
2. exportOptions.plist used signingCertificate="Apple Distribution",
   which is ambiguous when two such certs coexist.

Fixes:
- ci_setup_signing.rb now finds, by SHA-1, an existing keychain
  Distribution cert that matches an ASC API cert (preferring the latest
  expiration) and uses it. New cert creation only happens if there is
  no overlap between keychain and API.
- The chosen cert's SHA-1 is exported to GITHUB_ENV as SIGNING_CERT_SHA1.
- exportOptions.plist now references that SHA-1 so xcodebuild always
  picks the cert the profile is bound to.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
…he prompt

WireGuard relies on standard mass-market encryption (ChaCha20-Poly1305,
BLAKE2s, Curve25519), which qualifies for the US BIS 5D992 exemption.
Self-classifying it via Info.plist means App Store Connect won't ask the
export-compliance question for every uploaded build.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
When auto-solver gets BOT-flagged by VK there is no way to recover —
checkbox returns BOT, slider returns ERROR and the tunnel never gets
TURN credentials. Add a Settings toggle ("Solve captcha manually") that,
when enabled, bypasses the in-Go auto solver entirely and lets the user
complete the official VK NotRobot challenge in a WKWebView sheet in the
main app.

Plumbing:

* WireGuardKitGo / captcha_manual.go
    - exports TurnBridgeSetManualCaptchaCallback,
      TurnBridgeSubmitManualCaptchaToken,
      TurnBridgeCancelManualCaptcha,
      TurnBridgeSetManualCaptchaMode
    - requestManualCaptcha blocks on a Go channel until the Swift side
      delivers a success_token or cancels / times out
* WireGuardKitGo / vk_captcha.go
    - if manual mode is on, solveVkCaptcha skips PoW/checkbox/slider and
      calls requestManualCaptcha directly
* WireGuardKitGo / wireguard.h
    - declares the new exported functions for the Swift bridging module
* network-extension / CaptchaBridge.swift (new) and PacketTunnelProvider.swift
    - installs the cgo callback, persists pending requests to the App
      Group UserDefaults, fires a Darwin notification, routes app
      responses delivered via handleAppMessage back into Go;
    - reads the toggle from the App Group and configures the Go mode at
      tunnel start
* TurnBridge / ManualCaptchaSetting.swift, GlobalSettingsView.swift
    - new "Solve captcha manually" Toggle, persisted to the shared
      UserDefaults so the extension can pick it up
* TurnBridge / CaptchaManager.swift, CaptchaWebView.swift,
  TurnBridgeApp.swift, CaptchaIPC.swift (new)
    - observes the Darwin notification, surfaces a SwiftUI sheet with a
      WKWebView pointed at the captcha redirect_uri, injects JS that
      watches fetch/XHR/postMessage/URL for success_token, then ships
      the result back to the extension via
      NETunnelProviderSession.sendProviderMessage.
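
A minimal Go sketch of the "block on a channel until the app answers" step described above. Only the blocking shape comes from the commit message; `submitToken`, `cancelCaptcha`, `waitManualCaptcha`, the error values and the 2 s default are hypothetical stand-ins, and the real cgo exports may differ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors; the real code may report these differently.
var (
	errCaptchaCancelled = errors.New("manual captcha cancelled")
	errCaptchaTimeout   = errors.New("manual captcha timed out")
)

// defaultTimeout is an assumed budget for this demo, not a value from the PR.
const defaultTimeout = 2 * time.Second

// manualResult is what the app side hands back: a token or a cancel.
type manualResult struct {
	token     string
	cancelled bool
}

// pending holds at most one result; the buffer keeps the sender from blocking.
var pending = make(chan manualResult, 1)

// deliver queues a result without blocking; duplicates are dropped.
func deliver(r manualResult) bool {
	select {
	case pending <- r:
		return true
	default:
		return false
	}
}

// submitToken and cancelCaptcha stand in for the exported entry points.
func submitToken(tok string) bool { return deliver(manualResult{token: tok}) }
func cancelCaptcha() bool         { return deliver(manualResult{cancelled: true}) }

// waitManualCaptcha blocks until a token, a cancel, or the timeout.
func waitManualCaptcha(timeout time.Duration) (string, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case r := <-pending:
		if r.cancelled {
			return "", errCaptchaCancelled
		}
		return r.token, nil
	case <-timer.C:
		return "", errCaptchaTimeout
	}
}

func main() {
	go submitToken("example-token")
	tok, err := waitManualCaptcha(defaultTimeout)
	fmt.Println(tok, err)
}
```

The one-slot buffer is a design choice for the sketch: a late submit after a timeout is parked rather than blocking the caller that crossed the cgo boundary.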

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
@Published and @ObservedObject require Combine. Without the explicit
import, the Swift frontend refused to type-check
TurnBridge/CaptchaManager.swift and CaptchaWebView.swift on iOS 26.4
SDK. Also tighten the UIApplication.didBecomeActive observer to capture
self locally before crossing into the Task.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
The first manual-captcha test showed an empty sheet. Adding explicit
.frame(maxWidth:maxHeight:.infinity) on the UIViewRepresentable + an
initial UIScreen-sized backing frame keeps WKWebView from collapsing
to zero in the SwiftUI layout. Navigation delegate now reports
provisional / finish / failure events into both SharedLogger and the
status banner, so a blank sheet at least tells us why.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
WKWebView in the host app cannot reach id.vk.ru while
includeAllNetworks is true and the tunnel is still in the Connecting
phase — iOS strictly blocks all egress, so the captcha challenge fails
with "internet appears to be offline" and the tunnel can never make
forward progress.

Trade kill-switch for captcha solvability when manual mode is on. Auto
mode keeps the previous behaviour. The Settings caption now explains
the trade-off.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
The user-driven captcha sheet routinely takes longer than 12s to
complete (page load + human interaction), but ProxyWaitReady was
hard-coded to a 12s deadline. The tunnel was killing itself mid-prompt
even after the WebView successfully returned a success_token.

When manual captcha mode is enabled, extend the DTLS readiness timeout
to 300s so the human has room to work. Auto mode keeps the original
budget so a stuck auto-solver still fails fast.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
The user-visible problem: after ~5 minutes the VK TURN allocation expires
on the server, but iOS / pion-turn / the proxy loop don't notice fast
enough. The UI still says Connected while WG packets fall into a black
hole because the inner DTLS channel is dead and nobody told anyone.

Six fixes, end-to-end:

1. **Active watchdog** (`turn_proxy.go`):
   Every 15s `oneDtlsConnection` checks when it last read a byte from
   `dtlsConn`. If silence > 60s it cancels `dtlsctx`, which causes the
   outer loop to spin a fresh DTLS connection on the same listenConn
   with cached TURN creds (no captcha).

2. **DTLS keepalive**: WireGuard already does this via
   `PersistentKeepalive=25`, so no extra writes from our side — the
   watchdog is what makes those keepalives actually count.

3. **`poolCreds` cache stays warm** (`turn_proxy.go`):
   `cTime` is now refreshed on every reuse, not just when the pool
   grows. A long-lived session can no longer evict its own credentials
   mid-flight and force a fresh captcha on the next reconnect.

4. **`RestartProxy` cgo export** (`turn_restart.go`, `wireguard.h`):
   New global registry of active `dtlsCancel` funcs; `RestartProxy()`
   iterates and cancels them all. Swift uses this from
   `NEPacketTunnelProvider.wake()`.

5. **`NWPathMonitor` in the extension** (`PacketTunnelProvider.swift`):
   Started after the WG adapter is up. On every path change after the
   initial one we call `restartTransport(...)`, which is debounced to
   5s so a wake + path-change burst only restarts once.

6. **Transport-health banner**:
   - `network-extension/TransportHealthMonitor.swift` watches the Go
     log stream from the existing logger callback and writes
     `transport.lastAliveAt` / `lastDeadAt` into the App Group
     UserDefaults.
   - `TurnBridge/TransportHealthBanner.swift` + `ContentView` poll
     every 5s and surface an orange "Connection unstable" strip when
     iOS still says NEVPNStatus=.connected but the TURN side has been
     silent or failed within the last 30s.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
LTE was firing `NWPath update: status=satisfied, via=cellular` every
~20s without actually changing anything (PDP-context refresh / tower
handover). The first build of this PR restarted DTLS on every one of
those, producing a ~1s outage every 20s on cellular.

Only restart when something observable changed:
- interface kind flipped (wifi ↔ cellular ↔ ethernet), or
- path was previously unsatisfied and is now satisfied.

Pure noise events are dropped silently. The 60s DTLS watchdog still
catches real transport death, so we don't lose the safety net.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
The previous archive job (60c6cba) failed in ~58s, which is consistent
with xcodebuild bailing during Swift compilation of
PacketTunnelProvider.swift before reaching link/archive. The only file
that changed in that commit was the NWPath handler. Without log access
I can't pin the exact diagnostic, so cover the most likely culprits
defensively:

- Replace `\(NWPath.Status)` interpolation with an explicit
  `describe(_:)` helper. Avoids relying on the type's auto-derived
  `description` (NWPath.Status is not declared CustomStringConvertible).
- Fully qualify case shorthand as `NWPath.Status.satisfied` in
  comparisons so type inference can't fail.
- Drop the Unicode `→` from a string literal — use ASCII `->`.
- Hoist `prevLabel ?? "?"` once so optionals don't appear in
  interpolation sites.

No behavior change.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
iOS 26.4 SDK exposes two `NWPath` types in scope when the file imports
both `Network` and `NetworkExtension`:

- Network.NWPath (Swift struct)
- NetworkExtension.NWPath (Obj-C class from NetworkExtension/NWPath.h)

Swift refused to compile `NWPath.Status` because the bare name is now
ambiguous. Fully qualify every type reference as `Network.NWPath.Status`.
The string literals `"NWPath ..."` are just log text, no change there.

Resolves: error: 'NWPath' is ambiguous for type lookup in this context

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Adds structured log lines on every session teardown so we can see how
long a DTLS / TURN session lived and what error tore it down. Pairs
with server-side journal logs (vk-turn-proxy@udp) to diagnose
reconnect cadence.

Sample:
  DTLS session lifetime=42.137s exit=<nil>
  TURN session lifetime=27.901s exit=failed to allocate: STUN timeout

Part of P3 in the connectivity-stability series.
oneDtlsConnectionLoop and oneTurnConnectionLoop previously retried
immediately on failure (TURN loop was throttled only by a 200ms tick).
On VK API rate-limits or a flapping cellular link this caused tight
retry storms that drain battery and never recover.

Now both loops use a capped exponential backoff (500ms -> 30s) with
+/- 25% jitter so N parallel streams desynchronise. Backoff resets
to 0 on a successful session, so steady-state reconnects after a
quick blip are still near-instant.
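
A sketch of the capped exponential backoff with jitter described above, assuming doubling between steps (the commit only states the 500ms -> 30s range and the +/- 25% jitter). `nextBackoff` and `withJitter` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	minBackoff = 500 * time.Millisecond
	maxBackoff = 30 * time.Second
)

// nextBackoff returns the base delay after another failure.
// prev == 0 means "first failure since the last successful session".
func nextBackoff(prev time.Duration) time.Duration {
	if prev <= 0 {
		return minBackoff
	}
	next := prev * 2 // doubling is an assumption of this sketch
	if next > maxBackoff {
		next = maxBackoff
	}
	return next
}

// withJitter spreads d uniformly over [0.75*d, 1.25*d] so parallel
// streams do not retry in lockstep.
func withJitter(d time.Duration) time.Duration {
	if d <= 0 {
		return 0
	}
	quarter := int64(d) / 4
	offset := rand.Int63n(2*quarter+1) - quarter
	return d + time.Duration(offset)
}

func main() {
	var base time.Duration // reset to 0 after a successful session
	for i := 0; i < 8; i++ {
		base = nextBackoff(base)
		fmt.Println("base:", base, "sleep:", withJitter(base))
	}
}
```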

Part of P4 in the connectivity-stability series.
StartProxy used to set port = "19302" by default, and oneTurnConnection
would then override whatever VK API responded with that hardcoded
value. If VK ever moves their TURN endpoints to a different port, the
client breaks silently without a useful error.

The VK API response (turn_server.urls[0]) already carries the correct
host:port; we just trust it now. host/port stay as overrides for users
who explicitly want to pin an endpoint via -turn-style flag in the
future.

Part of P6 in the connectivity-stability series.
This is the highest-impact fix in the connectivity-stability series.

Problem
-------
PacketTunnelProvider's sleep() and wake() overrides were empty. When
iOS suspends the Network Extension (screen lock, background, low
memory), the Go runtime hosting the embedded TURN+DTLS proxy is
frozen. By the time the device wakes up:

  - VK TURN has dropped the idle channel/allocation.
  - Cellular NAT mapping for our UDP socket is gone.
  - pion/dtls sequence numbers are outside the replay window.

WireGuard immediately starts pumping packets through these zombie
sockets, pion/dtls eventually times out and sends close_notify (seen
on the server as 'Failed: EOF'), and only THEN does the client tear
down and rebuild. That's the multi-minute 'nothing loads' gap.

Fix
---
Go side:
  - Maintain a registry of live DTLS/TURN session cancel-funcs.
  - Expose //export ProxyForceReconnect() that cancels every live
    session at once; the existing oneDtlsConnectionLoop /
    oneTurnConnectionLoop pick the cancellation up and rebuild on
    the standard backoff path.

Swift side:
  - sleep() records the suspension timestamp.
  - wake() logs the suspension gap and calls ProxyForceReconnect()
    BEFORE WireGuard tries to use the tunnel, so the new allocation
    is in place before the first user packet hits it.
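
A sketch of the Go-side registry of cancel funcs, under the assumption that each session owns a `context.CancelFunc`. The type and method names are hypothetical; the actual `ProxyForceReconnect` export would call something like `cancelAll` and leave the rebuild to the existing loops.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// sessionRegistry tracks the cancel func of every live session.
type sessionRegistry struct {
	mu      sync.Mutex
	nextID  uint64
	cancels map[uint64]context.CancelFunc
}

func newSessionRegistry() *sessionRegistry {
	return &sessionRegistry{cancels: make(map[uint64]context.CancelFunc)}
}

// register stores cancel and returns a func that removes it again;
// a session calls that func when it exits on its own.
func (r *sessionRegistry) register(cancel context.CancelFunc) (unregister func()) {
	r.mu.Lock()
	id := r.nextID
	r.nextID++
	r.cancels[id] = cancel
	r.mu.Unlock()
	return func() {
		r.mu.Lock()
		delete(r.cancels, id)
		r.mu.Unlock()
	}
}

// cancelAll cancels every registered session and reports how many.
// The funcs are copied first so none runs while the lock is held.
func (r *sessionRegistry) cancelAll() int {
	r.mu.Lock()
	funcs := make([]context.CancelFunc, 0, len(r.cancels))
	for _, c := range r.cancels {
		funcs = append(funcs, c)
	}
	r.mu.Unlock()
	for _, c := range funcs {
		c()
	}
	return len(funcs)
}

// startSession is a demo helper: it creates a cancellable session
// context and registers it.
func startSession(r *sessionRegistry) (done <-chan struct{}, unregister func()) {
	ctx, cancel := context.WithCancel(context.Background())
	return ctx.Done(), r.register(cancel)
}

func main() {
	r := newSessionRegistry()
	done, unregister := startSession(r)
	defer unregister()
	fmt.Println("cancelled sessions:", r.cancelAll())
	<-done
	fmt.Println("session observed cancellation")
}
```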

Part of P1 in the connectivity-stability series.
Upstream turnbridge hardcoded `udp := true` inside StartProxy. UDP is
faster but silently loses state on cellular blips and after iOS
suspends; TCP-via-STUNConn survives short outages via TCP buffers and
retransmits. The CLI vk-turn-proxy client exposes -udp specifically
for this reason.

This commit:
- StartProxy gains a cUDP C.int parameter
- PacketTunnelProvider reads "useUDP" from providerConfiguration
- VPNProfile gains a useUDP field (with backward-compat Codable for
  pre-existing UserDefaults profiles — default true)
- TurnConfigImport supports optional `udp` field in the JSON
- quick_link.py example shows the new field
- TurnBridgeApp.turnOnTunnel takes useUDP and propagates to provider
- ProfileStore legacy migrate path passes useUDP: true

User-visible behavior unchanged when the field is absent or true.
Setting it to false in the JSON / clipboard import enables TCP path
for testing on bad networks.

Part of P2 in the connectivity-stability series.
Send a 4-byte sentinel (0xFF 0xFF 0xFF 0xFF) over the DTLS connection
every 5 seconds, regardless of WireGuard activity, to keep the VK
TURN ChannelData binding fresh and prevent the relay from dropping
'idle' channels.

The sentinel is intentionally invalid as a WireGuard message (first
byte must be 0x01-0x04 and length >= 32 bytes for WG transport data),
so the server-side vk-turn-proxy can silently drop these packets in
its read loop before they reach 127.0.0.1:51820. WireGuard itself
would also reject them with a checksum/format failure, just less
cheaply.
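
For illustration, a receive-side filter for the sentinel could look like the sketch below. It encodes only the rules stated above (4 bytes of 0xFF; WireGuard messages start with 0x01-0x04 and are at least 32 bytes); the function names are hypothetical and this is not the vk-turn-proxy patch itself.

```go
package main

import (
	"bytes"
	"fmt"
)

// keepaliveSentinel is the 4-byte marker described in the commit message.
var keepaliveSentinel = []byte{0xFF, 0xFF, 0xFF, 0xFF}

// isKeepaliveSentinel reports whether pkt is exactly the sentinel.
func isKeepaliveSentinel(pkt []byte) bool {
	return bytes.Equal(pkt, keepaliveSentinel)
}

// plausibleWireGuard applies the cheap shape check quoted above:
// message type in 0x01..0x04 and at least 32 bytes long.
func plausibleWireGuard(pkt []byte) bool {
	return len(pkt) >= 32 && pkt[0] >= 0x01 && pkt[0] <= 0x04
}

// shouldForward decides whether a packet is handed on to WireGuard.
func shouldForward(pkt []byte) bool {
	return !isKeepaliveSentinel(pkt) && plausibleWireGuard(pkt)
}

func main() {
	wg := make([]byte, 32)
	wg[0] = 0x04 // transport data
	fmt.Println("sentinel forwarded:", shouldForward(keepaliveSentinel))
	fmt.Println("wg packet forwarded:", shouldForward(wg))
}
```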

This complements WG's own PersistentKeepalive=25 by working even when
WG's goroutine is throttled by iOS (e.g. CPU pressure short of full
suspend, where wake() doesn't fire but ticks are missed).

Companion server-side patch needed in truvvor/vk-turn-proxy to
explicitly filter the sentinel — see follow-up PR there.

Part of P5 in the connectivity-stability series.
claude and others added 30 commits May 16, 2026 08:38
User pushed back on the maxN=40 cap — wants real memory savings
rather than a hard limit. Profiling the per-buffer cost:

  fanoutQueueDepth × N_sessions × ~1.5 KB per packet
  = 256 × 40 × 1.5 KB ≈ 15 MB worst-case

The 256 was set to absorb a "page-load burst" — but real bursts are
50-100 packets in a few ms (TCP slow-start + parallel HTTP); 64
covers that comfortably. Worst-case footprint drops to ~4 MB. If
the consumer is genuinely slower than the producer for >64 packets,
the drop counter trips earlier — that's the correct signal anyway;
hiding it behind a deeper queue just delays the inevitable WG
retransmit.
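
The footprint arithmetic above, as a small check. Treating "~1.5 KB" as 1536 bytes is an assumption made for round numbers.

```go
package main

import "fmt"

// approxPacketBytes takes the "~1.5 KB per packet" figure as 1.5 KiB.
const approxPacketBytes = 1536

// worstCaseBytes is queue depth x sessions x bytes per packet.
func worstCaseBytes(queueDepth, sessions int) int {
	return queueDepth * sessions * approxPacketBytes
}

func main() {
	for _, depth := range []int{256, 64} {
		b := worstCaseBytes(depth, 40)
		fmt.Printf("depth=%d sessions=40 -> %.2f MiB\n", depth, float64(b)/(1<<20))
	}
	// Prints:
	// depth=256 sessions=40 -> 15.00 MiB
	// depth=64 sessions=40 -> 3.75 MiB
}
```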

Same code path as the shortest-queue-first dispatcher (F1.3.1) —
that change naturally avoids slow lanes, which means most lanes
should be near-empty most of the time, so 64 is well above the
typical occupancy.

This is the first of several memory-reduction passes coming from
the architect-agent audit kicked off in parallel. The full
refactor (lighter pion/turn replacement, shared dispatcher across
sessions instead of per-session goroutine trees) is more invasive
and tracked separately.

Bumps MARKETING_VERSION 1.3.6 → 1.3.7 for a fresh TestFlight upload
bucket.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Drops github.com/cbeuw/connutil.AsyncPacketPipe (unbounded bytes.Buffer
whose capacity ratchets to high-water mark and never shrinks) for a
hand-rolled 16-slot channel pair per direction. Worst-case in-flight
memory per DTLS session is now 2 × 16 × ~MTU instead of "whatever a
reconnect-storm burst peaked at". Overflow drops with UDP semantics
and trips a per-pair counter — same shape as udp_fanout.go.

Adds readBufPool, a sync.Pool of *[1600]byte. The four per-session
read-loop goroutines in oneDtlsConnection / oneTurnConnection now
borrow/return instead of allocating fresh 1.6 KB scratches on every
goroutine startup. Under reconnect churn (N=40 sessions cycling on
cred rotation) that's ~256 KB/min of avoidable GC pressure removed.
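
A sketch of the two building blocks named above: a 16-slot channel that drops on overflow and counts the drop, and a `sync.Pool` of `*[1600]byte`. `readBufPool` is the name used in the commit message; `boundedPipe` and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

const pipeSlots = 16

// boundedPipe is one direction of the pipe: a fixed-depth queue
// with UDP-like semantics (overflow is dropped, not buffered).
type boundedPipe struct {
	ch    chan []byte
	drops atomic.Uint64
}

func newBoundedPipe() *boundedPipe {
	return &boundedPipe{ch: make(chan []byte, pipeSlots)}
}

// send enqueues pkt without blocking; a full queue drops the packet
// and bumps the drop counter.
func (p *boundedPipe) send(pkt []byte) bool {
	select {
	case p.ch <- pkt:
		return true
	default:
		p.drops.Add(1)
		return false
	}
}

// readBufPool hands out reusable 1600-byte scratch buffers.
var readBufPool = sync.Pool{
	New: func() any { return new([1600]byte) },
}

func main() {
	p := newBoundedPipe()
	for i := 0; i < pipeSlots+4; i++ {
		p.send([]byte{byte(i)})
	}
	fmt.Println("queued:", len(p.ch), "dropped:", p.drops.Load())

	buf := readBufPool.Get().(*[1600]byte) // borrow a scratch buffer
	_ = buf[:]                             // a read loop would Read into this slice
	readBufPool.Put(buf)                   // return it when the loop exits
}
```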
Hand-rolled minimal TURN client (RFC 5766) replaces pion/turn/v5 in
the data path. pion runs ~4-5 goroutines per allocation and keeps
multi-peer permission/channel maps. Our use case is one allocation,
one peer (the WG client through DTLS), one bound channel — a much
narrower shape than pion is built for.

What the new client does, in ~430 LOC:
  - Allocate with long-term auth two-pass nonce challenge
  - ChannelBind for the single peer (implicit permission per §11.2)
  - Refresh on a half-lifetime ticker
  - ChannelData frame encode/decode in the hot path
  - net.PacketConn surface matching pion's relayConn drop-in

What it skips: CreatePermission (channel-bind installs it), Send/Data
indications (we always use the bound channel), multi-peer, TURN-TCP.

Wired behind a const useMinimalTURN=true at the oneTurnConnection
allocation site. The pion path is preserved in the else branch as a
safety net while we wait for production data.

With per-session memory cost down (bounded pipe + pooled scratches +
fewer pion goroutines), the N=40 hard clamp is too conservative.
Raised to 100 so the user's setting actually takes effect; if iOS
SIGKILLs us we'll see it and re-cap with data instead of guesses.
1.3.9 shipped useMinimalTURN=true but the long-term-auth handshake in
turn_min.go produces a request the server rejects with 401 every time —
the first anonymous Allocate triggers a proper 401+REALM+NONCE
challenge, we learn the auth, but the authenticated retry is also
rejected. Every TURN allocate dies, DTLS handshakes time out at 30 s,
sessions retry-storm, RSS climbs through the iOS budget and the
extension crashes.

Reverts to pion/turn for now. Bounded pipe + readBufPool stay in
(those were measured wins from 1.3.8 unrelated to the broken path),
and turn_min.go stays on disk so we can fix forward without
recreating the file.

Also drops maxN back to 40: with the minimal client out, we're back on
pion's per-session goroutine zoo and N=100 needs the wins this commit
just reverted. Once turn_min.go is fixed, we can raise it again.
…mory

Three real, measurable memory wins that don't depend on the still-
broken minimal TURN client:

1. debug.SetMemoryLimit(75 MB) — Go runtime fires GC aggressively as
   the heap approaches this. iOS extensions have ~100 MB total; ~25 MB
   is C heap (pion-dtls crypto via cgo, WG core, the 4 MB kernel
   socket buffers we tuned). Better to GC than get SIGKILL'd.

2. debug.SetGCPercent(50) — halves the default 100, so heap grows
   1.5x between cycles instead of 2x. Pairs with the limit: limit
   handles the cap, percent handles steady-state drift.

3. debug.FreeOSMemory() every 15s — Go normally hands memory back to
   the OS lazily. On iOS what matters is RSS, not Go's heap view.
   Forcing release keeps us further from the SIGKILL line. The STW
   pause at this heap size is sub-ms.
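
The three knobs above map onto `runtime/debug` calls roughly as in this sketch. The helper names are hypothetical, and 75 MB is written as 75 MiB here.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"time"
)

// memoryLimitBytes is the soft limit; 75 MiB is this sketch's reading
// of the "75 MB" figure.
const memoryLimitBytes = 75 << 20

// tuneRuntime applies the soft memory limit and the lower GC percent,
// returning the previous settings.
func tuneRuntime() (prevLimit int64, prevPercent int) {
	prevLimit = debug.SetMemoryLimit(memoryLimitBytes)
	prevPercent = debug.SetGCPercent(50)
	return prevLimit, prevPercent
}

// currentMemoryLimit reads the limit back; a negative argument makes
// SetMemoryLimit return the current value without changing it.
func currentMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}

// startMemoryReleaser forces a GC and returns freed memory to the OS
// on every tick until stop is closed.
func startMemoryReleaser(interval time.Duration, stop <-chan struct{}) {
	go func() {
		t := time.NewTicker(interval)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				debug.FreeOSMemory()
			case <-stop:
				return
			}
		}
	}()
}

func main() {
	prevLimit, prevPercent := tuneRuntime()
	fmt.Println("previous limit:", prevLimit, "previous GC percent:", prevPercent)

	stop := make(chan struct{})
	startMemoryReleaser(15*time.Second, stop)
	close(stop) // the real extension would keep this running

	fmt.Println("current limit:", currentMemoryLimit())
}
```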

Also lowered maxConcurrentCaptchaSolves 5 → 3. Each solve transiently
holds an HTTP/TLS client + JSON state + ~50 net/http goroutines worth
~1.5-2 MB. Under a reconnect storm 5 was adding ~10 MB peak; 3 keeps
almost all the throughput (VK's per-IP rate-limit is the binding
constraint, not our concurrency) for a peak of ~6 MB, about 4 MB lower.
…HMAC

The 1.3.9 ship-blocker. m.SetType writes the message type at Raw[0:2]
and m.NewTransactionID writes the ID at Raw[8:20] — but nothing writes
the STUN magic cookie 0x2112A442 at Raw[4:8] until m.Encode runs at the
end. MessageIntegrity.AddTo, called by addAuth before Encode, was
computing HMAC over a message with cookie=0. Wire bytes go out with
the real cookie (Encode writes it), so the server recomputes HMAC over
different bytes than we signed → integrity mismatch → 401 every time.

Fixed by calling m.WriteHeader() right after NewTransactionID() in all
three builders (allocate, channelBind, refresh). WriteHeader stamps
type + length + cookie + tx ID, so HMAC and wire bytes now agree.
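
A stdlib-only illustration of why the missing cookie breaks the HMAC. This is a simplification: it signs just the 20-byte header, whereas real MESSAGE-INTEGRITY covers the attributes and an adjusted length field. It uses neither pion/stun nor the code in turn_min.go, and the helper names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
)

const stunMagicCookie = 0x2112A442

// buildHeader lays out a 20-byte STUN header: type at [0:2], length
// at [2:4], magic cookie at [4:8], transaction ID at [8:20].
// writeCookie=false models the bug: the cookie bytes stay zero.
func buildHeader(msgType, length uint16, txID [12]byte, writeCookie bool) []byte {
	h := make([]byte, 20)
	binary.BigEndian.PutUint16(h[0:2], msgType)
	binary.BigEndian.PutUint16(h[2:4], length)
	if writeCookie {
		binary.BigEndian.PutUint32(h[4:8], stunMagicCookie)
	}
	copy(h[8:20], txID[:])
	return h
}

// integrity computes HMAC-SHA1 over msg, the MAC STUN uses.
func integrity(key, msg []byte) []byte {
	mac := hmac.New(sha1.New, key)
	mac.Write(msg)
	return mac.Sum(nil)
}

// macEqual compares two MACs in constant time.
func macEqual(a, b []byte) bool { return hmac.Equal(a, b) }

func main() {
	key := []byte("demo-long-term-key") // placeholder key for the demo
	txID := [12]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

	signed := integrity(key, buildHeader(0x0003, 0, txID, false)) // what the client signed
	onWire := integrity(key, buildHeader(0x0003, 0, txID, true))  // what the server recomputes
	fmt.Println("MACs match:", macEqual(signed, onWire))
}
```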

Re-enabled useMinimalTURN = true in oneTurnConnection. Raised maxN
40 → 60 — minimal TURN drops pion's per-allocation goroutine zoo, the
runtime tunings from 1.3.11 (SetMemoryLimit, GCPercent, FreeOSMemory,
captcha-client idle-conn trim) keep heap tight, so 60 should fit
comfortably. Will raise further once 60 is field-validated.
At N=60 the fanout dispatcher round-robins WG packets across all 60
sessions. With one WG keepalive every 25 s, the average gap before any
single session sees inbound is ~25 min — but the watchdog was killing
sessions after just 60 s of silence. Result in the 1.3.12 field log:
the 1-2 sessions WG happened to route through stayed alive, the other
58 got watchdog'd at T+60s, the reconnect loop fired captcha solves
to replace them, VK ERROR_LIMIT'd the IP, replacements failed, repeat.
Memory was healthy (rss ~72 MB stable). Throughput wasn't.

Fix: only arm the watchdog after the session has actually carried
inbound data. lastRxNanos starts at 0 instead of time.Now(); the dtls
read loop bumps it on first successful Read. Sessions that never
receive sit indefinitely — minimal TURN's half-lifetime Refresh keeps
the allocation alive on the server side, no captcha re-burn needed.

The watchdog's intended catch (TURN allocation silently dies mid-flow)
still works: those sessions DID receive data first, so lastRxNanos
is non-zero, so the 60 s rule applies.
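
A sketch of the arming rule, assuming the watchdog polls an atomic timestamp. `lastRxNanos` is the name from the commit message; `rxWatch`, `markRx` and `expired` are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

const watchdogSilence = 60 * time.Second

// rxWatch tracks the last inbound read. Zero means "never received",
// which leaves the watchdog unarmed.
type rxWatch struct {
	lastRxNanos atomic.Int64
}

// markRx is called by the read loop after every successful Read.
func (w *rxWatch) markRx(nowNanos int64) { w.lastRxNanos.Store(nowNanos) }

// expired reports whether an armed session has been silent for longer
// than maxSilence. Sessions that never received data never expire.
func (w *rxWatch) expired(nowNanos int64, maxSilence time.Duration) bool {
	last := w.lastRxNanos.Load()
	if last == 0 {
		return false
	}
	return time.Duration(nowNanos-last) > maxSilence
}

func main() {
	var w rxWatch
	now := time.Now().UnixNano()
	fmt.Println("before first read:", w.expired(now, watchdogSilence))

	w.markRx(now)
	later := now + int64(2*watchdogSilence)
	fmt.Println("after 120s of silence:", w.expired(later, watchdogSilence))
}
```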
Field log showed StartProxy dying immediately with "Resolve UDP error:
lookup udp/56010 : unknown port" — a transient iOS NetworkExtension
sandbox race. The address is a literal IPv4 with a numeric port
("77.90.8.199:56010"), so Go's net.ResolveUDPAddr has nothing to
resolve. But the call still dives through getaddrinfo via cgo, and
during the first ~100 ms of extension startup that path can return
nonsense — Go then formats it as "unknown port" because its resolver
got an answer back from getservbyname() it doesn't know how to handle.

Bypass: parse with net.SplitHostPort + net.ParseIP + strconv.Atoi.
No syscalls, no resolver, no race. Falls back to a clear error
message if the address ever isn't a literal IP, which would be a
config bug we'd want surfaced anyway.
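
A sketch of the resolver-free parse described above. The function name `parseLiteralUDPAddr` appears in a later commit message; the body here is an assumption built from the three stdlib calls the message names.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// parseLiteralUDPAddr parses "ip:port" without touching any resolver.
// Host names are rejected on purpose.
func parseLiteralUDPAddr(s string) (*net.UDPAddr, error) {
	host, portStr, err := net.SplitHostPort(s)
	if err != nil {
		return nil, fmt.Errorf("split %q: %w", s, err)
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return nil, fmt.Errorf("host %q is not a literal IP", host)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil || port < 1 || port > 65535 {
		return nil, fmt.Errorf("invalid port %q", portStr)
	}
	return &net.UDPAddr{IP: ip, Port: port}, nil
}

func main() {
	addr, err := parseLiteralUDPAddr("77.90.8.199:56010")
	fmt.Println(addr, err)

	_, err = parseLiteralUDPAddr("example.com:56010")
	fmt.Println(err)
}
```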
1.3.14 surfaced the real bug behind the "unknown port" startup error:
the saved profile had a U+2009 THIN SPACE wedged after the port.
SettingsView appears to show the value correctly because a thin space
is nearly invisible to the eye, but the Go side gets "56010" followed
by U+2009 and strconv.Atoi rejects it.

Belt-and-suspenders fix:
  1. Go side (parseLiteralUDPAddr): strings.TrimSpace before parsing
     handles every Unicode whitespace including U+2009.
  2. Swift side (turnOnTunnel): trim peerAddr/listenAddr with
     CharacterSet.whitespacesAndNewlines before sending across the
     XPC boundary. Profile gets cleaned on every connect, so the
     stale thin space disappears without the user re-typing.

Both layers strip independently so neither one is the only defense.
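
A small reproduction of the failure and of the Go-side fix. The helper names are hypothetical; the stored value is the port from the field log with U+2009 appended.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePortRaw mirrors the failing path: Atoi on the stored string.
func parsePortRaw(s string) (int, error) { return strconv.Atoi(s) }

// parsePortTrimmed strips surrounding Unicode whitespace first.
func parsePortTrimmed(s string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(s))
}

func main() {
	stored := "56010\u2009" // port followed by a THIN SPACE
	_, errRaw := parsePortRaw(stored)
	port, errTrim := parsePortTrimmed(stored)
	fmt.Println(errRaw != nil, port, errTrim) // prints: true 56010 <nil>
}
```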
VK's anti-bot started fingerprinting captcha requests by JA3/JA4 (TLS
ClientHello shape), HTTP/2 SETTINGS, and header order. Go's stdlib
net/http has a stable but distinctively non-Safari fingerprint — VK's
classifier now flags us as a non-browser and returns ERROR_LIMIT on
captcha solves even when our UA + cookies + behavioral signals are
clean.

bogdanfinn/tls-client (utls under the hood) sends TLS handshakes byte-
identical to Safari iOS 18.0, including ALPN, cipher suites, supported
groups, extensions, and key shares. Paired with bogdanfinn/fhttp's
HeaderOrderKey + PHeaderOrderKey, the entire request shape matches
mobile Safari. Pure Go, no cgo, runs unchanged inside the iOS NE
sandbox.

Changes:
  - captcha_client.go (new): newTLSCaptchaClient builds a tls-client
    HttpClient with Safari_IOS_18_0 profile + fresh cookie jar +
    HTTP/3 disabled (HTTP/2 matches mobile Safari traffic).
  - vk_captcha.go: newCaptchaClient now returns tlsclient.HttpClient;
    fetchPowInput and the inner vkReq closure use fhttp.NewRequest
    and applySafariHeaderOrder. Dropped sec-ch-ua* — Safari doesn't
    emit them and they were a classifier tell on the old path.

Limitation: tls-client's WithDialer takes a net.Dialer struct, not a
DialContext callback, so customDial / cellularDial / DoH fallback
(dns_resolver.go) don't apply to the captcha HTTP path anymore. iOS
system resolver handles api.vk.ru / id.vk.ru in practice; DoH was a
paranoia-fallback for carriers that NXDOMAIN vk-family hosts and the
field log has never triggered it. If that path becomes necessary,
pre-resolve via DoH + dial-by-IP + WithServerNameOverwrite.
Two wins on the server-side path that mirror the iOS-side Phase 1
(1.3.16) and add server-only Phase 3 (WARP):

**TLS+HTTP/2 fingerprint impersonation (Phase 1 mirror).** Server-side
captcha-service hits the same VK anti-bot wall that the iOS side did,
so port the same fix: bogdanfinn/tls-client with Safari_IOS_18_0
profile, fhttp with header-order + pseudo-header-order matching mobile
Safari. captcha_client.go is a thin wrapper; vk_captcha.go switches
from *http.Client / http.NewRequestWithContext to tlsclient.HttpClient
/ fhttp.NewRequest. Dropped sec-ch-ua* (Safari iOS doesn't emit them
and they were a classifier tell on the old path).

**WARP egress (Phase 3).** New warp_dialer.go layers SO_BINDTODEVICE
control on outbound captcha sockets when WARP_INTERFACE env var
points at a pre-existing WireGuard interface (typically Cloudflare
WARP via wgcf). Three call sites picked up the hook:

  - captcha_client.go's tls-client uses WithDialer(newWARPNetDialer())
    so VK API + bootstrap HTML egress via WARP.
  - dns_resolver.go's customDial constructs net.Dialer with
    Control: warpControl across all three dial paths (literal IP,
    system resolver, DoH/fallback IPs), so creds.go's sharedAuthClient
    (VK token POST) and dohClient (DoH lookups to 1.1.1.1) both
    egress via WARP too.
  - dohClient transport gets the same control hook directly.

Approach is to NOT manage the WireGuard interface in-process — wg-quick
handles keys, registration, MTU, lifecycle externally. captcha-service
just consumes an interface name. Unsetting WARP_INTERFACE makes
warpControl a no-op (egress reverts to host default route), no code
change needed. README documents the operator setup including the
required CAP_NET_RAW and Table=off in the wgcf config.
…ype routing

Ports the higher-impact pieces of Moroka8's captcha v2 implementation
into both the iOS NetworkExtension and the server-side captcha-service.

**Dynamic debug_info** (biggest single fingerprint win). The pre-v2
code pasted a hard-coded SHA-256 ("e3b0c44…") into every
captchaNotRobot.check call. VK ships a fresh not_robot_captcha.js on
its own cadence; whenever they do, every solve from us silently
starts failing with status=BOT until someone notices and updates the
constant. captcha_debug_info.go now extracts the script URL from the
bootstrap HTML, fetches the JS, regex-extracts the per-version
debug_info hash, and caches it in a sync.Map keyed by script URL.
The legacy constant remains as a graceful fallback when the fetch
itself fails.
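
A sketch of the fetch, extract, cache and fallback flow described above. The regex is a guess at the script's shape (the real pattern in captcha_debug_info.go is not shown in this PR text), and the fetch function is injected so the example runs without network access. All names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
	"sync"
)

// debugInfoRe is an assumed pattern: a 64-hex-digit value assigned to
// debug_info somewhere in the script.
var debugInfoRe = regexp.MustCompile(`debug_info["']?\s*[:=]\s*["']([0-9a-f]{64})["']`)

// Demo fixtures, not real VK data.
var (
	errFetch   = errors.New("fetch failed")
	sampleHash = strings.Repeat("ab", 32)
	sampleJS   = "var cfg={debug_info:\"" + sampleHash + "\"};"
)

// debugInfoCache maps script URL -> extracted debug_info.
type debugInfoCache struct {
	m        sync.Map
	fetch    func(scriptURL string) (string, error)
	fallback string // legacy constant used when fetch or extraction fails
}

// get returns the cached value, or fetches and extracts it once.
func (c *debugInfoCache) get(scriptURL string) string {
	if v, ok := c.m.Load(scriptURL); ok {
		return v.(string)
	}
	js, err := c.fetch(scriptURL)
	if err != nil {
		return c.fallback
	}
	match := debugInfoRe.FindStringSubmatch(js)
	if match == nil {
		return c.fallback
	}
	c.m.Store(scriptURL, match[1])
	return match[1]
}

func main() {
	c := &debugInfoCache{
		fetch:    func(string) (string, error) { return sampleJS, nil },
		fallback: "legacy-constant",
	}
	fmt.Println(c.get("https://example.invalid/not_robot_captcha.js") == sampleHash)
}
```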

**Slim device shape**. The pre-v2 device map randomised screenWidth /
screenHeight / cores / RAM / devicePixelRatio / connectionRtt /
connectionDownlink per solve. That per-solve fingerprint churn was
correlated against the stable TLS fingerprint and flagged. v2 sends
the same desktop Chrome 1920x1080 / 8-core / pixelratio=1 shape every
time, dropping deviceMemory / connection* fields entirely. The noise
that VK actually cares about (browser_fp, cursor jitter) is still
random; the noise it doesn't care about is now fixed.

**crypto/rand browser_fp**. Was math/rand-seeded; now 16 bytes from
crypto/rand. Same length, stronger source.
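
A sketch of drawing the 16 bytes from `crypto/rand`. Hex encoding the result is an assumption, as the commit message does not say how browser_fp is serialised; `newBrowserFP` is a hypothetical name.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newBrowserFP returns 16 random bytes from the OS CSPRNG,
// hex-encoded (the encoding is this sketch's assumption).
func newBrowserFP() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return hex.EncodeToString(b[:]), nil
}

func main() {
	fp, err := newBrowserFP()
	fmt.Println(len(fp), err)
}
```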

**Empty motion arrays in check body**. Replaced the populated
connectionDownlink array (e.g. "[8.4,8.4,8.4,…]") with "[]" to match
v2 wire shape; the populated version was a low-signal noise generator
that wasn't fooling anyone.

**show-type routing on check response**. Previously any non-OK check
status fell through unconditionally to the slider path, including
ERROR_LIMIT (which guarantees the slider also rate-limits and burns
the source IP harder). Now we route on (status, show_captcha_type):
  - OK → return token
  - ERROR_LIMIT → mark saturated, surface error (no slider burn)
  - BOT with show_type="" or "slider" → try slider
  - anything else → surface error
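
The routing table above can be sketched as a single switch. The type and constant names are hypothetical; only the (status, show_captcha_type) pairs come from the list.

```go
package main

import "fmt"

type checkAction int

const (
	actionReturnToken   checkAction = iota // OK
	actionMarkSaturated                    // ERROR_LIMIT: surface error, no slider burn
	actionTrySlider                        // BOT with empty or "slider" show type
	actionSurfaceError                     // everything else
)

// routeCheck maps a check response to the next step.
func routeCheck(status, showType string) checkAction {
	switch {
	case status == "OK":
		return actionReturnToken
	case status == "ERROR_LIMIT":
		return actionMarkSaturated
	case status == "BOT" && (showType == "" || showType == "slider"):
		return actionTrySlider
	default:
		return actionSurfaceError
	}
}

func main() {
	fmt.Println(routeCheck("ERROR_LIMIT", "slider") == actionMarkSaturated)
}
```

Putting ERROR_LIMIT ahead of the BOT case keeps a rate-limited response from ever reaching the slider path, whatever show type accompanies it.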

This also fixed an iOS-side bug where solveSliderCaptcha got an
unrelated `htmlSettings` map that was different from the one the
check call used.

The captcha-service mirror picks up the same v2 changes so the
remote-captcha path returns consistent fingerprints whether the iOS
side solves locally or via /cred.
The piece I missed in the earlier Moroka8 port. Goal: defeat VK's
relay-side DPI that distinguishes our DTLS-over-TURN payload from
real WebRTC call traffic.

What VK's classifier sees on a real call: DTLS handshake (type 0x16
records), then SRTP frames carrying Opus voice — RTP header version=2,
payload type 111, monotonic seq/timestamp/SSRC, AEAD ciphertext.
What it sees on us pre-wrap: DTLS handshake, then DTLS application-
data records (type 0x17) forever. The two diverge sharply after
handshake and we've been observing the resulting rate-limit.
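
As a rough illustration of the header the classifier expects, the sketch below builds a 12-byte RTP header with version 2 and payload type 111 and advances seq/timestamp per packet. The struct name and the 960-tick timestamp step (20 ms of Opus at 48 kHz) are assumptions, not values taken from wrap.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// rtpState holds the per-connection counters; in a real connection the
// initial values would be random.
type rtpState struct {
	seq  uint16
	ts   uint32
	ssrc uint32
}

// nextHeader returns the fixed 12-byte RTP header for the next packet and
// advances seq and timestamp monotonically.
func (s *rtpState) nextHeader() [12]byte {
	var h [12]byte
	h[0] = 0x80 // V=2, P=0, X=0, CC=0
	h[1] = 111  // M=0, PT=111 (dynamic payload type commonly used for Opus)
	binary.BigEndian.PutUint16(h[2:4], s.seq)
	binary.BigEndian.PutUint32(h[4:8], s.ts)
	binary.BigEndian.PutUint32(h[8:12], s.ssrc)
	s.seq++     // wraps at 65535, as RTP sequence numbers do
	s.ts += 960 // assumption: one 20 ms Opus frame at 48 kHz
	return h
}

func main() {
	s := &rtpState{seq: 1, ts: 1000, ssrc: 0xdeadbeef}
	h := s.nextHeader()
	fmt.Printf("% x\n", h[:])
}
```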

Verbatim port from Moroka8/vk-turn-proxy/pkg/clientcore/wrap.go:
  - wrap.go: ChaCha20-Poly1305 AEAD, wire shape
    [12B RTP hdr | 12B nonce | ciphertext | 16B tag]. Per-conn
    random SSRC/sessionID/seq/timestamp/counter init; AAD covers
    the RTP header + nonce so any byte tampering AEAD-fails.
  - wrap_config.go + wireguard.h: TurnBridgeSetWrapKey /
    TurnBridgeGenerateWrapKey C exports. Empty key = wrap off,
    bypass to legacy direct conn2↔relayConn path.
  - turn_proxy.go oneTurnConnection: build a per-session wrapConn
    on entry (each session needs disjoint SSRC/seq so VK can't
    coalesce N sessions into "one weird call leg"); inject
    wrapInto on conn2→relay and unwrapPacket on relay→conn2.
    AEAD failure on unwrap drops the single packet without
    tearing down the session.
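
The framing can be sketched with the standard library alone. Note the swap: this example uses AES-256-GCM as a stand-in for ChaCha20-Poly1305, because it has the same 12-byte nonce and 16-byte tag and keeps the example free of external modules; the real wrap.go uses ChaCha20-Poly1305. The random nonce and all function names here are also assumptions (the commit mentions a per-connection counter).

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

const (
	hdrLen   = 12 // RTP header
	nonceLen = 12 // AEAD nonce
)

// newStandInAEAD builds AES-256-GCM from a 32-byte key (stand-in cipher).
func newStandInAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// wrapPacket emits [12B hdr | 12B nonce | ciphertext | 16B tag].
// The AAD is hdr||nonce, so tampering with either fails authentication.
func wrapPacket(aead cipher.AEAD, hdr [hdrLen]byte, payload []byte) ([]byte, error) {
	out := make([]byte, hdrLen+nonceLen, hdrLen+nonceLen+len(payload)+aead.Overhead())
	copy(out, hdr[:])
	if _, err := rand.Read(out[hdrLen:]); err != nil { // assumption: random nonce
		return nil, err
	}
	return aead.Seal(out, out[hdrLen:hdrLen+nonceLen], payload, out[:hdrLen+nonceLen]), nil
}

// unwrapPacket verifies and decrypts one packet. On error the caller drops
// just this packet and keeps the session alive.
func unwrapPacket(aead cipher.AEAD, pkt []byte) ([]byte, error) {
	if len(pkt) < hdrLen+nonceLen+aead.Overhead() {
		return nil, errors.New("wrap: short packet")
	}
	return aead.Open(nil, pkt[hdrLen:hdrLen+nonceLen], pkt[hdrLen+nonceLen:], pkt[:hdrLen+nonceLen])
}

func main() {
	aead, err := newStandInAEAD(make([]byte, 32)) // all-zero demo key only
	if err != nil {
		panic(err)
	}
	pkt, _ := wrapPacket(aead, [hdrLen]byte{0x80, 111}, []byte("hello"))
	plain, err := unwrapPacket(aead, pkt)
	fmt.Println(len(pkt), string(plain), err)
}
```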

Swift wiring:
  - VPNProfile.wrapKey field with backwards-compat CodingKeys.
  - SettingsView SRTP-Wrap section with a 64-hex TextField and a
    "Generate new key" button calling TurnBridgeGenerateWrapKey.
  - TurnBridgeApp.turnOnTunnel + PacketTunnelProvider plumb
    wrapKey through providerConfiguration into
    TurnBridgeSetWrapKey, called BEFORE StartProxy.

Server-side: requires the matching vk-turn-proxy server running with
-wrap -wrap-key=<hex>. Without that, every wrapped packet AEAD-fails
on the server side and no traffic flows — UI footer in Settings
documents this explicitly.
The Go-side TurnBridgeGenerateWrapKey C export isn't visible from the
main TurnBridge app target — Go is only linked into the
NetworkExtension. CI archive failed with "cannot find
'TurnBridgeGenerateWrapKey' in scope" because SettingsView is in the
main target.

Swap the Settings UI button to generate the key locally with
SecRandomCopyBytes from Security.framework. Same 32 bytes of kernel
CSPRNG entropy, same hex encoding, no cross-target linkage needed.
Dropped the unused TurnBridgeGenerateWrapKey declaration from
wireguard.h (the Go-side function is still kept in wrap_config.go in
case some future tooling wants it from the extension target).

Also adds docs/SERVER_COMPAT.md — an exhaustive spec for the engineer
patching the server side (Moroka8/vk-turn-proxy) to add WRAP support:
wire format, key+nonce layout, isServer direction bit, encrypt/decrypt
pseudocode, error handling rules, auto-detect for backwards compat
with un-wrapped clients, and a step-by-step rollout checklist.
The Settings view holds edits in a local `draft: VPNProfile?` and
only flushes to the store on .onDisappear. The TextField binding
reads/writes `draft`; my Generate button was writing directly to
`store.profiles[idx].wrapKey`, which the still-empty draft then
shadowed via `profile` (computed as `draft ?? store.…`) — TextField
stayed blank and the press appeared to do nothing.

Use the existing `binding(\.wrapKey).wrappedValue = newKey` setter so
the new key follows the same path as any keystroke. .onDisappear
then persists it just like any other edit.
Adds a third captcha mode alongside the existing off/forced toggle:

  0  off       — auto solver only; recycle on failure (today's default)
  1  forced    — every captcha to the UI immediately (today's "manual")
  2  fallback  — auto first, manual prompt ONLY when auto+cluster fail

Rationale: at N=60 the iOS solver + remote /cred cluster typically
land ~80-85% of identities cleanly, and the remaining 15-20% recycle
a stale identity. Fallback mode escalates THOSE specific failures to
a Safari sheet so the user gets a fresh identity for every session
without being prompted for the 80% the solver already won.

Implementation:
  - captcha_manual.go: replace bool enabled with int mode + two
    accessors (manualCaptchaForcedMode for the entry-of-solve check,
    manualCaptchaFallbackAvailable for the tail-of-solve check).
    Backwards-compatible: passing 1 still means forced.
  - vk_captcha.go: at the end of solveVkCaptcha, after the auto chain
    returns an error, consult manualCaptchaFallbackAvailable and
    requestManualCaptcha as a last-resort escalation. Successful
    manual solve credits the same captcha counter as auto would.
  - ManualCaptchaSetting.swift: CaptchaMode enum, mode getter/setter
    in UserDefaults under the new "manualCaptchaMode" key. Old
    "manualCaptcha" bool is read as a fallback so older installs
    don't lose their forced-vs-off preference, and written in
    parallel so older NetworkExtension builds (if any are still
    around) keep working.
  - GlobalSettingsView.swift: replace the on/off Toggle with a
    3-option Picker plus a footer that explains the tradeoff for
    each mode.
  - PacketTunnelProvider.swift: read the new int key, fall back to
    legacy bool, pass through to TurnBridgeSetManualCaptchaMode.
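
A minimal sketch of the Go-side mode switch described above. The accessor names come from the commit message; the setter name, the atomic storage and the clamping of out-of-range values to off are assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	modeOff      int32 = 0 // auto solver only
	modeForced   int32 = 1 // every captcha goes to the UI
	modeFallback int32 = 2 // UI only after auto + cluster fail
)

var manualCaptchaMode atomic.Int32

// setManualCaptchaMode stores the mode; unknown values are treated as off
// (an assumption, chosen so a bad setting cannot trigger prompts).
func setManualCaptchaMode(m int32) {
	if m < modeOff || m > modeFallback {
		m = modeOff
	}
	manualCaptchaMode.Store(m)
}

// manualCaptchaForcedMode is consulted at the entry of the solve.
func manualCaptchaForcedMode() bool { return manualCaptchaMode.Load() == modeForced }

// manualCaptchaFallbackAvailable is consulted after the auto chain fails.
func manualCaptchaFallbackAvailable() bool { return manualCaptchaMode.Load() == modeFallback }

func main() {
	setManualCaptchaMode(2)
	fmt.Println(manualCaptchaForcedMode(), manualCaptchaFallbackAvailable())
}
```

Because 1 keeps its old meaning, a caller that still passes the legacy boolean as 0/1 gets the same off/forced behaviour as before.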
PacketTunnelProvider had three lines still referencing the local
manualCaptchaEnabled bool that 1.3.21 removed — Swift can't find
the symbol so archive failed. They lived in the DTLS-ready-budget
calculation (the "give 30 s per session instead of 15 s when the
user is solving manually" branch) and the log line right below it.

Replace with a userPromptPossible check that's true for both forced
and fallback modes — even in fallback we need to give the user
wall-clock time for the small minority of sessions that do prompt.
Log line now uses the human-readable captchaModeLabel built earlier.

Also fix the legacy "manualCaptcha" bool that ManualCaptchaSetting
keeps in sync for routing-scope decisions. Was writing
`newValue == .forced` which left fallback mode without the kill-
switch relaxation needed for the Safari sheet to load. Now writes
`newValue != .off` so both forced and fallback get routing room.
User reports VK rejecting manually-solved captchas — both the auto
path and the user-clicked path fail. Root cause investigation found
three WKWebView fingerprint tells we were leaking:

1. **HTTP User-Agent missing the `Version/X` and `Safari/604.1` tokens.**
   WKWebView's default UA ends at `Mobile/15E148`, while real Mobile
   Safari sends `Version/18.0 Mobile/15E148 Safari/604.1`. That diff
   is one of the cheapest classifiers VK runs. Fix: set
   `webView.customUserAgent` to the full Safari iOS 18 UA so every
   HTTP request from the sheet carries the correct string.

2. **JS-side navigator.userAgent didn't match the HTTP UA.** Setting
   `customUserAgent` only patches outbound requests; `navigator.userAgent`
   in JS still returned the truncated WKWebView default, so VK's
   bootstrap script saw a UA mismatch (HTTP says Safari 18, JS says
   raw WebKit) — itself a fingerprint signal. Fix: inject a
   document-start script that defines `navigator.userAgent`,
   `navigator.appVersion`, `navigator.vendor`, `navigator.platform`,
   and `navigator.languages` to the Mobile Safari shape, strips
   `navigator.userAgentData` (real Safari iOS doesn't emit Client
   Hints), and removes the `webdriver` property entirely.

3. **websiteDataStore = .nonPersistent()** — every captcha sheet got
   a clean-room browser with zero vk.com cookies / localStorage /
   IndexedDB. Real users have weeks of vk.com state; a freshly-spawned
   browser environment is itself a bot tell. Fix: use
   .default() so vk.com session state persists across captcha
   sheets within the app. Different from the system Safari data store
   (that requires ASWebAuthenticationSession), but app-scoped
   persistence is enough for the classifier.

One concern remains unaddressed here: VK binds success_token to the
exact browser session that solved it, so even a successful WKWebView
solve fails when the token is later redeemed by Go's HTTP client.
If 1.3.23 still shows BOT/ERROR after the user
solves manually, the next step is to make the WKWebView do the
follow-up vchat.joinConversationByLink call itself (so token mint and
redemption happen in the same session) and pass the final TURN
credentials back instead of just the token.
The defensive layer against VK upgrading their classifier. Even with
the 1.3.23 WKWebView fingerprint fixes (Safari UA spoofing, persistent
data store, navigator.* mocks), VK can still distinguish "captcha was
solved here, token redeemed over there" because the success_token is
minted to a specific browser session and we then use it from Go's
HTTP client — different cookies, different TLS, different connection.

This commit makes the WebView do the follow-up VK API call inside its
own session, so VK never sees a session switch.

Wire shape:

  getCreds (Go)        — knows the failing request (URL + body
                         template with __TOKEN__ placeholder).
                         Stashes them in the manual captcha slot,
                         then asks Swift to solve.
  CaptchaBridge        — on publishRequest, calls
                         TurnBridgeGetManualCaptchaRetryRequest to
                         pull the retry params out of the Go slot,
                         packs them into PendingRequest alongside
                         the redirect_uri.
  CaptchaManager       — picks up PendingRequest, opens
                         CaptchaWebView with the retry params.
  CaptchaWebView (JS)  — handleSuccessToken: when the helper detects
                         success_token, instead of immediately
                         sending it native-side, fetch the retry URL
                         with retryBody (token swapped in) inside the
                         WebView's session. On 2xx, send the JSON
                         response as 'final_response'. On failure,
                         fall back to sending the raw token.
  CaptchaManager.submit(response:) — new API for the response path,
                         routes through CaptchaIPC.AppMessage.
                         responseJson.
  CaptchaBridge        — on captcha_answer, prefers responseJson
                         (calls new TurnBridgeSubmitManualCaptcha-
                         Response export), falls back to legacy
                         TurnBridgeSubmitManualCaptchaToken when only
                         a token came through.
  solveVkCaptcha (Go)  — return shape extended to (token, response,
                         err); response non-empty when the WebView
                         did the replay. Both callsites (forced and
                         fallback modes) propagate retryURL/retryBody
                         through.
  getCreds (Go)        — on non-empty response, JSON-parses it and
                         splices into the retry loop's resp var,
                         bypassing the legacy "build retry body, do
                         doRequest" branch entirely. On empty
                         response, falls through to the legacy retry
                         (which still fires for backwards-compat with
                         older Swift bridges that don't ship the
                         response path).
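
The last two getCreds steps can be sketched as below. The function name, the `doRequest` callback and the map-typed response are hypothetical; only the `__TOKEN__` placeholder and the "non-empty response bypasses the legacy retry" rule come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const tokenPlaceholder = "__TOKEN__"

// resolveRetry returns the parsed in-WebView response when one was delivered,
// otherwise it fills the body template with the token and runs the legacy
// retry through doRequest.
func resolveRetry(token, response, bodyTemplate string,
	doRequest func(body string) (map[string]any, error)) (map[string]any, error) {
	if response != "" {
		var resp map[string]any
		if err := json.Unmarshal([]byte(response), &resp); err != nil {
			return nil, fmt.Errorf("parse in-WebView response: %w", err)
		}
		return resp, nil
	}
	return doRequest(strings.ReplaceAll(bodyTemplate, tokenPlaceholder, token))
}

func main() {
	legacy := func(body string) (map[string]any, error) {
		return map[string]any{"sent": body}, nil
	}
	resp, err := resolveRetry("tok123", "", "success_token=__TOKEN__", legacy)
	fmt.Println(resp, err)
}
```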

ABI is preserved: the existing C callback signature for
TurnBridgeSetManualCaptchaCallback is unchanged. New exports
(TurnBridgeGetManualCaptchaRetryRequest,
TurnBridgeSubmitManualCaptchaResponse) extend the protocol; older
extension builds that don't link against them keep working in
token-only mode.

If VK later upgrades and starts rejecting in-WebView replays too
(e.g. by reading XHR body shape against a stricter origin policy),
the next layer would be to keep the WebView open across all three of
solve+replay+TURN allocation, but that's a much bigger architectural
shift and not warranted yet.
Three field-reported wedges:

1. **"Got response, finishing…" hangs forever.** CaptchaManager.submit
   was awaiting sendMessage BEFORE clearing pending — if the
   NETunnelProviderSession.sendProviderMessage IPC took its time
   (or hung), the sheet's binding stayed bound and the user had to
   kill the app. Fix: snapshot the message, clear pending first
   (sheet dismisses immediately), then fire IPC. The success/cancel
   message is always delivered, just async from the dismiss.

2. **"Attempt limit reached" leaves the user stranded.** VK renders
   that page server-side as plain HTML — no XHR for our fetch/XHR
   hooks to catch — so the only way to know is DOM polling. Added
   a 750 ms interval that scans document.body.innerText for both
   English and Russian variants of "attempt limit reached / try
   again later / попыток исчерпан". On match fires a new
   'terminal' message → CaptchaWebView dismisses with cancel
   (Go's requestManualCaptcha unblocks immediately instead of
   waiting its 180 s timeout).

3. **No Swift-side watchdog.** Go's requestManualCaptcha times out
   at 180 s but the UI didn't know. If the user walks away and
   neither solve nor terminal triggers, the sheet would persist
   past the backend's already-given-up state. Added a 175 s
   onAppear watchdog that auto-cancels — 5 s before Go's timeout
   to avoid a race where Go bails first and the UI's cancel
   targets an already-stale request.
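
On the Go side, the three ways a blocked request can end (answer, cancel/terminal, timeout) fit in one select. This is a sketch with hypothetical names and channel types, not the body of requestManualCaptcha; the real call would pass the 180 s timeout mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	errManualTimeout   = errors.New("manual captcha: timed out")
	errManualCancelled = errors.New("manual captcha: cancelled")
)

// waitForManualAnswer blocks until the UI delivers an answer, the UI cancels
// (including the 'terminal' page case), or the timeout elapses.
func waitForManualAnswer(answer <-chan string, cancel <-chan struct{}, timeout time.Duration) (string, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case tok := <-answer:
		return tok, nil
	case <-cancel:
		return "", errManualCancelled
	case <-timer.C:
		return "", errManualTimeout
	}
}

func main() {
	cancel := make(chan struct{})
	close(cancel) // e.g. the DOM poller saw "attempt limit reached"
	_, err := waitForManualAnswer(nil, cancel, 180*time.Second)
	fmt.Println(err)
}
```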
User reports stuck sheets after several successful manual solves —
the actual cause is that we keep firing the prompt for every session
in a pool of 60, regardless of how many times the user has already
helped. Even in fallback mode this happens any time auto + cluster
both fail on a single identity. Users abandon around prompt 5-10 and
the un-dismissed sheets look like a bug.

Hard cap: 5 manual prompts per StartProxy session. Past that:

- forced mode → manualCaptchaForcedMode() returns false, auto chain
  takes over (and will likely fail for the same reasons the user
  was being prompted in the first place, leading to identity recycle)
- fallback mode → manualCaptchaFallbackAvailable() returns false,
  caller bails to identity recycle instead of escalating

Either way the user is no longer prompted past five. Identity pool
degrades to the recycle path for the remainder of the session, which
keeps the tunnel up with reused TURN creds — still better than the
"stuck sheet" experience.

Quota is per-StartProxy; resetManualCaptchaQuota() runs at the top of
StartProxy so each fresh connect attempt grants the full five again.
Counter is atomic so the in-flight check inside requestManualCaptcha
can't be raced by two solvers slipping past the gate at once.
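
One way to get a race-free cap is a compare-and-swap loop on an atomic counter, sketched below. `resetManualCaptchaQuota` is named in the commit; `tryConsumeManualPrompt` and the counter variable are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const manualCaptchaQuotaPerSession = 5

var manualPromptsUsed atomic.Int32

// resetManualCaptchaQuota runs at the top of StartProxy.
func resetManualCaptchaQuota() { manualPromptsUsed.Store(0) }

// tryConsumeManualPrompt claims one prompt slot. The CAS loop guarantees that
// concurrent callers can never push the counter past the cap.
func tryConsumeManualPrompt() bool {
	for {
		n := manualPromptsUsed.Load()
		if n >= manualCaptchaQuotaPerSession {
			return false
		}
		if manualPromptsUsed.CompareAndSwap(n, n+1) {
			return true
		}
	}
}

func main() {
	resetManualCaptchaQuota()
	for i := 1; i <= 7; i++ {
		fmt.Println(i, tryConsumeManualPrompt())
	}
}
```

A plain increment followed by a comparison would also work but can overshoot the counter under contention; the CAS form keeps the stored value equal to the number of prompts actually granted.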
User reports the captcha sheet still hangs after several solves
despite the 1.3.26 quota. Two real bugs found in the dismiss path:

1. **Cancel button gated by didFinish.** The Cancel toolbar item
   guarded on `!didFinish` and returned silently if didFinish was
   already true. That meant: if handleSuccessToken (or the
   onResponse callback) had fired but the subsequent IPC / sheet
   dismiss got wedged for any reason (NETunnelProviderSession busy,
   binding not propagating, downstream Go bail-out), the user's
   only escape — the Cancel button — also did nothing. Drop the
   guard. Cancel is now idempotent on both Swift and Go sides
   (Go's TurnBridgeCancelManualCaptcha handles missing requestID
   gracefully) so double-cancel doesn't break anything.

2. **XHR detection blind to addEventListener('load').** Our hook
   wraps `xhr.onreadystatechange` via property assignment. Code
   that subscribes via `xhr.addEventListener('load', ...)` instead
   fires its handlers without our wrapper ever running — we miss
   the success_token in the response and the sheet looks stuck.
   Adds a second hook in xhr.send that calls
   `xhr.addEventListener('load', ...)` to inspect responseText
   regardless of how the page wired its own handler. Wrapped in
   try/catch so an exception on either path doesn't poison the
   other.

If 1.3.27 still wedges in the wild, the next things to investigate
are: (a) the @StateObject vs ObservableObject quirk in TurnBridgeApp's
captchaManager binding (might not be propagating pending=nil to the
sheet item binding), and (b) detection via DOM mutation observers
for the green-checkmark state when VK no longer fires any XHR at all.
Field log from N=60 startup showed three requestManualCaptcha calls
firing within the same second. Each one wrote a PendingRequest into
the App Group UserDefaults under the same key, last write wins. The
Swift sheet kept swapping URLs under the user's fingers — what looked
like a "stuck on green check" was actually request N's page being
overwritten by request N+1's URL mid-flight.

Add a 1-buffered channel as a binary semaphore at the entry of
requestManualCaptcha. Only one caller at a time gets to publishRequest
and block on the answer; the rest wait in line. Quota check happens
AFTER acquiring the slot so we only burn quota for prompts the user
actually sees.
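
A sketch of the 1-buffered channel used as a binary semaphore. The helper names are hypothetical; `maxConcurrentPrompts` exists only to demonstrate that no two callers are ever inside the slot together.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// manualPromptSlot has capacity 1: a send succeeds only while no other
// caller holds the slot.
var manualPromptSlot = make(chan struct{}, 1)

// withManualPromptSlot runs fn while holding the slot. A quota check would
// go inside fn, i.e. after acquisition, so waiting callers burn no quota.
func withManualPromptSlot(fn func()) {
	manualPromptSlot <- struct{}{}        // acquire (blocks while held)
	defer func() { <-manualPromptSlot }() // release
	fn()
}

// maxConcurrentPrompts starts `callers` goroutines and reports the highest
// number that were inside the slot simultaneously.
func maxConcurrentPrompts(callers int) int32 {
	var active, peak atomic.Int32
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			withManualPromptSlot(func() {
				n := active.Add(1)
				for {
					p := peak.Load()
					if n <= p || peak.CompareAndSwap(p, n) {
						break
					}
				}
				active.Add(-1)
			})
		}()
	}
	wg.Wait()
	return peak.Load()
}

func main() {
	fmt.Println(maxConcurrentPrompts(3)) // prints 1
}
```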

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Distinct from the parallel 1.3.27 cancel-button fix — this carries
the manual-prompt serialisation change on top.

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
User report: 1.3.28 still showed 5 sequential captcha sheets at
N=60 forced mode. Serialisation lined them up but didn't reduce
the count — and the auto-fallback after quota exhaustion lost at
slider step (status: ERROR), leaving 7/60 sessions usable.

User's hard ask: never more than ~3 manual captchas per StartProxy.
After that, everything goes to the remote captcha-service cluster.

Three coupled changes:

1. manualCaptchaQuotaPerSession 5 -> 3. The visible cap.
2. remoteHandoverThreshold 3 -> 1. As soon as ONE WG session is
   actually up (fanout active, DTLS handshake completed) any new
   getCredsRouted call prefers the server.
3. errDeferToRemote sentinel. Goroutines that committed to
   the local path at startup (when sessions_ready=0) re-check
   AFTER acquiring the manual-prompt serialise lock. If
   sessions_ready>=1 by then, they exit with errDeferToRemote
   instead of showing a sheet. getCredsRouted unwraps the
   sentinel and retries via getCredsRemote. Quota exhaustion
   takes the same path when remote is available — no more silent
   fallback to the auto solver that VK currently rate-limits at
   slider step.

manualCaptchaForcedMode() no longer consults quota — that became
single-source-of-truth inside requestManualCaptcha so the defer
and quota-exhaust paths both go through one decision point. Error
wrapping in turn_proxy.go switched from %v to %w so errors.Is
sees the sentinel through the call stack.
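
The %v versus %w point can be shown in a few lines. `errDeferToRemote` and `getCredsRouted` are named in the commit; the simplified signatures, `solveLocal` and the wrapper helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var errDeferToRemote = errors.New("defer to remote captcha-service")

// wrapV flattens the error to text; errors.Is can no longer find the sentinel.
func wrapV(err error) error { return fmt.Errorf("turn_proxy: %v", err) }

// wrapW keeps the error chain intact.
func wrapW(err error) error { return fmt.Errorf("turn_proxy: %w", err) }

func isDefer(err error) bool { return errors.Is(err, errDeferToRemote) }

// solveLocal stands in for the local path: once a session is ready it exits
// with the sentinel instead of showing a sheet.
func solveLocal(sessionsReady int) (string, error) {
	if sessionsReady >= 1 {
		return "", wrapW(errDeferToRemote)
	}
	return "local-creds", nil
}

// getCredsRouted unwraps the sentinel and retries via the remote path.
func getCredsRouted(sessionsReady int, remote func() (string, error)) (string, error) {
	creds, err := solveLocal(sessionsReady)
	if isDefer(err) {
		return remote()
	}
	return creds, err
}

func main() {
	fmt.Println(isDefer(wrapV(errDeferToRemote)), isDefer(wrapW(errDeferToRemote)))
	// prints: false true
}
```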

https://claude.ai/code/session_01TV73kDE6EgfVBEUBHbvYbY
Field log 1.3.27: tunnel stopped (reason 9 = configurationDisabled) 4s after the first captcha sheet appeared; the user kept solving captchas into a dead session for 20+ minutes because nothing dismissed the sheet. Now stopTunnel clears the pending request and posts the (previously declared but unused) cancel Darwin notification; the app observes it and drops the sheet.
fix(captcha): cap reconnect-driven retry storm + retry direct after cooldown
Under hard blocking the first VK identities must be hand-solved before any tunnel or captcha-service exists. This makes that path stop reading as a bot.

- bootstrap-manual-first: while captchaSessionsReady==0 (and prompts opted in), solveVkCaptcha skips the tls-client auto attempt and goes straight to the manual sheet. The auto attempt draws status:BOT and poisons the session/IP the user is about to solve in real WebKit. Falls through to auto on errDeferToRemote.
- cookie warm-up: load m.vk.com before the captcha so the persistent store has organic vk.com state instead of a cold session (3s hard cap).
- in-session replay logging: make the cross-origin getAnonymousToken replay branch explicit in logs (replay OK vs CORS/CSP FALLBACK) so we can tell solve-failure from redeem-failure on device.
- maxTouchPoints=5 parity; experimental ASWebAuthenticationSession real-Safari path (opt-in, not wired) + docs/MANUAL_CAPTCHA.md.
…stics)

Inject a logVerdict() that surfaces status (BOT/OK) + show_captcha_type on every captchaNotRobot.* response, deduped. In forced mode the auto chain is already skipped, so a failing manual solve must be diagnosed at the check step itself: BOT-at-check = fingerprint/dirty-IP rejection in-WebView, vs token-captured-but-replay-fell-back. Previously the sheet just hung to the watchdog with nothing logged.
Field log: after the first hand-solve succeeds, the next sheet renders an already-checked NotRobot box the user cannot re-solve — VK rate-limits success_token per source IP (~60s, their ERROR_LIMIT), and the shared cookie jar carries the solved state. The second prompt slipped past shouldDeferToRemoteNow() in the ~1s race before captchaSessionsReady ticked up.

Now requestManualCaptcha records the last successful manual solve and, within manualCaptchaPerIPCooldown (60s), refuses to show another sheet: defers to the remote captcha-service if configured, else fails fast so the caller reuses an identity. One hand-solve bootstraps a session; the server (its own IPs + WARP) earns the rest. Reset per StartProxy.
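
A minimal sketch of that cooldown decision. `manualCaptchaPerIPCooldown` is named in the commit; the gate type, the decision constants and the `at` clock helper are hypothetical, and the clock is passed in explicitly so the logic can be exercised without waiting.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const manualCaptchaPerIPCooldown = 60 * time.Second

type promptDecision int

const (
	showSheet     promptDecision = iota // no recent solve: prompt the user
	deferToRemote                       // inside cooldown, remote configured
	failFast                            // inside cooldown, no remote: reuse identity
)

type cooldownGate struct {
	mu        sync.Mutex
	lastSolve time.Time // zero until the first successful manual solve
}

func (g *cooldownGate) recordSolve(now time.Time) {
	g.mu.Lock()
	g.lastSolve = now
	g.mu.Unlock()
}

// reset clears the state; the commit resets it per StartProxy.
func (g *cooldownGate) reset() {
	g.mu.Lock()
	g.lastSolve = time.Time{}
	g.mu.Unlock()
}

func (g *cooldownGate) decide(now time.Time, remoteConfigured bool) promptDecision {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.lastSolve.IsZero() || now.Sub(g.lastSolve) >= manualCaptchaPerIPCooldown {
		return showSheet
	}
	if remoteConfigured {
		return deferToRemote
	}
	return failFast
}

// at is a tiny fake clock for the example.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	var g cooldownGate
	g.recordSolve(at(1000))
	fmt.Println(g.decide(at(1030), true) == deferToRemote)
}
```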