Make proxy CA selectable via --proxy-ca flag#50
Open
Conversation
The Bitmaker MITM proxy presents TLS certs signed by its own CA (<BM Proxy CA>), which spiders did not trust by default, causing SSL: CERTIFICATE_VERIFY_FAILED when proxying HTTPS requests. This change ships the proxy CA cert as a package asset and the generated Dockerfile-estela now installs it into the system trust store via update-ca-certificates. REQUESTS_CA_BUNDLE / SSL_CERT_FILE / CURL_CA_BUNDLE env vars are set so requests, Twisted (Scrapy) and curl pick it up; Chromium uses the system NSS store. Validated end-to-end on hetzner-staging: a Scrapy spider going through the proxy now completes HTTPS requests successfully. Bumps version 0.3.3 -> 0.3.4.
Refactors the previous Bitmaker-specific bundling so the CLI is
useful to any user with their own MITM proxy, while keeping the
Bitmaker default for backwards-compatible behavior.
Changes:
- Move assets/bitmaker-proxy-ca.crt to assets/certs/bitmaker.crt;
the assets/certs/ folder now holds any number of bundled CAs.
- estela init learns a --proxy-ca <name|path|none> option:
* <name> -> assets/certs/<name>.crt (bundled)
* <path> -> any absolute or relative .crt/.pem file
* none -> skip CA installation entirely
Default: bitmaker (keeps current Bitmaker flow with no extra flag).
- The selected name is persisted as project.proxy_ca in estela.yaml.
- Generated Dockerfile-estela copies the cert as .estela/proxy-ca.crt
(fixed filename, decoupled from the original cert name) and only
emits the CA install block when proxy_ca != none.
- New `estela list certs` command lists bundled certs.
- setup.py package_data updated to include assets/certs/*.crt.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds optional support for installing a custom CA certificate into the spider Docker image generated by
estela init. This enables spiders to trust internal/private CAs — useful for users behind corporate TLS-inspecting proxies, MITM debugging tools, or air-gapped environments with private PKIs.Motivation
When a spider runs behind a TLS-intercepting proxy (e.g.,
mitmproxy, corporate egress proxy, internal CA-signed services), HTTPS requests fail withSSL: CERTIFICATE_VERIFY_FAILEDbecause the proxy's CA isn't in the system trust store. Today, the only workaround is editing the generatedDockerfile-estelaby hand after everyestela init, which doesn't survive regeneration. This PR makes the CA injection a first-class, declarative option.Design
assets/certs/folder can hold any number of bundled CA certsestela init --proxy-ca <name|path|none>:<name>→ resolves toassets/certs/<name>.crt(bundled with the CLI)<path>(.crt/.pem, contains/, starts with.) → any local filenone→ no CA installed (current behavior, no Dockerfile changes)estela.yamlasproject.proxy_caso subsequentestela initcalls or other tooling can read itupdate-ca-certificatesand exportsREQUESTS_CA_BUNDLE/SSL_CERT_FILE/CURL_CA_BUNDLEsorequests, Twisted (Scrapy),curl, and Chromium (via NSS) all pick it upproxy_ca = none, the generated Dockerfile is identical to today'sestela list certscommand for users to discover what's bundledsetup.pypackage_dataupdated to shipassets/certs/*.crtBackwards compatibility
The flag has a default value, so users who don't opt in get a Dockerfile equivalent to today's (no CA installed, no extra layers, no behavior change).
Coverage
DOCKERFILE)DOCKERFILE_SELENIUM)Verifications
resolve_proxy_ca_sourcecorrectly handles bundled names, absolute paths, relative paths,.pemextensions, andnoneestela list certslists bundled certs as expectedestela init --helpexposes--proxy-cawith the documented default and behaviorESTELA_PROXY_NAMEenv var picked up by the existingEstelaProxyMiddleware)docker buildof a generated Dockerfile —update-ca-certificatesregisters the cert and Python's default SSL context picks it up (one extra CA in the trust store, as expected)Notes
estela initre-run to regenerateDockerfile-estelaand pick up the new templatedocker build— reviewer welcome to spot-check