diff --git a/src/content/docs/cloudflare-one/team-and-resources/devices/cloudflare-one-client/business-continuity.mdx b/src/content/docs/cloudflare-one/team-and-resources/devices/cloudflare-one-client/business-continuity.mdx
new file mode 100644
index 00000000000..9049acb3ed4
--- /dev/null
+++ b/src/content/docs/cloudflare-one/team-and-resources/devices/cloudflare-one-client/business-continuity.mdx
@@ -0,0 +1,295 @@
+---
+pcx_content_type: solution-guide
+title: Business Continuity Guide
+description: Build a business continuity strategy for the Cloudflare One Client using available disconnection mechanisms and decision guidance for service degradation scenarios.
+products:
+  - cloudflare-one
+sidebar:
+  order: 10
+---
+
+This guide helps you build business continuity strategies for the Cloudflare One Client by documenting available disconnection mechanisms and providing decision guidance for handling service degradation or infrastructure unavailability.
+
+## Current resilience posture
+
+The Cloudflare One Client operates on Cloudflare's globally distributed network with 300+ points of presence (PoPs) worldwide. Anycast routing automatically directs client connections to the nearest healthy PoP without manual intervention. The client maintains locally cached policies and continues enforcing security controls even when unable to reach Cloudflare's management systems.
+
+For detailed architecture information, refer to the [Cloudflare One Client documentation](/cloudflare-one/team-and-resources/devices/cloudflare-one-client/) and the [Cloudflare Network and Service Resilience Whitepaper](https://cf-assets.www.cloudflare.com/slt3lc6tev37/7ad0dpR3YyqxMlikPfbBgn/020b7450909f03ccf3c7dcfb0e99fc2e/Resilience_Whitepaper.pdf).
+
+## Fail-open vs. fail-closed decisions
+
+:::note
+
+**Critical decision framework**
+
+The Cloudflare One Client operates in **fail-closed mode by default**: if the client cannot reach Cloudflare, it remains connected and blocks traffic rather than failing open to unprotected Internet access. This protects your security posture but requires active decision-making during incidents.
+
+**When to fail open** (Cloudflare One Client stops trying to connect and allows network connectivity without Cloudflare One Client protections):
+
+- User productivity is critically impaired and business operations are at risk
+- Emergency access to non-protected resources is required
+- Forensic investigation requires raw traffic visibility
+
+**When to fail closed** (Cloudflare One Client blocks network connectivity until it can re-establish a tunnel to Cloudflare):
+
+- Cloudflare edge services are operational (traffic is processing normally)
+- Only management dashboard is unavailable (policies continue enforcing)
+- Regulatory/compliance requirements prohibit unfiltered Internet access
+- Security incident requires maintaining visibility and control
+
+:::
+
+The mechanisms below help you execute fail-open decisions when needed. Document your decision criteria in advance and ensure appropriate stakeholders have authorization to trigger disconnection.
+
+## Customer impact and decision guidance
+
+
+| Scenario | Mechanism | Guidance | Prerequisites and limitations |
+|---|---|---|---|
+| **Complete unavailability during Cloudflare infrastructure outage**<br />**Example:** Cloudflare management systems unreachable; Global Disconnection unavailable but users need Internet access to maintain business operations. | **External Emergency Disconnect**<br />A customer-hosted HTTPS endpoint that clients poll for disconnect signals, operating independently of Cloudflare infrastructure. | **Use when:** Cloudflare's management systems are unreachable but you need to disconnect clients to restore Internet access.<br />**Guidance:** Pre-configure this mechanism before outages occur. During an incident, update your endpoint to return the disconnect signal.<br />**Expected outcome:** Clients disconnect within 1–2 polling intervals (the interval is configurable; the default is 60 seconds); users regain direct Internet access without security controls. | **Prerequisites:**<br />**Limitations:**<br />**Security impact:** Loss of all Zero Trust controls (same as Global Disconnection). |
+| **Complete unavailability of client connectivity**<br />**Example:** Client cannot establish secure tunnel; users unable to access protected applications or filtered Internet. | **Global Disconnection**<br />Instantly disconnect all Cloudflare One Clients from the secure tunnel via the dashboard or API. | **Use when:** You need immediate fleet-wide disconnection and Cloudflare's management systems are reachable.<br />**Guidance:** Check the Cloudflare status page first. If Cloudflare infrastructure is experiencing issues, this mechanism may be unavailable; use External Emergency Disconnect instead.<br />**Expected outcome:** All clients disconnect within seconds; users have direct Internet access without filtering, threat protection, or private application connectivity. | **Prerequisites:**<br />**Limitations:**<br />**Security impact:** Loss of all Zero Trust controls. |
+| **Individual device issue requiring immediate local override**<br />**Example:** Single user locked out due to policy misconfiguration; client switch disabled but user needs emergency access. | **Admin Override Codes**<br />Time-limited, single-use codes allowing IT administrators to temporarily unlock client settings on a specific device. | **Use when:** An individual device requires immediate attention. This is the only option for iOS and Android users when External Emergency Disconnect is unavailable.<br />**Guidance:** Generate the code in the dashboard, provide it to the user over a secure channel, and have the user enter it locally to temporarily bypass the locked switch.<br />**Expected outcome:** Temporary local override allowing the user to disconnect the client for one hour. | **Prerequisites:**<br />**Limitations:**<br />**Security impact:** Single device loses Zero Trust controls for one hour. |
+| **Degraded performance impacting user productivity**<br />**Example:** High latency through client tunnel; intermittent connection drops affecting work quality. | **Graduated response strategy**<br />Use a combination of mechanisms based on scope and severity. Use Digital Experience (DEX) to determine scope and severity. | **Guidance by scope:**<br />**Decision factors:** Balance user productivity needs against security requirements. For regulated industries, consult your compliance team before disconnecting.<br />**Expected outcome:** Restored user productivity with a documented security trade-off. | **Prerequisites:**<br />**Limitations:**<br />**Security impact:** Scope-dependent; refer to individual mechanism entries above. |
+| **Management dashboard unavailable, traffic processing normally**<br />**Example:** Dashboard and API unreachable; edge services and client connections remain functional with cached policies. | **No action required**<br />Edge services continue operating using cached configurations. New configuration changes will be unavailable until management systems recover. | **Use when:** Cloudflare's management systems are unavailable but user traffic continues processing normally.<br />**Guidance:** Monitor the Cloudflare status page. No customer action is typically required because edge services enforce cached policies until management systems recover.<br />**Expected outcome:** Existing configuration continues to apply; configuration changes resume when management systems recover. | **Prerequisites:**<br />**Limitations:**<br />**Security impact:** None. Security controls remain active. |
+
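The decision guidance in the table can be captured as a small runbook helper. The following is a minimal sketch in Python, not part of the Cloudflare One Client or any Cloudflare API: the `Incident` fields, the `choose_mechanism` function, and the `disconnect_window_seconds` helper are hypothetical names introduced for illustration. It assumes that degraded-performance incidents are handled separately through the graduated response strategy, and that the three boolean inputs are assessed by your own monitoring.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    """Mechanism names as they appear in the decision table."""
    NO_ACTION = "No action required"
    ADMIN_OVERRIDE_CODES = "Admin Override Codes"
    GLOBAL_DISCONNECTION = "Global Disconnection"
    EXTERNAL_EMERGENCY_DISCONNECT = "External Emergency Disconnect"


@dataclass
class Incident:
    """Hypothetical incident summary; field names are illustrative."""
    traffic_flowing: bool       # user traffic still passes through the client tunnel
    management_reachable: bool  # Cloudflare dashboard and API respond
    fleet_wide: bool            # many devices affected, not just one


def choose_mechanism(incident: Incident) -> Mechanism:
    """Map an incident to the mechanism suggested by the table."""
    if incident.traffic_flowing:
        # Cached policies keep enforcing, so no disconnection is needed.
        return Mechanism.NO_ACTION
    if incident.management_reachable:
        # Both remaining options are triggered from the dashboard or API.
        if incident.fleet_wide:
            return Mechanism.GLOBAL_DISCONNECTION
        return Mechanism.ADMIN_OVERRIDE_CODES
    # The only listed mechanism that operates independently of Cloudflare.
    return Mechanism.EXTERNAL_EMERGENCY_DISCONNECT


def disconnect_window_seconds(poll_interval: int = 60) -> tuple[int, int]:
    """Expected disconnect delay: one to two polling intervals."""
    return (poll_interval, 2 * poll_interval)


if __name__ == "__main__":
    outage = Incident(traffic_flowing=False, management_reachable=False, fleet_wide=True)
    print(choose_mechanism(outage).value)
    print(disconnect_window_seconds())
```

With the default 60-second polling interval, the helper reports an expected disconnect window of 60 to 120 seconds, which matches the "1–2 polling intervals" guidance. Treat the mapping as a starting point for your own runbook and adjust it to your documented decision criteria.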