Add verify-drift subcommand for post-migration plan validation #236

Open
tamas-jozsa wants to merge 1 commit into main from drift-exemption-verification

@tamas-jozsa
Collaborator

Adds tf-migrate verify-drift --file <plan> which reads a terraform plan produced after a v4→v5 migration, auto-detects the Cloudflare resources present, and checks every change against the bundled list of known migration drift exemptions.
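
The auto-detection step can be pictured with a small sketch. This is not the PR's `DetectResourcesFromPlan`; `detectResourceTypes` and its regular expression are hypothetical, and they assume resource types can be read from `cloudflare_<type>.` segments in plan addresses. The tool itself may derive its "Resources detected" list differently.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// resourceAddr captures the type part of a Cloudflare resource address, e.g.
// "argo_tiered_caching" in "module.argo.cloudflare_argo_tiered_caching.example".
// Assumption: this pattern is illustrative, not taken from tf-migrate.
var resourceAddr = regexp.MustCompile(`\bcloudflare_([a-z0-9_]+)\.`)

// detectResourceTypes (hypothetical helper) returns the sorted, de-duplicated
// Cloudflare resource types mentioned anywhere in the plan text.
func detectResourceTypes(plan string) []string {
	seen := map[string]bool{}
	for _, line := range strings.Split(plan, "\n") {
		for _, m := range resourceAddr.FindAllStringSubmatch(line, -1) {
			seen[m[1]] = true
		}
	}
	types := make([]string, 0, len(seen))
	for t := range seen {
		types = append(types, t)
	}
	sort.Strings(types) // sorted output, independent of map iteration order
	return types
}

func main() {
	plan := strings.Join([]string{
		"  # module.argo.cloudflare_argo_tiered_caching.a will be created",
		"  # module.access_rule.cloudflare_access_rule.b will be updated in-place",
		"  # module.argo.cloudflare_argo_tiered_caching.c will be created",
	}, "\n")
	fmt.Println(detectResourceTypes(plan)) // [access_rule argo_tiered_caching]
}
```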

  • New internal/verifydrift package: Verify(), groupExemptedLines(), parseExemptionTag(), PrintReport() with coloured output
  • Embedded exemption YAML files (exemptions/) synced from e2e/ at build time
  • Five exported functions added to internal/e2e-runner/drift.go: DetectResourcesFromPlan, CheckDriftWithConfig, ParseDriftExemptionsConfig, MergeAndCompileExemptions, LoadDriftExemptionsFromDir
  • MergeAndCompileExemptions uses stable ordered merge with implicit resource type scoping and a sort pass so constrained exemptions are checked first, fixing a map-iteration ordering bug that caused wrong rule attribution
  • make sync-exemptions target keeps embedded YAMLs in sync with e2e/ source; build and build-e2e both depend on it
  • Updated all e2e/ drift exemption descriptions to be customer-facing
  • Exit code 0 = clean, 1 = unexpected drift found (CI-friendly)
  • README documents the workflow, example output, and command flags

./bin/tf-migrate verify-drift --file v5-plan.log

Cloudflare Terraform Migration - Drift Verification
====================================================
Plan file:          v5-plan.log
Resources detected: access_rule, argo, certificate_pack, custom_hostname, load_balancer, logpush_job, ruleset, spectrum_application, tiered_cache, zero_trust_access_identity_provider, zero_trust_device_profiles, zero_trust_dlp_predefined_profile, zero_trust_gateway_policy, zero_trust_gateway_settings, zero_trust_local_fallback_domain, zero_trust_tunnel_cloudflared_config, zone_dnssec

✓ Exempted Changes  (17 rule(s) matched, 58 change(s))
────────────────────────────────────────────────────
  Rule:    html_entity_encoding
  Reason:  The v4 and v5 providers encode special characters differently in string fields. The v4 provider stored HTML entities (e.g. &#39; for single quotes, &amp; for ampersands) while the v5 provider uses the literal characters. The underlying values are identical — this is a cosmetic representation difference that resolves after the next terraform apply.
  Changes:
      module.access_rule.cloudflare_access_rule.long_notes: ~ notes         = "This is a very long note field that tests the handling of extended text content. It includes multiple sentences and various punctuation marks! Does the migration handle this correctly? We&#39;ll find out." -> "This is a very long note field that tests the handling of extended text content. It includes multiple sentences and various punctuation marks! Does the migration handle this correctly? We'll find out."
      module.access_rule.cloudflare_access_rule.special_chars: ~ notes         = "Block: &#34;suspicious&#34; IP with &#39;quotes&#39; &amp; special chars! #security" -> "Block: \"suspicious\" IP with 'quotes' & special chars! #security"

  Rule:    argo_tiered_caching_creation
  Reason:  The v4 cloudflare_argo resource has been split into two separate v5 resources: cloudflare_argo_smart_routing and cloudflare_argo_tiered_caching. The tiered caching resource does not exist in your v4 state and must be created fresh. This is expected — run terraform apply to create it and this change will not reappear.
  Changes:
      module.argo.cloudflare_argo_tiered_caching.both_with_lifecycle_tiered:   # module.argo.cloudflare_argo_tiered_caching.both_with_lifecycle_tiered will be created
      module.tiered_cache.cloudflare_argo_tiered_caching.generic_with_lifecycle:   # module.tiered_cache.cloudflare_argo_tiered_caching.generic_with_lifecycle will be created

  Rule:    computed_value_refreshes
  Reason:  Some attributes are marked as '(known after apply)' because their value is computed server-side and cannot be known until after the next apply. This is normal Terraform behavior and not caused by the migration. These changes are safe to ignore — they will resolve automatically after running terraform apply.
  Changes:
      module.custom_hostname.cloudflare_custom_hostname.no_settings: - settings              = {} -> null
      module.load_balancer.cloudflare_load_balancer.e2e_all_attributes: - pool_weights   = {} -> null
      module.load_balancer.cloudflare_load_balancer.e2e_random_steering: - pool_weights   = {} -> null

  Rule:    azure_conditional_access
  Reason:  After migration, the v5 provider's state upgrader does not carry forward the 'conditional_access_enabled' field for Azure identity providers. The plan will show this field being set to false. Your Azure integration is not affected — conditional access enforcement is unchanged. This drift resolves after the next terraform apply.
  Changes:
      module.zero_trust_access_identity_provider.cloudflare_zero_trust_access_identity_provider.azure: + conditional_access_enabled = false
      module.zero_trust_access_identity_provider.cloudflare_zero_trust_access_identity_provider.azure_full: + conditional_access_enabled = false

  Rule:    azure_support_groups
  Reason:  After migration, the v5 provider's state upgrader does not carry forward the 'support_groups' field for Azure identity providers. The plan will show this field being set to false. Your Azure group sync configuration is not affected. This drift resolves after the next terraform apply.
  Changes:
      module.zero_trust_access_identity_provider.cloudflare_zero_trust_access_identity_provider.azure_full: + support_groups             = false

  Rule:    saml_idp_public_certs
  Reason:  The v4 provider stored a single SAML IdP public certificate as 'idp_public_cert'. The v5 provider uses 'idp_public_certs' (a list). The state upgrader does not automatically convert the singular field to the list, so the plan will show 'idp_public_certs' as changing. Your SAML identity provider and its certificate are not affected. This drift resolves after the next terraform apply.
  Changes:
      module.zero_trust_access_identity_provider.cloudflare_zero_trust_access_identity_provider.saml: + idp_public_certs = [

  Rule:    enabled_entries_empty_to_null
  Reason:  The Cloudflare API returns 'enabled_entries' as an empty list ([]) for predefined DLP profiles even when no entries are explicitly enabled. The v5 provider represents an absent value as null in config. This causes a perpetual '[] -> null' difference that cannot be eliminated through configuration — the API contract does not allow both 'entries' and 'enabled_entries' to be set at the same time. This is a known provider behaviour and does not affect your DLP policy enforcement.
  Changes:
      module.zero_trust_dlp_predefined_profile.cloudflare_zero_trust_dlp_predefined_profile.credentials_and_secrets: - enabled_entries      = [] -> null

  Rule:    precedence_normalization
  Reason:  The v5 provider renumbers gateway policy precedence values during the state upgrade — large legacy values (e.g. 200895) are replaced with smaller sequential numbers. The relative ordering of your policies is preserved. This change is safe and resolves after the next terraform apply.
  Changes:
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.complex: ~ precedence     = 400412 -> 400
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.conditional_enabled[0]: ~ precedence     = 800155 -> 800
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.environment_policies["development"]: ~ precedence     = 602059 -> 602
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.environment_policies["production"]: ~ precedence     = 601460 -> 601
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.environment_policies["staging"]: ~ precedence     = 600373 -> 600
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.environment_policies["testing"]: ~ precedence     = 603214 -> 603
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.l4_override_detailed: ~ precedence     = 1400392 -> 1400
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.maximal: ~ precedence     = 150973 -> 150
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.minimal: ~ precedence     = 100508 -> 100
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.policy_configs["allow_internal"]: ~ precedence     = 5100191 -> 5100
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.policy_configs["audit_api"]: ~ precedence     = 5300988 -> 5300
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.policy_configs["block_malware"]: ~ precedence     = 5200486 -> 5200
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.simple_resolver: ~ precedence     = 500939 -> 500
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.tiered_policies[0]: ~ precedence     = 700278 -> 700
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.tiered_policies[1]: ~ precedence     = 710864 -> 710
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.tiered_policies[2]: ~ precedence     = 720586 -> 720
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_audit_ssh: ~ precedence     = 1500771 -> 1500
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_biso: ~ precedence     = 1700155 -> 1700
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_check_session: ~ precedence     = 1600636 -> 1600
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_interpolation: ~ precedence     = 1200615 -> 1200
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_join: ~ precedence     = 1000415 -> 1000
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_lifecycle: ~ precedence     = 1210432 -> 1210
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_nested_blocks: ~ precedence     = 300257 -> 300
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_override_ips: ~ precedence     = 1900477 -> 1900
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_payload_log: ~ precedence     = 1800037 -> 1800
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_prevent_destroy: ~ precedence     = 1300744 -> 1300
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_settings: ~ precedence     = 200765 -> 200

  Rule:    audit_ssh_computed_defaults
  Reason:  The 'command_logging' field inside an 'audit_ssh' block is a computed field that defaults to true when an SSH audit rule is created. The v5 provider always reads this value back from the API, which may cause it to appear in the plan even when you have not set it explicitly. Your SSH audit logging configuration is not affected.
  Changes:
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.complex: + audit_ssh                          = {
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.complex: + command_logging = true
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_audit_ssh: + audit_ssh                          = {
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_audit_ssh: + command_logging = true

  Rule:    duration_normalization
  Reason:  The v5 provider normalises duration values to Go's full duration format when reading from the API. For example, a value you set as '24h' in config will be read back as '24h0m0s'. This is a display-only difference — the actual duration enforced by the gateway policy is identical. This is a known v5 provider behaviour that may persist across plans.
  Changes:
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.complex: ~ duration = "24h0m0s" -> "24h"
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_check_session: ~ duration = "12h0m0s" -> "12h"

  Rule:    rule_settings_computed_defaults
  Reason:  The fields 'block_page_enabled' and 'ip_categories' are computed defaults set by the Cloudflare API and not required in your config. The v5 provider reads them back on every plan and may show them as changing to false. Your gateway policy rules and their enforcement are not affected.
  Changes:
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.simple_resolver: + block_page_enabled                 = false
      module.zero_trust_gateway_policy.cloudflare_zero_trust_gateway_policy.with_override_ips: + ip_categories                      = false

  Rule:    new_device_settings_resource
  Reason:  In v5, device settings have been extracted from cloudflare_teams_account into a dedicated cloudflare_zero_trust_device_settings resource. This resource does not exist in your v4 state so Terraform will plan to create it. This is expected — your device settings are carried forward correctly. Run terraform apply to create the resource; it will not reappear on subsequent plans.
  Changes:
      module.zero_trust_gateway_settings.cloudflare_zero_trust_device_settings.e2e_comprehensive_device_settings:   # module.zero_trust_gateway_settings.cloudflare_zero_trust_device_settings.e2e_comprehensive_device_settings will be created

  Rule:    new_logging_resource
  Reason:  In v5, gateway logging settings have been extracted from cloudflare_teams_account into a dedicated cloudflare_zero_trust_gateway_logging resource. This resource does not exist in your v4 state so Terraform will plan to create it. This is expected — your logging configuration is carried forward correctly. Run terraform apply to create the resource; it will not reappear on subsequent plans.
  Changes:
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_logging.e2e_comprehensive_logging:   # module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_logging.e2e_comprehensive_logging will be created

  Rule:    block_page_api_default_fields
  Reason:  The Cloudflare API always returns certain block_page fields (mode, include_context, suppress_footer, target_uri, mailto_address, mailto_subject) with default values even when you have not set them in your config. The v5 provider reads these back on every plan and may show them as being removed. This is a known v5 provider behaviour unrelated to the migration — your block page configuration and appearance are not affected.
  Changes:
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - include_context  = false -> null
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - mailto_address   = "" -> null
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - mailto_subject   = "" -> null
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - mode             = "customized_block_page" -> null
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - suppress_footer  = false -> null
      module.zero_trust_gateway_settings.cloudflare_zero_trust_gateway_settings.e2e_comprehensive: - target_uri       = "" -> null

  Rule:    precedence_recalculation
  Reason:  The Cloudflare API may automatically recalculate and reorder precedence values for device profiles during migration. The relative ordering of your profiles is preserved. This change is safe and resolves after the next terraform apply.
  Changes:
      module.zero_trust_local_fallback_domain.cloudflare_zero_trust_device_custom_profile.custom_e2e: ~ precedence                     = 876901 -> 1776

  Rule:    incomplete_access_block_removal
  Reason:  The v5 provider requires both 'aud_tag' and 'team_name' inside a tunnel ingress 'access' block. In v4, these fields were optional and could be omitted. tf-migrate removes any 'access' blocks from your config that are missing either field, because they cannot be represented in valid v5 configuration. An incomplete access block (missing aud_tag or team_name) does not actually enforce Access protection regardless — the Cloudflare Access app is unidentified without these values. The plan will show the access block being removed; this is correct and your tunnel continues to function.
  Changes:
      module.zero_trust_tunnel_cloudflared_config.cloudflare_zero_trust_tunnel_cloudflared_config.comprehensive: - access                   = {

  Rule:    status_pending_to_null
  Reason:  When DNSSEC is being activated, the Cloudflare API returns a status of 'pending' before the DS records have fully propagated. If your config does not explicitly set the status field, Terraform will show 'status = "pending" -> null'. This is expected during the activation window and does not indicate a problem. Once DNSSEC activation completes, this drift will disappear on its own.
  Changes:
      module.zone_dnssec.cloudflare_zone_dnssec.test: - status           = "pending" -> null

✓ No unexpected drift
────────────────────────────────────────────────────

====================================================
Result: ✓ MIGRATION LOOKS GOOD
  17 exemption rule(s) applied (58 expected change(s))
  No unexpected drift detected
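
Two of the exemptions in the sample output, `html_entity_encoding` and `duration_normalization`, describe values that differ only in representation. That claim can be checked by hand with the Go standard library. The sketch below is an independent sanity check, not part of `tf-migrate`; the helper names `sameAfterUnescape` and `sameDuration` are made up for illustration.

```go
package main

import (
	"fmt"
	"html"
	"time"
)

// sameAfterUnescape reports whether a v4 value containing HTML entities
// decodes to exactly the literal v5 value.
func sameAfterUnescape(v4, v5 string) bool {
	return html.UnescapeString(v4) == v5
}

// sameDuration reports whether two Go duration strings denote the same
// length of time, e.g. "24h0m0s" and "24h".
func sameDuration(a, b string) (bool, error) {
	da, err := time.ParseDuration(a)
	if err != nil {
		return false, err
	}
	db, err := time.ParseDuration(b)
	if err != nil {
		return false, err
	}
	return da == db, nil
}

func main() {
	// Values copied from the html_entity_encoding changes in the sample output.
	v4 := `Block: &#34;suspicious&#34; IP with &#39;quotes&#39; &amp; special chars! #security`
	v5 := `Block: "suspicious" IP with 'quotes' & special chars! #security`
	fmt.Println(sameAfterUnescape(v4, v5)) // true

	// Values copied from the duration_normalization changes.
	ok, err := sameDuration("24h0m0s", "24h")
	fmt.Println(ok, err) // true <nil>

	// Go's own formatting of a 24-hour duration is the long form.
	d, _ := time.ParseDuration("24h")
	fmt.Println(d) // 24h0m0s
}
```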

@tamas-jozsa tamas-jozsa force-pushed the drift-exemption-verification branch from 3c9a2a7 to 12db0e5 Compare March 13, 2026 22:11