Coverage gap: 87% of slow queries invisible to observation subsystem #43

@mrtwebdesign

Summary

Cross-referencing QueryGuard logs against WPEngine's pt-query-digest for Jun 19, 2026 shows that QueryGuard logged zero slow queries while MySQL recorded 801 slow queries totaling 6,568 seconds during the same 12-hour window. Of those, 781 queries (97.5% of the calls; 5,731s, or 87% of the slow-query time) are application-level queries that go through $wpdb->query() and should be caught by the db.php drop-in's timing mechanism.

The observation subsystem IS configured correctly — HYPERCART_QUERY_GUARD_MODE = 'observe', db.php drop-in is installed at wp-content/db.php, and HCQG_DB::query() wraps every query with microtime(true) timing. The code path is sound. The gap is environmental and currently invisible because the plugin has no self-diagnostic for its observation path.

Evidence

QueryGuard Jun 18: 36 slow queries logged (observation was working)
QueryGuard Jun 19: 0 slow queries logged
pt-query-digest Jun 19 (12h window): 801 slow queries, 6,568 seconds

Missed queries by category

| Pattern | Calls | DB time | Goes through $wpdb? | Catchable? |
|---|---|---|---|---|
| wp_mail_smtp AS queue check | 649 | 3,810s | Yes | Should be |
| Admin postmeta LIKE search | 94 | 1,301s | Yes | Should be |
| WPE backup/maintenance (10.156.2.80) | 20 | 837s | No (direct MySQL) | Never |
| WC Analytics dashboard | 4 | 171s | Yes | Should be |
| WCS cron reports | 3 | 91s | Yes | Should be |
| Other (order list, KISS, etc.) | 31 | 358s | Yes | Should be |

781 application-layer queries (5,731s) go through $wpdb and should be caught. The 20 WPEngine infrastructure queries (837s) use direct MySQL connections that bypass PHP entirely. This is a known structural limitation that PHP-level instrumentation cannot fix.
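The totals above can be re-derived from the category table. The following is a minimal sketch that only re-adds the figures already listed; the function name hcqg_example_totals() is hypothetical and not part of the plugin.

```php
<?php
// Rows transcribed from the "Missed queries by category" table:
// [pattern, calls, DB time in seconds, goes through $wpdb?]
$rows = [
    ['wp_mail_smtp AS queue check',          649, 3810, true],
    ['Admin postmeta LIKE search',            94, 1301, true],
    ['WPE backup/maintenance (10.156.2.80)',  20,  837, false],
    ['WC Analytics dashboard',                 4,  171, true],
    ['WCS cron reports',                       3,   91, true],
    ['Other (order list, KISS, etc.)',        31,  358, true],
];

// Hypothetical helper: sums all rows, and separately the rows that pass through $wpdb.
function hcqg_example_totals(array $rows): array
{
    $t = ['calls' => 0, 'seconds' => 0, 'app_calls' => 0, 'app_seconds' => 0];
    foreach ($rows as [$pattern, $calls, $seconds, $via_wpdb]) {
        $t['calls']   += $calls;
        $t['seconds'] += $seconds;
        if ($via_wpdb) {
            $t['app_calls']   += $calls;
            $t['app_seconds'] += $seconds;
        }
    }
    return $t;
}

$totals     = hcqg_example_totals($rows);
$time_share = round(100 * $totals['app_seconds'] / $totals['seconds']);   // share of slow-query time
$call_share = round(100 * $totals['app_calls'] / $totals['calls'], 1);    // share of slow-query calls

printf(
    "catchable: %d of %d calls (%.1f%%), %ds of %ds (%d%%)\n",
    $totals['app_calls'], $totals['calls'], $call_share,
    $totals['app_seconds'], $totals['seconds'], $time_share
);
```

The 87% figure is a share of slow-query time; by call count the catchable share is higher, because the 20 infrastructure queries are few but individually long.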

Code path analysis

Production is running v1.2.0 with the v2 db.php drop-in (DROPIN_VERSION 2.0.0).

At bootstrap (db.php loaded by require_wp_db()):

  • $wpdb = new HCQG_DB(...) — line 221
  • hcqg_active = ( 'off' !== 'observe' ) → true — line 91
  • Every $wpdb->query() call is timed via microtime(true) — lines 143-145
  • Queries >= 5.0s stored in $wpdb->hcqg_slow_queries[] — lines 148-156
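For readers unfamiliar with the drop-in pattern, the timing step described above can be sketched roughly as follows. This is not the HCQG_DB source: the class names, the $slow_queries property and the constructor threshold are stand-ins chosen so the example runs outside WordPress, and the 1 ms threshold is for demonstration only (the issue states 5.0s in production).

```php
<?php
// Stand-in for WordPress's wpdb so the sketch runs without WordPress loaded.
class Example_Base_DB
{
    public function query($sql)
    {
        usleep(2000); // pretend the database takes about 2 ms
        return 1;
    }
}

// Hypothetical subclass illustrating wall-clock timing around query().
class Example_Timed_DB extends Example_Base_DB
{
    public $slow_queries = [];
    private $threshold;

    public function __construct(float $threshold_seconds)
    {
        $this->threshold = $threshold_seconds;
    }

    public function query($sql)
    {
        $start   = microtime(true);
        $result  = parent::query($sql);
        $elapsed = microtime(true) - $start;

        // Record only queries at or above the threshold.
        if ($elapsed >= $this->threshold) {
            $this->slow_queries[] = ['sql' => $sql, 'seconds' => $elapsed];
        }
        return $result;
    }
}

$demo = new Example_Timed_DB(0.001); // 1 ms threshold: the 2 ms query is recorded
$demo->query('SELECT 1');

$prod_like = new Example_Timed_DB(5.0); // 5 s threshold: the same query is not recorded
$prod_like->query('SELECT 1');
```

Note that this measures wall-clock time as seen by PHP, which is the quantity hypothesis 4 below suggests may differ from MySQL's server-side execution time.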

At mu-plugin init (hypercart-query-guard.php:213):

  • dropin_active() checks method_exists($wpdb, 'hcqg_is_active') && $wpdb->hcqg_is_active() — line 207
  • If true: registers log_slow_queries() on shutdown at 100% of requests — lines 256-259

At shutdown (log_slow_queries() line 1200):

  • Calls dropin_active() again — if true, reads $wpdb->hcqg_slow_queries
  • Falls through to v1 SAVEQUERIES path if drop-in not active — line 1211
  • Returns silently with no diagnostic if neither path has data — line 1214

The silent return at line 1214 is the visibility problem. If dropin_active() returns false at shutdown (even though it returned true at init), or if $wpdb is no longer HCQG_DB, the function exits silently. There is no log entry to indicate the observation path failed.

Hypotheses for the environmental gap

The hypotheses below are ranked by likelihood. They cannot be distinguished from one another without diagnostic instrumentation on the production WPEngine environment.

1. WPEngine manages the db.php drop-in slot

WPEngine's deployment/container lifecycle may restore, shadow, or overlay wp-content/db.php during deploys, container restarts, or PHP-FPM pool recycling. If this happens, $wpdb reverts to standard wpdb and dropin_active() returns false. The v1 fallback path then runs with 5% sampling, and even that fails silently if SAVEQUERIES is already defined as false by another component.

2. Platform mu-plugins replace $wpdb

WPEngine's 0-worker.php or platform mu-plugins may replace the $wpdb global after db.php instantiates it. The class_exists guard at init time would still pass, but $wpdb->hcqg_slow_queries would not exist on a standard wpdb instance: the undefined property reads as null, and the empty() check at line 1217 treats it as "no data".
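The failure mode in this hypothesis can be illustrated with stand-in classes. This is a sketch under assumptions: Example_Std_DB and Example_Guard_DB are hypothetical substitutes for wpdb and HCQG_DB (the real wpdb also defines magic __get/__isset methods, which are omitted here), and example_dropin_active() mirrors the check described at line 207.

```php
<?php
class Example_Std_DB {}                              // stand-in for core wpdb

class Example_Guard_DB extends Example_Std_DB        // stand-in for HCQG_DB
{
    public $hcqg_slow_queries = [];

    public function hcqg_is_active()
    {
        return true;
    }
}

// Mirrors the described check: method_exists() plus the method's own answer.
function example_dropin_active($db): bool
{
    return method_exists($db, 'hcqg_is_active') && $db->hcqg_is_active();
}

$db     = new Example_Guard_DB();
$before = example_dropin_active($db);                 // instance check passes

$db = new Example_Std_DB();                           // simulate a platform component replacing the global

$class_loaded = class_exists('Example_Guard_DB');     // class-level check still passes
$after        = example_dropin_active($db);           // instance-level check now fails
$looks_empty  = empty($db->hcqg_slow_queries);        // undefined property: empty() is true, no warning
```

The point of the sketch is that a class-level check and an instance-level check can disagree after a replacement, and that empty() hides the difference between "no slow queries" and "no such property".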

3. PHP worker recycling kills the process before shutdown

WPEngine may SIGKILL long-running PHP workers. If the requests that have slow queries are exactly the ones that exceed the PHP execution limit, they die before the shutdown hook fires. This would create a selection bias: the slowest requests (the ones with >5s queries) are the ones most likely to be killed before logging.

4. Database proxy timing discrepancy

WPEngine's database proxy layer may cache or short-circuit query results, so PHP's wall-clock time is <5s even though MySQL's server-side execution time is >5s (as reported by pt-query-digest). This would mean the db.php timing is working correctly but the queries genuinely don't appear slow from PHP's perspective.

What needs to happen

1. Add observation self-diagnostic (priority: high)

Add a lightweight diagnostic to log_slow_queries() that records which code path is taken on each request (or on a sample of requests). Compared against the next pt-query-digest run, this should show which of the hypotheses above applies. Specifically, log:

  • Whether dropin_active() returns true or false at shutdown
  • What class $wpdb actually is (HCQG_DB vs wpdb vs other)
  • Count of hcqg_slow_queries entries captured
  • Whether SAVEQUERIES is defined and its value
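One possible shape for such a diagnostic is sketched below. The function name, array keys and log prefix are all hypothetical; in the plugin the argument would be the global $wpdb, and whether to log every request or a sample is left open, as above.

```php
<?php
// Hypothetical helper: collects the four data points listed above for one request.
function example_observation_diagnostic($db): array
{
    $is_object = is_object($db);

    $dropin_active = $is_object
        && method_exists($db, 'hcqg_is_active')
        && (bool) $db->hcqg_is_active();

    $slow_count = ($is_object
        && isset($db->hcqg_slow_queries)
        && is_array($db->hcqg_slow_queries))
        ? count($db->hcqg_slow_queries)
        : 0;

    $sq_defined = defined('SAVEQUERIES');

    return [
        'dropin_active'       => $dropin_active,
        'wpdb_class'          => $is_object ? get_class($db) : gettype($db),
        'slow_query_count'    => $slow_count,
        'savequeries_defined' => $sq_defined,
        'savequeries_value'   => $sq_defined ? (bool) constant('SAVEQUERIES') : null,
    ];
}

// Stand-in object so the sketch runs outside WordPress.
class Example_Plain_DB {}

$diag = example_observation_diagnostic(new Example_Plain_DB());
error_log('[example-hcqg-diag] ' . json_encode($diag));
```

Emitting a single JSON line per sampled request keeps the output easy to grep and to join against pt-query-digest time windows.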

2. Harden the v1 fallback path

If dropin_active() returns false unexpectedly, the v1 path at line 1211 requires defined('SAVEQUERIES') && SAVEQUERIES. If another component (WPEngine platform, performance plugin) pre-defines SAVEQUERIES = false, the observation silently fails. The init-time code at line 261 (if ( ! defined( 'SAVEQUERIES' ) )) cannot override an already-defined constant.
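The constant behavior described here is standard PHP and can be reproduced in isolation. The sketch uses a hypothetical constant name, EXAMPLE_SAVEQUERIES, so that it does not touch the real SAVEQUERIES.

```php
<?php
// Simulate another component defining the constant first, as false.
define('EXAMPLE_SAVEQUERIES', false);

// Init-time guard in the style described above: it is skipped because the constant exists.
if (! defined('EXAMPLE_SAVEQUERIES')) {
    define('EXAMPLE_SAVEQUERIES', true);
}

// Even an unconditional redefinition attempt fails: define() returns false
// and raises a warning (suppressed here with @ to keep the output clean).
$redefined = @define('EXAMPLE_SAVEQUERIES', true);

// The v1-style condition therefore evaluates to false for the whole request.
$v1_path_enabled = defined('EXAMPLE_SAVEQUERIES') && EXAMPLE_SAVEQUERIES;

var_dump($redefined, $v1_path_enabled);
```

Because the value cannot be changed once set, any hardening of the v1 path has to detect this state and report it rather than try to override it.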

3. Structural gap: WPEngine infrastructure queries

20 queries / 837s from WPEngine's backup/maintenance infrastructure (host 10.156.2.80, user _wpe_root) are direct MySQL connections that never touch PHP. This is an inherent limitation of PHP-level instrumentation. Document this as a known blind spot and factor it into any cross-reference analysis.

Production config reference

define( 'HYPERCART_QUERY_GUARD_MODE', 'observe' );
define( 'HYPERCART_QUERY_GUARD_THROTTLE_MODE', 'test_observe' );
define( 'HYPERCART_QUERY_GUARD_LOG_ORDER_NOTES', true );

Source data

  • QueryGuard logs: Jun 18–19, 2026 (query-guard-logs-06-19-2026/)
  • WPEngine pt-query-digest: Jun 19, 2026, 12:07 AM – 11:59 AM PT
  • Full analysis: query-guard-48hr-analysis-06-18-19.md
  • Production plugin version: v1.2.0 (hypercart-query-guard.php, 1,472 lines)
  • Production drop-in version: v2.0.0 (db.php, HCQG_DB)

Investigation by Claude Code, June 19, 2026.
