ext/uri: speed up Uri\Rfc3986\Uri component reads (~22%)#30

Closed
iliaal wants to merge 1 commit into master from perf/uri-rfc3986-reads

iliaal commented Apr 11, 2026

Summary

Speed up Uri\Rfc3986\Uri parse + component reads by ~22% on a
realistic URL mix. Five targeted changes in ext/uri/uri_parser_rfc3986.c;
the full per-change breakdown is in the commit message.

Benchmark

17 URL shapes: plain http/https, deep paths, with query/fragment, with
userinfo, IPv4, IPv6, mailto, URN, data, file, relative. 100K iterations
per run, 1.7M parses per benchmark, CPU pinned via taskset -c 0,
same-session A/B (stash + rebuild + rerun each direction).

| | baseline | optimized | delta |
|---|---|---|---|
| Uri\Rfc3986\Uri::parse only | 0.394s (4.30M/s) | 0.378s (4.50M/s) | −4% |
| Uri\Rfc3986\Uri::parse + 7 reads | 0.842s (2.02M/s) | 0.653s (2.60M/s) | −22% / +29% throughput |
| Uri\WhatWg\Url::parse only | 0.410s (4.15M/s) | 0.408s (4.16M/s) | noise |
| Uri\WhatWg\Url::parse + 8 reads | 0.599s (2.84M/s) | 0.592s (2.89M/s) | noise |

The WhatWG rows are a control. This patch only touches the RFC3986
backend, so WhatWG throughput should stay within noise, and it does.

What's in the patch

  1. get_normalized_uri() aliases the raw URI when
    uriNormalizeSyntaxMaskRequiredExA reports a zero dirty mask,
    skipping the full uriCopyUriMmA + uriNormalizeSyntaxExMmA pass
    for the common case where the parsed URI is already canonical. The
    mask is cached so repeated non-raw reads on one instance scan once.
    Biggest single contributor.

  2. Port cache on the uris struct. Parse-time validation stashes the
    converted zend_long; the first port_read serves from cache;
    writes invalidate.

  3. port_str_to_zend_long_checked swaps stack-copy + ZEND_STRTOUL
    for an inline digit accumulator. Uriparser has already validated
    the text as ASCII digits only, so strtoul's parse-flags, sign
    handling, base detection, and stop-char scanning are dead weight.

  4. uriparser_create_uris uses emalloc + targeted field init
    instead of ecalloc, avoiding ~440 bytes of zeroing per parse.

  5. php_uri_parser_rfc3986_destroy skips
    uriFreeUriMembersMmA(&normalized_uri, ...) when it was never
    built or when it aliases the raw uri. Required by the emalloc
    change since the struct is no longer fully zeroed.
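
Items 2 and 3 above can be sketched roughly as follows. This is a minimal illustration, not the patch's code: the `demo_port_cache` struct and the `demo_*` function names are hypothetical, `long` stands in for `zend_long`, and treating an empty digit range as 0 is an assumption of the sketch. It relies on the same precondition the description states, namely that the text has already been validated as ASCII digits.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache slot; the patch's real field names are not shown here. */
typedef struct {
    bool has_value;   /* cleared when a write invalidates the slot */
    long value;       /* stands in for zend_long */
} demo_port_cache;

/* Inline digit accumulator over [first, after_last), which need not be
 * NUL-terminated. Returns false instead of wrapping on overflow. */
bool demo_port_digits_to_long(const char *first, const char *after_last, long *out)
{
    assert(first != NULL && after_last != NULL && first <= after_last);
    long value = 0;
    for (const char *p = first; p < after_last; p++) {
        int digit = *p - '0';
        /* Precondition: the caller has already validated digits only. */
        assert(digit >= 0 && digit <= 9);
        if (value > (LONG_MAX - digit) / 10) {
            return false;   /* value * 10 + digit would exceed LONG_MAX */
        }
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}

/* Read path: serve from the cache when valid, otherwise convert once. */
bool demo_port_read(demo_port_cache *cache, const char *first,
                    const char *after_last, long *out)
{
    if (!cache->has_value) {
        if (!demo_port_digits_to_long(first, after_last, &cache->value)) {
            return false;
        }
        cache->has_value = true;
    }
    *out = cache->value;
    return true;
}

/* Write path: any change to the port text must drop the cached value. */
void demo_port_invalidate(demo_port_cache *cache)
{
    cache->has_value = false;
}
```

Compared with a `strtoul`-style call, the loop needs no NUL-terminated copy, no sign or base handling, and no end-pointer bookkeeping. The overflow check is kept because digit-only validation alone does not bound the length of the text.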

Five related changes to uri_parser_rfc3986.c that together cut parse
+ 7 reads on a 17-URL mix from 0.842s to 0.653s (1.7M parses, pinned
to a single CPU). That's a 22% wall-time reduction and a 29% throughput
increase. Parse-only moves from 0.394s to 0.378s, about 4%.
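
The two percentages describe the same measurement from different directions, which is why they differ. A small sketch of the arithmetic, assuming a fixed amount of work (1.7M parses) in both runs; the function names are illustrative only:

```c
#include <assert.h>

/* Wall-time reduction in percent: (baseline - optimized) / baseline. */
double wall_time_reduction_pct(double baseline_s, double optimized_s)
{
    assert(baseline_s > 0.0);
    return (baseline_s - optimized_s) / baseline_s * 100.0;
}

/* Throughput gain in percent for the same amount of work:
 * (work / optimized) / (work / baseline) - 1 = baseline / optimized - 1. */
double throughput_gain_pct(double baseline_s, double optimized_s)
{
    assert(optimized_s > 0.0);
    return (baseline_s / optimized_s - 1.0) * 100.0;
}
```

With the figures above, `wall_time_reduction_pct(0.842, 0.653)` comes out at roughly 22 and `throughput_gain_pct(0.842, 0.653)` at roughly 29.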

1. get_normalized_uri() now aliases the raw URI when nothing requires
   normalization. uriNormalizeSyntaxMaskRequiredExA reports which
   components need rewriting; a zero mask means the parsed URI is
   already canonical and the code skips the uriCopyUriMmA deep copy
   plus the full uriNormalizeSyntaxExMmA pass. This is the biggest
   single contributor. The dirty mask is cached on the struct so
   multiple non-raw reads on the same instance run the scan once.

2. The port now lives in a cache on the uris struct. The parse path
   stashes the converted zend_long directly, so the first port_read
   serves it without re-scanning. Subsequent reads short-circuit. The
   write path invalidates the cache.

3. port_str_to_zend_long_checked replaces its stack-copy + ZEND_STRTOUL
   with an inline digit accumulator. Uriparser has already validated
   that the port text is ASCII digits only, so the branch-heavy strtoul
   path is unnecessary.

4. uriparser_create_uris uses emalloc + targeted field init instead of
   ecalloc. The struct is ~440 bytes. We overwrite the uri member right
   after this function returns and we only touch normalized_uri once
   the init flag becomes true, so only the flag fields and dirty_mask
   need zeroing.

5. php_uri_parser_rfc3986_destroy skips uriFreeUriMembersMmA on
   normalized_uri when it was never built or when it aliases the raw
   uri. Paired with the emalloc change: the struct used to be fully
   zeroed by ecalloc, so the free was safe without a guard. Now that
   most of the struct is uninitialized, the guard is required.
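
The interaction between items 1, 4, and 5 can be modelled in a few lines. This is a toy under stated assumptions, not the code in uri_parser_rfc3986.c: `demo_uri` is a stand-in for uriparser's URI struct, `malloc` stands in for `emalloc`, "contains an uppercase letter" stands in for the real dirty-mask scan, lowercasing stands in for normalization, and every `demo_*` name and flag is hypothetical. Unlike the description above, creation here also fills in the raw URI, to keep the sketch short, and the ready flag alone ensures the scan runs once.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the parsed URI: a single owned heap string. */
typedef struct { char *text; } demo_uri;

typedef struct {
    demo_uri uri;              /* raw parsed URI, always owned */
    demo_uri normalized_uri;   /* indeterminate until normalized_ready is true */
    bool normalized_ready;
    bool normalized_is_alias;  /* true when normalized_uri shares uri's storage */
    unsigned dirty_mask;
} demo_uris;

int demo_free_calls = 0;       /* instrumentation for this sketch only */

char *demo_dup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

void demo_free_members(demo_uri *u)
{
    free(u->text);
    u->text = NULL;
    demo_free_calls++;
}

/* Toy scan: non-zero when something would need rewriting. */
unsigned demo_scan_dirty_mask(const demo_uri *u)
{
    for (const char *p = u->text; *p != '\0'; p++) {
        if (isupper((unsigned char)*p)) return 1u;
    }
    return 0u;
}

demo_uris *demo_create_uris(const char *text)
{
    demo_uris *u = malloc(sizeof(*u));   /* not zeroed, like emalloc */
    if (u == NULL) return NULL;
    /* Targeted init: only fields that are read before being written. */
    u->normalized_ready = false;
    u->normalized_is_alias = false;
    u->dirty_mask = 0;
    u->uri.text = demo_dup(text);
    if (u->uri.text == NULL) { free(u); return NULL; }
    return u;
}

const demo_uri *demo_get_normalized(demo_uris *u)
{
    if (u->normalized_ready) return &u->normalized_uri;
    u->dirty_mask = demo_scan_dirty_mask(&u->uri);
    if (u->dirty_mask == 0) {
        u->normalized_uri = u->uri;      /* shallow alias, no deep copy */
        u->normalized_is_alias = true;
    } else {
        char *copy = demo_dup(u->uri.text);
        if (copy == NULL) return NULL;
        for (char *p = copy; *p != '\0'; p++) {
            *p = (char)tolower((unsigned char)*p);
        }
        u->normalized_uri.text = copy;
    }
    u->normalized_ready = true;
    return &u->normalized_uri;
}

void demo_destroy(demo_uris *u)
{
    /* Guard: skip when never built (garbage) or aliased (double free). */
    if (u->normalized_ready && !u->normalized_is_alias) {
        demo_free_members(&u->normalized_uri);
    }
    demo_free_members(&u->uri);
    free(u);
}
```

Without the guard in `demo_destroy`, the never-built case would free an indeterminate pointer and the alias case would free the raw URI's storage twice, which is the pairing the commit message describes between the allocation change and the destroy change.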

No behavior change. All 309 tests in ext/uri/tests pass. I also checked
that URIs which need normalization (http://EXAMPLE.com/A/%2e%2e/c,
whose path normalizes to /c) still hit the full normalize path, so the
alias shortcut is taken only when the dirty mask is zero.
@iliaal iliaal force-pushed the perf/uri-rfc3986-reads branch from 1951c4c to b5d6508 Compare April 11, 2026 19:55

iliaal commented Apr 11, 2026

Submitted upstream as php#21726. Closing this staging PR.

@iliaal iliaal closed this Apr 11, 2026