ext/uri: speed up Uri\Rfc3986\Uri component reads (~22%) #30
Closed
Five related changes to uri_parser_rfc3986.c that together cut parse + 7 reads on a 17-URL mix from 0.842s to 0.653s (1.7M parses, pinned to a single CPU). That's a 22% wall-time reduction and a 29% throughput increase. Parse-only moves from 0.394s to 0.378s, about 4%.

1. get_normalized_uri() now aliases the raw URI when nothing requires normalization. uriNormalizeSyntaxMaskRequiredExA reports which components need rewriting; a zero mask means the parsed URI is already canonical and the code skips the uriCopyUriMmA deep copy plus the full uriNormalizeSyntaxExMmA pass. This is the biggest single contributor. The dirty mask is cached on the struct so multiple non-raw reads on the same instance run the scan once.

2. The port now lives in a cache on the uris struct. The parse path stashes the converted zend_long directly, so the first port_read serves it without re-scanning. Subsequent reads short-circuit. The write path invalidates the cache.

3. port_str_to_zend_long_checked replaces its stack-copy + ZEND_STRTOUL with an inline digit accumulator. Uriparser has already validated that the port text is ASCII digits only, so the branch-heavy strtoul path is unnecessary.

4. uriparser_create_uris uses emalloc + targeted field init instead of ecalloc. The struct is ~440 bytes. We overwrite the uri member right after this function returns and we only touch normalized_uri once the init flag becomes true, so only the flag fields and dirty_mask need zeroing.

5. php_uri_parser_rfc3986_destroy skips uriFreeUriMembersMmA on normalized_uri when it was never built or when it aliases the raw uri. Paired with the emalloc change: the struct used to be fully zeroed by ecalloc, so the free was safe without a guard. Now that most of the struct is uninitialized, the guard is required.

No behavior change. All 309 tests in ext/uri/tests pass.
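To illustrate the inline digit accumulator from change 3, here is a minimal standalone sketch. It is not the patch's code: `port_digits_to_long` is a hypothetical stand-in for `port_str_to_zend_long_checked`, it uses plain `long` instead of `zend_long`, and it assumes the caller has already confirmed the input is ASCII digits only.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for port_str_to_zend_long_checked.
 * Assumes s[0..len) contains only ASCII digits, as uriparser
 * guarantees for the port component after a successful parse. */
static bool port_digits_to_long(const char *s, size_t len, long *out)
{
    long value = 0;

    if (len == 0) {
        return false; /* empty port text: nothing to convert */
    }
    for (size_t i = 0; i < len; i++) {
        int digit = s[i] - '0';
        /* Reject before the multiply-add below could overflow. */
        if (value > (LONG_MAX - digit) / 10) {
            return false;
        }
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}
```

Because the input is pre-validated, the loop needs no sign handling, base detection, or stop-character scanning; the only remaining check is overflow.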
I also checked that URIs which need normalization (http://EXAMPLE.com/A/%2e%2e/c resolving to /c) still hit the full normalize path: a non-zero dirty mask forces the copy + normalize, and the alias shortcut is taken only when the mask is zero.
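The interaction between the alias shortcut, the targeted init, and the destroy guard (changes 1, 4, and 5) can be sketched with a toy model. Everything below is hypothetical: a "URI" is a single heap string, "normalization" is lowercasing, `malloc`/`free` stand in for `emalloc`/`efree`, and the `toy_*` names are illustrative rather than the ext/uri definitions. The single init flag also plays the role of the cached dirty mask.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *raw;             /* owned */
    char *normalized;      /* owned copy, or an alias of raw when already canonical */
    bool  normalized_init; /* normalized is meaningful only once this is true */
    char  scratch[400];    /* stands in for the bulk that is never pre-zeroed */
} toy_uris;

/* Non-zero when at least one byte would change under normalization. */
static unsigned toy_dirty_mask(const char *s)
{
    for (; *s != '\0'; s++) {
        if (isupper((unsigned char)*s)) {
            return 1u;
        }
    }
    return 0u;
}

static toy_uris *toy_create(const char *text)
{
    /* malloc + targeted init: only raw and the flag are set. */
    toy_uris *u = malloc(sizeof(*u));
    if (u == NULL) {
        return NULL;
    }
    u->raw = malloc(strlen(text) + 1);
    if (u->raw == NULL) {
        free(u);
        return NULL;
    }
    strcpy(u->raw, text);
    u->normalized_init = false;
    return u;
}

static const char *toy_get_normalized(toy_uris *u)
{
    if (!u->normalized_init) {
        if (toy_dirty_mask(u->raw) == 0) {
            u->normalized = u->raw; /* zero mask: alias, no copy */
        } else {
            size_t len = strlen(u->raw);
            u->normalized = malloc(len + 1);
            if (u->normalized == NULL) {
                return NULL;
            }
            for (size_t i = 0; i <= len; i++) { /* includes the terminator */
                u->normalized[i] = (char)tolower((unsigned char)u->raw[i]);
            }
        }
        u->normalized_init = true;
    }
    return u->normalized;
}

static void toy_destroy(toy_uris *u)
{
    /* Guard: free normalized only if it was built and is not an alias. */
    if (u->normalized_init && u->normalized != u->raw) {
        free(u->normalized);
    }
    free(u->raw);
    free(u);
}
```

Without the guard in `toy_destroy`, an instance that was never read would free an uninitialized pointer, and an aliased instance would free `raw` twice.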
Submitted upstream as php#21726. Closing this staging PR.
Summary
Speed up `Uri\Rfc3986\Uri` parse + component reads by ~22% on a realistic URL mix. Five targeted changes in `ext/uri/uri_parser_rfc3986.c`; the full per-change breakdown is in the commit message.
Benchmark
17 URL shapes: plain http/https, deep paths, with query/fragment, with
userinfo, IPv4, IPv6, mailto, URN, data, file, relative. 100K iterations
per run, 1.7M parses per benchmark, CPU pinned via `taskset -c 0`,
same-session A/B (stash + rebuild + rerun each direction).
Rows measured:

- `Uri\Rfc3986\Uri::parse` only
- `Uri\Rfc3986\Uri::parse` + 7 reads
- `Uri\WhatWg\Url::parse` only
- `Uri\WhatWg\Url::parse` + 8 reads

The WhatWG rows are a control. This patch only touches the RFC3986
backend, so WhatWG throughput should stay within noise, and it does.
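As a sanity check, the headline percentages follow directly from the RFC3986 timings quoted in the commit message (0.842s to 0.653s for parse + 7 reads, 0.394s to 0.378s for parse-only), and the parse count from 17 URL shapes at 100K iterations each:

$$
\begin{aligned}
\text{parses per benchmark} &= 17 \times 100{,}000 = 1{,}700{,}000 \\
\text{wall-time reduction} &= \frac{0.842 - 0.653}{0.842} \approx 0.224 \quad (\approx 22\%) \\
\text{throughput ratio} &= \frac{0.842}{0.653} \approx 1.29 \quad (\approx 29\%\ \text{increase}) \\
\text{parse-only reduction} &= \frac{0.394 - 0.378}{0.394} \approx 0.041 \quad (\approx 4\%)
\end{aligned}
$$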
What's in the patch
1. `get_normalized_uri()` aliases the raw URI when `uriNormalizeSyntaxMaskRequiredExA` reports a zero dirty mask, skipping the full `uriCopyUriMmA` + `uriNormalizeSyntaxExMmA` pass for the common case where the parsed URI is already canonical. The mask is cached so repeated non-raw reads on one instance scan once. Biggest single contributor.
2. Port cache on the uris struct. Parse-time validation stashes the converted `zend_long`; the first `port_read` serves from cache; writes invalidate.
3. `port_str_to_zend_long_checked` swaps stack-copy + `ZEND_STRTOUL` for an inline digit accumulator. Uriparser has already validated the text as ASCII digits only, so `strtoul`'s parse-flags, sign handling, base detection, and stop-char scanning are dead weight.
4. `uriparser_create_uris` uses `emalloc` + targeted field init instead of `ecalloc`, avoiding ~440 bytes of zeroing per parse.
5. `php_uri_parser_rfc3986_destroy` skips `uriFreeUriMembersMmA(&normalized_uri, ...)` when it was never built or when it aliases the raw uri. Required by the `emalloc` change since the struct is no longer fully zeroed.
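The port cache described above follows a common populate-on-parse, invalidate-on-write shape. A minimal sketch under stated assumptions: the `toy_port_*` names and the `conversions` counter are hypothetical and exist only to make the caching observable, the port text is assumed to be short pre-validated ASCII digits (so overflow is not checked here), and `long` stands in for `zend_long`.

```c
#include <stdbool.h>

typedef struct {
    const char *port_text;        /* validated ASCII digits, NUL-terminated */
    long        port_cached;      /* converted value, valid only if flag set */
    bool        port_cache_valid;
    unsigned    conversions;      /* counts slow-path runs, for demonstration */
} toy_port_holder;

/* Slow path: convert the digit text. */
static long toy_port_convert(toy_port_holder *h)
{
    long value = 0;
    h->conversions++;
    for (const char *p = h->port_text; *p != '\0'; p++) {
        value = value * 10 + (*p - '0');
    }
    return value;
}

/* Parse path: the value is converted once here and stashed. */
static void toy_port_parse(toy_port_holder *h, const char *text)
{
    h->port_text = text;
    h->conversions = 0;
    h->port_cached = toy_port_convert(h);
    h->port_cache_valid = true;
}

/* Write path: new text invalidates the cache. */
static void toy_port_write(toy_port_holder *h, const char *text)
{
    h->port_text = text;
    h->port_cache_valid = false;
}

/* Read path: serve from the cache, refilling it after a write. */
static long toy_port_read(toy_port_holder *h)
{
    if (!h->port_cache_valid) {
        h->port_cached = toy_port_convert(h);
        h->port_cache_valid = true;
    }
    return h->port_cached;
}
```

In this model a freshly parsed instance never runs the conversion on read; only a write followed by a read pays for one more conversion.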