perf(core): optimize parseInt/parseLong hot paths in Numbers.java #5

Open
misrasaurabh1 wants to merge 6 commits into master from codeflash/optimize-number-parsing

Summary

Optimize the inner loops of parseInt0 and parseLong0 in Numbers.java to reduce branch overhead on the digit path, which is taken on every character of a valid number string.

Both functions are called from 334 sites across the codebase (265 parseInt, 69 parseLong), making them a recurring cost in SQL parsing, CSV import, ILP ingestion, and config loading.

Changes

parseInt0 — reorder if-else-if so the digit range check comes first.

Before:

if (c == '_') { ... }           // checked first, rarely true
else if (c < '0' || c > '9') { ... }
else { /* digit — hot path */ }

After:

if (c >= '0' && c <= '9') { /* digit — hot path */ }
else if (c == '_') { ... }
else { throw; }

For a digit, this reduces 3 comparisons to 2 per loop iteration.
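
To make the reordering concrete, here is a minimal, self-contained sketch of a digit-first loop. It is not the actual parseInt0: the class name, the method name, the separator rule (an underscore must sit between two digits), and the use of Math.multiplyExact/addExact for overflow are all assumptions made for illustration.

```java
final class DigitFirstIntSketch {

    // Hypothetical stand-in for parseInt0: parses a non-negative decimal int,
    // allowing '_' separators between digits (assumed rule for this sketch).
    // Overflow surfaces as ArithmeticException from the Math.*Exact helpers.
    static int parseUnsignedInt(CharSequence s) {
        final int n = s.length();
        if (n == 0) {
            throw new NumberFormatException("empty input");
        }
        int value = 0;
        boolean lastWasDigit = false;
        for (int i = 0; i < n; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                // Hot path: a digit is classified after two comparisons.
                value = Math.addExact(Math.multiplyExact(value, 10), c - '0');
                lastWasDigit = true;
            } else if (c == '_') {
                // Cold path: a separator must follow a digit and must not be last.
                if (!lastWasDigit || i + 1 == n) {
                    throw new NumberFormatException("misplaced '_' at " + i);
                }
                lastWasDigit = false;
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsignedInt("42"));        // prints 42
        System.out.println(parseUnsignedInt("1_000_000")); // prints 1000000
    }
}
```

The underscore and error branches are only reached after the digit test fails, so they add no cost to the common case.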

parseLong0 — replace the switch (c | 32) loop with an if-else-if chain, putting the digit check first.

Before:

switch (c | 32) {          // bitwise OR applied to every character
    case 'l':  ...
    case 127:  ...          // '_' after OR
    default:
        if (c < '0' || c > '9') throw;
        /* digit */
}

After:

if (c >= '0' && c <= '9') { /* digit */ }
else if (c == '_') { ... }
else if (c == 'l' || c == 'L') { ... }   // no OR needed
else { throw; }

For a digit, this reduces 5 operations (1 OR + 2 switch comparisons + 2 range checks) to 2 comparisons.
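
The constants in the old switch follow from ASCII arithmetic, which the small sketch below spells out. The class and method names are hypothetical; only the character values are facts.

```java
final class CaseFoldSketch {

    // Sets bit 5 (value 32), which is what the old switch did to every character.
    static int fold(char c) {
        return c | 32;
    }

    public static void main(String[] args) {
        // 'L' is 76 and 'l' is 108; they differ only in bit 5.
        System.out.println(fold('L') == 'l'); // prints true
        // '_' is 95, so the OR yields 127, the second case label in the old switch.
        System.out.println(fold('_') == 127); // prints true
        // Digits '0'..'9' are 48..57 and already have bit 5 set, so the OR
        // does no useful work on the hot path.
        System.out.println(fold('7') == '7'); // prints true
    }
}
```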

The L-suffix validity check is also tightened from i == 0 (incorrect for non-zero start offsets) to digitCounter == 0, which correctly rejects 'L' appearing before any digits regardless of starting position.
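
A rough sketch of why digitCounter is the safer test when parsing a window s[p, lim) of a larger buffer. This is not the real parseLong0: the names, the signature, the rule that the suffix must be the last character, and the lenient handling of '_' are assumptions for illustration.

```java
final class DigitFirstLongSketch {

    // Hypothetical stand-in for parseLong0: parses s[p, lim) as a non-negative
    // decimal long with optional '_' separators and one trailing 'l' or 'L'.
    static long parseUnsignedLong(CharSequence s, int p, int lim) {
        long value = 0;
        int digitCounter = 0;
        for (int i = p; i < lim; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                // Hot path: digits never reach the suffix or separator checks.
                value = Math.addExact(Math.multiplyExact(value, 10L), (long) (c - '0'));
                digitCounter++;
            } else if (c == '_') {
                // Separators are simply skipped in this sketch.
            } else if (c == 'l' || c == 'L') {
                // With p > 0 the index i is never 0, so an (i == 0) test cannot
                // catch a suffix that appears before any digit; digitCounter can.
                if (digitCounter == 0 || i + 1 != lim) {
                    throw new NumberFormatException("misplaced suffix at " + i);
                }
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (digitCounter == 0) {
            throw new NumberFormatException("no digits");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsignedLong("id=42L", 3, 6)); // prints 42
        try {
            parseUnsignedLong("id=L42", 3, 6); // suffix before any digit
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```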

Evidence

Informal micro-benchmark (10M iterations over 1000 varied random strings, JVM warmed with 200K iterations first):

Operation            Before       After        Delta
parseLong 10-digit   22.3 ns/op   19.8 ns/op   -11%
parseLong 3-digit    12.0 ns/op    9.3 ns/op   -22%
parseInt 9-digit     20.7 ns/op   19.0 ns/op   -8%

Test plan

  • NumbersTest — 195 tests, all pass
  • SqlLexerTest, SqlParserTest — 1000 tests, all pass
  • CastTest, FunctionParserTest — 522 tests, all pass

Put the digit check first in parseInt0 and replace the switch-based
parseLong0 loop with an if-else-if chain that tests digits first.

For parseInt0, the previous ordering checked for underscore before
digits, costing one extra branch per character in the common case.
Reordering to check c >= '0' && c <= '9' first reduces the hot-path
branch count from 3 comparisons to 2 per digit.

For parseLong0, the switch on (c | 32) applied a bitwise OR on every
character, including digits, to unify 'l' and 'L' case handling.
Replacing with an if-else-if avoids the OR entirely for digits.  Direct
character comparisons (c == 'l' || c == 'L') cover the L-suffix case.
The L-position check is tightened from (i == 0) to (digitCounter == 0),
which correctly rejects 'L' after separators and also works when the
start offset p is non-zero.

Benchmark (10M iterations, 1000 varied inputs, JVM warm):
  parseLong 10-digit:  22.3 → 19.8 ns/op  (-11%)
  parseLong 3-digit:   12.0 →  9.3 ns/op  (-22%)
  parseInt  9-digit:   20.7 → 19.0 ns/op  (-8%)
misrasaurabh1 added the enhancement (New feature or request) and java (Pull requests that update java code) labels on Feb 25, 2026
misrasaurabh1 and others added 5 commits February 25, 2026 12:16
Replaces 4x Chars.indexOf + 4x parseInt calls (8 passes over the input)
with a single character scan, yielding a ~1.6x speedup in benchmarks.
Each octet is accumulated directly in the main loop, with leading/trailing
dot handling preserved.
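
As an illustration of the single-scan idea, the sketch below accumulates each octet while walking the string once. It is not the code from this commit: the class and method names are hypothetical, and it simply rejects leading and trailing dots, which may differ from the handling the commit preserves.

```java
final class Ipv4ScanSketch {

    // Hypothetical single-pass parser: returns the address as a big-endian int.
    static int parseIPv4(CharSequence s) {
        int result = 0;
        int octet = 0;
        int digits = 0; // digits seen in the current octet
        int dots = 0;
        for (int i = 0, n = s.length(); i < n; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                octet = octet * 10 + (c - '0');
                if (++digits > 3 || octet > 255) {
                    throw new NumberFormatException("octet out of range at " + i);
                }
            } else if (c == '.') {
                // Assumption: empty octets (leading or doubled dots) are errors.
                if (digits == 0 || ++dots > 3) {
                    throw new NumberFormatException("misplaced '.' at " + i);
                }
                result = (result << 8) | octet;
                octet = 0;
                digits = 0;
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (dots != 3 || digits == 0) {
            throw new NumberFormatException("expected four octets");
        }
        return (result << 8) | octet;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(parseIPv4("192.168.0.1"))); // prints c0a80001
    }
}
```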

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…arsing

Replace 5 parseHexLong() calls (with loop overhead per call) with direct
unrolled hexToDecimal() calls at fixed UUID positions. Benchmarks show
~1.35x speedup, benefiting UUID column comparisons in WHERE clauses and
UUID field ingestion via line protocol.
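
The fixed-position approach can be sketched as follows. Only the name hexToDecimal comes from the commit message; its body here, the hex4 helper, the class name, and the validation are assumptions, and the real code may be unrolled further.

```java
import java.util.UUID;

final class UuidParseSketch {

    // Assumed behaviour for hexToDecimal: map one hex character to 0..15.
    static int hexToDecimal(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        final int lower = c | 32; // fold 'A'..'F' onto 'a'..'f'
        if (lower >= 'a' && lower <= 'f') {
            return lower - 'a' + 10;
        }
        throw new NumberFormatException("not a hex digit: " + c);
    }

    // Hypothetical helper: four nibbles at a fixed offset, no loop.
    static long hex4(CharSequence s, int off) {
        return ((long) hexToDecimal(s.charAt(off)) << 12)
                | (hexToDecimal(s.charAt(off + 1)) << 8)
                | (hexToDecimal(s.charAt(off + 2)) << 4)
                | hexToDecimal(s.charAt(off + 3));
    }

    static void checkShape(CharSequence s) {
        if (s.length() != 36 || s.charAt(8) != '-' || s.charAt(13) != '-'
                || s.charAt(18) != '-' || s.charAt(23) != '-') {
            throw new NumberFormatException("not in 8-4-4-4-12 form");
        }
    }

    // High 64 bits: characters 0..7, 9..12 and 14..17.
    static long parseHi(CharSequence s) {
        checkShape(s);
        return (hex4(s, 0) << 48) | (hex4(s, 4) << 32) | (hex4(s, 9) << 16) | hex4(s, 14);
    }

    // Low 64 bits: characters 19..22 and 24..35.
    static long parseLo(CharSequence s) {
        checkShape(s);
        return (hex4(s, 19) << 48) | (hex4(s, 24) << 32) | (hex4(s, 28) << 16) | hex4(s, 32);
    }

    public static void main(String[] args) {
        final String text = "123e4567-e89b-12d3-a456-426614174000";
        final UUID expected = UUID.fromString(text);
        System.out.println(parseHi(text) == expected.getMostSignificantBits());
        System.out.println(parseLo(text) == expected.getLeastSignificantBits());
    }
}
```

Because the dash positions are known in advance, no per-segment length tracking or loop bookkeeping is needed.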

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… extraction

Replace 5 generic appendHexPadded calls (each requiring Long.numberOfLeadingZeros,
padding loop, and virtual dispatch) with 32 direct bit-shift nibble extractions
into hexDigits. Benchmarked at 1.60x speedup (50.87 → 31.73 ns/op).

Stop calling getIPv4Netmask() (which does indexOf internally) and then
issuing a second indexOf call for the same '/' position. A single indexOf
call now serves both netmask extraction and IP address parsing.
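
One way to picture the change: locate '/' once and derive both the address and the mask from that index. The sketch below is an assumption-laden illustration and does not use the real getIPv4Netmask signature; all names, the int[] return shape, and the "no slash means /32" rule are hypothetical.

```java
final class SubnetSplitSketch {

    // Hypothetical: returns {address, netmask} for "a.b.c.d" or "a.b.c.d/bits".
    static int[] parseSubnet(String s) {
        final int slash = s.indexOf('/'); // the only scan for '/'
        if (slash < 0) {
            // Assumption: a bare address is treated as a /32.
            return new int[]{parseAddress(s, 0, s.length()), -1};
        }
        final int bits = Integer.parseInt(s.substring(slash + 1));
        if (bits < 0 || bits > 32) {
            throw new NumberFormatException("prefix length out of range: " + bits);
        }
        // Java masks int shift counts to 5 bits, so /0 needs its own case.
        final int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return new int[]{parseAddress(s, 0, slash), mask};
    }

    // Minimal dotted-quad parser over s[from, to), reusing the slash index as 'to'.
    static int parseAddress(CharSequence s, int from, int to) {
        int result = 0, octet = 0, digits = 0, dots = 0;
        for (int i = from; i < to; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                octet = octet * 10 + (c - '0');
                if (++digits > 3 || octet > 255) {
                    throw new NumberFormatException("octet out of range at " + i);
                }
            } else if (c == '.' && digits > 0 && ++dots <= 3) {
                result = (result << 8) | octet;
                octet = 0;
                digits = 0;
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (dots != 3 || digits == 0) {
            throw new NumberFormatException("expected four octets");
        }
        return (result << 8) | octet;
    }

    public static void main(String[] args) {
        final int[] parsed = parseSubnet("10.0.0.0/8");
        System.out.println(Integer.toHexString(parsed[0])); // prints a000000
        System.out.println(Integer.toHexString(parsed[1])); // prints ff000000
    }
}
```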
…bble extraction

Replace appendHex(sink, x, true) calls in appendLong256Two/Three/Four with
appendHex16 - 16 direct bit-shift+hexDigits lookups - eliminating Long.numberOfLeadingZeros,
longHexAppenderPad64 array lookup, and virtual dispatch overhead. ~1.11x speedup.
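
For readers unfamiliar with the technique, a fixed-width appender built from direct nibble lookups might look like the sketch below. The names appendHex16 and hexDigits come from the commit message, but this body, the StringBuilder sink, and the lower-case digit table are assumptions, not the code under review.

```java
final class Hex16Sketch {

    // Assumed lookup table; the real hexDigits may differ (for example in case).
    private static final char[] hexDigits = "0123456789abcdef".toCharArray();

    // Appends exactly 16 hex characters, most significant nibble first.
    // No leading-zero count, padding loop, or per-width dispatch is needed
    // because the output width is fixed.
    static void appendHex16(StringBuilder sink, long x) {
        sink.append(hexDigits[(int) (x >>> 60) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 56) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 52) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 48) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 44) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 40) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 36) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 32) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 28) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 24) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 20) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 16) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 12) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 8) & 0xF]);
        sink.append(hexDigits[(int) (x >>> 4) & 0xF]);
        sink.append(hexDigits[(int) x & 0xF]);
    }

    static String toHex16(long x) {
        final StringBuilder sink = new StringBuilder(16);
        appendHex16(sink, x);
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex16(1L));  // prints 0000000000000001
        System.out.println(toHex16(-1L)); // prints ffffffffffffffff
    }
}
```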