perf(core): optimize parseInt/parseLong hot paths in Numbers.java by misrasaurabh1 · Pull Request #5 · codeflash-ai/questdb

misrasaurabh1 · 2026-02-25T19:01:08Z

Summary

Optimize the inner loops of parseInt0 and parseLong0 in Numbers.java to reduce branch overhead on the digit path, which is taken on every character of a valid number string.

Together, the two functions are called from 334 sites across the codebase (265 parseInt, 69 parseLong), making them a recurring cost in SQL parsing, CSV import, ILP ingestion, and config loading.

Changes

parseInt0 — reorder if-else-if so the digit range check comes first.

Before:

if (c == '_') { ... }           // checked first, rarely true
else if (c < '0' || c > '9') { ... }
else { /* digit — hot path */ }

After:

if (c >= '0' && c <= '9') { /* digit — hot path */ }
else if (c == '_') { ... }
else { throw; }

For a digit, this reduces 3 comparisons to 2 per loop iteration.
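
As an illustration of the reordered loop, here is a minimal sketch of a digit-first parse. It is not the actual Numbers.parseInt0: the class name `DigitFirstParseInt`, the `long` accumulator, and the simplified separator rule (an underscore may not precede the first digit) are assumptions made for this example.

```java
// Hypothetical sketch of a digit-first parse loop; not the actual Numbers.parseInt0.
final class DigitFirstParseInt {
    static int parse(CharSequence s) {
        final int n = s.length();
        int i = 0;
        final boolean negative = n > 0 && s.charAt(0) == '-';
        if (negative) {
            i++;
        }
        long val = 0;   // long accumulator keeps the overflow check simple
        int digits = 0;
        for (; i < n; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                // Hot path: two comparisons, then accumulate.
                val = val * 10 + (c - '0');
                digits++;
                if (val > 2147483648L) {
                    throw new NumberFormatException("int overflow");
                }
            } else if (c == '_') {
                // Assumed rule: a separator may not precede the first digit.
                if (digits == 0) {
                    throw new NumberFormatException("separator before digits");
                }
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (digits == 0) {
            throw new NumberFormatException("no digits");
        }
        final long signed = negative ? -val : val;
        if (signed > Integer.MAX_VALUE) {
            throw new NumberFormatException("int overflow");
        }
        return (int) signed;
    }
}
```

With this ordering, a digit is classified by the first branch alone; the underscore and error branches are only reached for non-digit characters.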

parseLong0 — replace the switch (c | 32) loop with an if-else-if chain, putting the digit check first.

Before:

switch (c | 32) {          // bitwise OR applied to every character
    case 'l':  ...
    case 127:  ...          // '_' after OR
    default:
        if (c < '0' || c > '9') throw;
        /* digit */
}

After:

if (c >= '0' && c <= '9') { /* digit */ }
else if (c == '_') { ... }
else if (c == 'l' || c == 'L') { ... }   // no OR needed
else { throw; }

For a digit, this reduces 5 operations (1 OR + 2 switch comparisons + 2 range checks) to 2 comparisons.

The L-suffix validity check is also tightened from i == 0 (incorrect for non-zero start offsets) to digitCounter == 0, which correctly rejects 'L' appearing before any digits regardless of starting position.
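
The difference between the two checks can be sketched as follows. This is a simplified stand-in, not the real parseLong0: the class `LSuffixCheck`, the `(s, p, lim)` signature, the rule that 'L' must be the last character, and the use of Math.multiplyExact/Math.addExact for overflow are all assumptions made for illustration.

```java
// Hypothetical sketch of the digit-first chain with a digitCounter-based L check.
final class LSuffixCheck {
    // Parses s[p, lim) as a non-negative long with optional '_' separators
    // and an optional trailing 'l' or 'L'.
    static long parse(CharSequence s, int p, int lim) {
        long val = 0;
        int digitCounter = 0;
        for (int i = p; i < lim; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                // Throws ArithmeticException on overflow (a simplification).
                val = Math.addExact(Math.multiplyExact(val, 10L), c - '0');
                digitCounter++;
            } else if (c == '_') {
                if (digitCounter == 0) {
                    throw new NumberFormatException("separator before digits");
                }
            } else if (c == 'l' || c == 'L') {
                // An "i == 0" test would never fire when p > 0;
                // counting digits works for any start offset.
                if (digitCounter == 0 || i + 1 < lim) {
                    throw new NumberFormatException("misplaced L suffix");
                }
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (digitCounter == 0) {
            throw new NumberFormatException("no digits");
        }
        return val;
    }
}
```

For example, parsing the range [2, 3) of "abL" starts at i == 2, so a check against i == 0 does not detect that 'L' is the first character of the range, while digitCounter == 0 does. The old switch relied on the identities ('L' | 32) == 'l' and ('_' | 32) == 127, which the explicit comparisons make unnecessary.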

Evidence

Informal micro-benchmark (10M iterations over 1000 varied random strings, JVM warmed with 200K iterations first):

| Operation | Before | After | Delta |
|---|---|---|---|
| `parseLong`, 10-digit | 22.3 ns/op | 19.8 ns/op | -11% |
| `parseLong`, 3-digit | 12.0 ns/op | 9.3 ns/op | -22% |
| `parseInt`, 9-digit | 20.7 ns/op | 19.0 ns/op | -8% |
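
A harness of roughly this shape could produce numbers like those above. It is only a sketch: the class `ParseBench` and its methods are hypothetical, and `Long.parseLong` stands in for the QuestDB method under test. A hand-rolled timing loop like this is informal by nature; a JMH benchmark would give more trustworthy figures.

```java
// Hypothetical warm-up-then-measure harness; Long.parseLong is a stand-in
// for the method actually being benchmarked.
final class ParseBench {
    // Generates `count` random decimal strings of exactly `digits` digits (digits <= 18).
    static String[] randomInputs(int count, int digits, long seed) {
        final java.util.Random rnd = new java.util.Random(seed);
        final String[] out = new String[count];
        for (int i = 0; i < count; i++) {
            final StringBuilder sb = new StringBuilder(digits);
            sb.append((char) ('1' + rnd.nextInt(9)));       // no leading zero
            for (int d = 1; d < digits; d++) {
                sb.append((char) ('0' + rnd.nextInt(10)));
            }
            out[i] = sb.toString();
        }
        return out;
    }

    // Returns the average nanoseconds per parse after `warmup` untimed calls.
    static double nsPerOp(String[] inputs, int warmup, int iterations) {
        long sink = 0;  // consumed below so the JIT cannot discard the parses
        for (int i = 0; i < warmup; i++) {
            sink += Long.parseLong(inputs[i % inputs.length]);
        }
        final long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += Long.parseLong(inputs[i % inputs.length]);
        }
        final long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print("");
        }
        return (double) elapsed / iterations;
    }
}
```

Matching the parameters described above would mean calling `ParseBench.nsPerOp(ParseBench.randomInputs(1000, 10, 42L), 200_000, 10_000_000)`, where the seed 42 is arbitrary.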

Test plan

NumbersTest — 195 tests, all pass
SqlLexerTest, SqlParserTest — 1000 tests, all pass
CastTest, FunctionParserTest — 522 tests, all pass

Put the digit check first in parseInt0 and replace the switch-based parseLong0 loop with an if-else-if chain that tests digits first.

For parseInt0, the previous ordering checked for underscore before digits, costing one extra comparison per character in the common case. Checking c >= '0' && c <= '9' first reduces the hot path from 3 comparisons to 2 per digit.

For parseLong0, the switch on (c | 32) applied a bitwise OR to every character, including digits, to unify 'l' and 'L' handling. The if-else-if chain avoids the OR entirely for digits; direct character comparisons (c == 'l' || c == 'L') cover the L-suffix case.

The L-position check is tightened from (i == 0) to (digitCounter == 0), which correctly rejects 'L' after separators and also works when the start offset p is non-zero.

Benchmark (10M iterations, 1000 varied inputs, JVM warm):
parseLong 10-digit: 22.3 → 19.8 ns/op (-11%)
parseLong 3-digit: 12.0 → 9.3 ns/op (-22%)
parseInt 9-digit: 20.7 → 19.0 ns/op (-8%)

Replace 4x Chars.indexOf + 4x parseInt calls (8 passes over the input) with a single character scan, yielding ~1.6x speedup in benchmarks. Each octet is accumulated directly in the main loop, with leading/trailing dot handling preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
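
The single-scan idea can be sketched as below. This is an assumption-laden illustration, not the code from the commit: the class `Ipv4Scan` is hypothetical, and the sketch simply rejects leading and trailing dots, which may differ from the dot handling the commit preserves.

```java
// Hypothetical single-pass IPv4 parser: one loop, octets accumulated in place.
final class Ipv4Scan {
    static int parse(CharSequence s) {
        int result = 0;
        int octet = 0;
        int digits = 0;   // digits seen in the current octet
        int dots = 0;
        for (int i = 0, n = s.length(); i < n; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                octet = octet * 10 + (c - '0');
                digits++;
                if (octet > 255 || digits > 3) {
                    throw new NumberFormatException("bad octet at " + i);
                }
            } else if (c == '.') {
                if (digits == 0 || dots == 3) {
                    throw new NumberFormatException("misplaced dot at " + i);
                }
                result = (result << 8) | octet;   // commit the finished octet
                octet = 0;
                digits = 0;
                dots++;
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (dots != 3 || digits == 0) {
            throw new NumberFormatException("expected four octets");
        }
        return (result << 8) | octet;
    }
}
```

Every character is visited once, so no separate search for the dots and no per-octet substring parse is needed.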

…arsing

Replace 5 parseHexLong() calls (with loop overhead per call) with direct unrolled hexToDecimal() calls at fixed UUID positions. Benchmarks show ~1.35x speedup, benefiting UUID column comparisons in WHERE clauses and UUID field ingestion via line protocol.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
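
A rough sketch of what unrolled decoding at fixed UUID positions can look like follows. The class `UuidUnrolled` and the helper `nibble` are hypothetical names for this example (the commit's own helper is hexToDecimal, whose signature is not shown here), and the canonical 36-character layout with dashes at indices 8, 13, 18, and 23 is assumed.

```java
// Hypothetical unrolled UUID decoder: each hex digit is read at a fixed index.
final class UuidUnrolled {
    private static long nibble(CharSequence s, int i) {
        final char c = s.charAt(i);
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        final int lower = c | 32;   // fold 'A'..'F' onto 'a'..'f'
        if (lower >= 'a' && lower <= 'f') {
            return lower - 'a' + 10;
        }
        throw new NumberFormatException("bad hex digit at " + i);
    }

    private static void checkLayout(CharSequence s) {
        if (s.length() != 36 || s.charAt(8) != '-' || s.charAt(13) != '-'
                || s.charAt(18) != '-' || s.charAt(23) != '-') {
            throw new NumberFormatException("not a canonical UUID");
        }
    }

    // Most significant 64 bits: digits at 0-7, 9-12, 14-17.
    static long hi(CharSequence s) {
        checkLayout(s);
        return nibble(s, 0) << 60 | nibble(s, 1) << 56 | nibble(s, 2) << 52 | nibble(s, 3) << 48
                | nibble(s, 4) << 44 | nibble(s, 5) << 40 | nibble(s, 6) << 36 | nibble(s, 7) << 32
                | nibble(s, 9) << 28 | nibble(s, 10) << 24 | nibble(s, 11) << 20 | nibble(s, 12) << 16
                | nibble(s, 14) << 12 | nibble(s, 15) << 8 | nibble(s, 16) << 4 | nibble(s, 17);
    }

    // Least significant 64 bits: digits at 19-22, 24-35.
    static long lo(CharSequence s) {
        checkLayout(s);
        return nibble(s, 19) << 60 | nibble(s, 20) << 56 | nibble(s, 21) << 52 | nibble(s, 22) << 48
                | nibble(s, 24) << 44 | nibble(s, 25) << 40 | nibble(s, 26) << 36 | nibble(s, 27) << 32
                | nibble(s, 28) << 28 | nibble(s, 29) << 24 | nibble(s, 30) << 20 | nibble(s, 31) << 16
                | nibble(s, 32) << 12 | nibble(s, 33) << 8 | nibble(s, 34) << 4 | nibble(s, 35);
    }
}
```

Because every index and shift is a compile-time constant, there is no loop counter or per-group bounds bookkeeping; the results can be cross-checked against java.util.UUID.fromString.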

… extraction

Replace 5 generic appendHexPadded calls (each requiring Long.numberOfLeadingZeros, a padding loop, and virtual dispatch) with 32 direct bit-shift nibble extractions into hexDigits. Benchmarked at 1.60x speedup (50.87 → 31.73 ns/op).

Replace the pattern of calling getIPv4Netmask() (which does indexOf internally) followed by a second indexOf call for the same '/' position. A single indexOf call now serves both netmask extraction and IP address parsing.
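
The pattern can be illustrated with a small sketch in which the '/' position is located once and reused for both halves. Everything here is an assumption made for the example: the class `SubnetParse`, the `int[] {network, netmask}` return shape, the default of a 32-bit mask when no '/' is present, and the inline address parser; none of it is taken from getIPv4Subnet itself.

```java
// Hypothetical subnet parser: a single indexOf serves netmask and address parsing.
final class SubnetParse {
    // Returns {network address, netmask} for input such as "10.1.2.3/8".
    static int[] parse(String s) {
        final int slash = s.indexOf('/');             // the only scan for '/'
        final int end = slash < 0 ? s.length() : slash;
        final int bits = slash < 0 ? 32 : Integer.parseInt(s.substring(slash + 1));
        if (bits < 0 || bits > 32) {
            throw new NumberFormatException("bad prefix length");
        }
        final int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return new int[]{parseAddress(s, 0, end) & mask, mask};
    }

    // Parses s[lo, hi) as a dotted-quad address in one pass.
    private static int parseAddress(CharSequence s, int lo, int hi) {
        int result = 0;
        int octet = 0;
        int digits = 0;
        int dots = 0;
        for (int i = lo; i < hi; i++) {
            final char c = s.charAt(i);
            if (c >= '0' && c <= '9') {
                octet = octet * 10 + (c - '0');
                digits++;
                if (octet > 255) {
                    throw new NumberFormatException("octet out of range");
                }
            } else if (c == '.' && digits > 0 && dots < 3) {
                result = (result << 8) | octet;
                octet = 0;
                digits = 0;
                dots++;
            } else {
                throw new NumberFormatException("illegal character at " + i);
            }
        }
        if (dots != 3 || digits == 0) {
            throw new NumberFormatException("expected four octets");
        }
        return (result << 8) | octet;
    }
}
```

The point of the sketch is only that `slash` is computed once and passed along, rather than rediscovered by a second search inside a helper.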

…bble extraction

Replace appendHex(sink, x, true) calls in appendLong256Two/Three/Four with appendHex16 (16 direct bit-shift + hexDigits lookups), eliminating Long.numberOfLeadingZeros, the longHexAppenderPad64 array lookup, and virtual dispatch overhead. ~1.11x speedup.
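
A fixed-width nibble extraction of this kind might look like the sketch below. It borrows the name appendHex16 from the commit message, but the body, the `StringBuilder` sink, the lowercase digit table, and the enclosing class `HexNibbles` are assumptions for illustration rather than the committed code.

```java
// Hypothetical appendHex16: always emits 16 hex digits, one table lookup per nibble.
final class HexNibbles {
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    static void appendHex16(StringBuilder sink, long x) {
        // Unrolled from the most significant nibble down; no leading-zero
        // computation and no padding loop are needed for a fixed width.
        sink.append(HEX_DIGITS[(int) (x >>> 60) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 56) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 52) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 48) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 44) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 40) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 36) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 32) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 28) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 24) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 20) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 16) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 12) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 8) & 0xF]);
        sink.append(HEX_DIGITS[(int) (x >>> 4) & 0xF]);
        sink.append(HEX_DIGITS[(int) x & 0xF]);
    }
}
```

Because the output width is known in advance, padding falls out of the unrolling: zero nibbles are emitted as '0' like any other digit.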

misrasaurabh1 added the enhancement and java labels on Feb 25, 2026

misrasaurabh1 and others added 5 commits February 25, 2026 12:16

Optimize getIPv4Subnet: eliminate redundant Chars.indexOf call

3a2c24c


misrasaurabh1 wants to merge 6 commits into master from codeflash/optimize-number-parsing
