Number parsing possible optimization #2069

@pps83

I had some quick JSON parsing code and wanted to compare it with glaze. For some reason my code was over 2x faster. I then tried fast_float in place of my custom number parsing, and my code became about twice as slow. After switching to std::from_chars, it appears that parsing with std::from_chars is much faster than with fast_float on VS2022 x64 (though still slower than my custom code).

These are my measured timings:

     4.031ms  `processJsonArray_cpp`
     5.084ms  `processJsonArray_cpp_std_from_chars`
     9.033ms  `processJsonArray_cpp_fastfloat_from_chars`
    10.333ms  `processJsonArray_glz`

The test data is a large array of JSON objects that look like this one:

{"ev":"A","sym":"HOOD","v":789,"vw":14288,"o":14294,"c":14294,"h":14294,"l":14294,"z":17,"s":1762246800000}

I parse it all to this structure:

struct AggregateBar
{
    std::string_view sym;
    unsigned v; // volume
    unsigned vw;
    unsigned o, c, h, l; // open, close, high, low
    unsigned z; // averageTradeSize
    long long s;
};
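For reference, reading one integer field with std::from_chars might look roughly like the sketch below. This is not the code that was benchmarked above; `read_int` is a hypothetical helper name, and the sketch assumes the cursor already points at the first digit of the value.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of glaze or fast_float): parses a decimal
// integer at the start of `text`. On success, `text` is advanced past the
// digits. On failure (no digits, or the value does not fit in Int),
// `text` is left unchanged and std::nullopt is returned.
template <class Int>
std::optional<Int> read_int(std::string_view& text)
{
    Int value{};
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{})
        return std::nullopt;
    text.remove_prefix(static_cast<std::size_t>(ptr - first));
    return value;
}
```

With the sample record, a field such as "v" fits a 32-bit unsigned, while "s" (1762246800000) needs a 64-bit type; std::from_chars reports std::errc::result_out_of_range when the target type is too small.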

I saw that glaze uses fast_float, and I wanted to see whether I could make it use the std implementation instead and possibly get faster parsing results. However, glaze appears to use both std::from_chars and glz::from_chars for parsing. Assuming that glz::from_chars uses fast_float, I looked into the code.

I saw this pattern used all over the place:

         while (glz::fast_float::is_integer(*p)) {
            // a multiplication by 10 is cheaper than an arbitrary integer multiplication
            i = 10 * i + uint64_t(*p - UC('0')); // might overflow, we will handle the overflow later
            ++p;
         }

IMO, this could be rewritten like this:

static inline unsigned dv(char c) // get digit value
{
    return unsigned(c - '0');
}
unsigned n;
while ((n = dv(*p)) <= 9)
{
    i = 10 * i + n;
    ++p;
}

This removes one branch per parsed digit. Instead of checking each char with !(c > '9' || c < '0'), unsigned(c - '0') yields an unsigned n that is <= 9 only for valid digits. I'm not sure whether compilers already perform this optimization automatically nowadays. Also, I think uint64_t(*p - UC('0')) might not be optimal for 32-bit CPUs.
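To check that the single-comparison test accepts exactly the same characters as the two-comparison range check, a self-contained sketch could look like this. The names `is_digit_two_compares`, `digit_value`, `parse_digits` and `digit_tests_agree` are made up for illustration and are not glaze or fast_float APIs; the loop assumes a NUL-terminated input and does not detect overflow.

```cpp
#include <cstdint>

// Two-comparison digit test, written the same way as the range check
// described above.
inline bool is_digit_two_compares(char c)
{
    return !(c > '9' || c < '0');
}

// Single-comparison variant: c - '0' is evaluated as int, and converting a
// negative result to unsigned wraps around to a large value, so only the
// characters '0'..'9' map to 0..9.
inline unsigned digit_value(char c)
{
    return unsigned(c - '0');
}

// Accumulates the leading decimal digits at p and advances p past them.
// Assumes a NUL-terminated buffer; overflow is not handled here.
inline std::uint64_t parse_digits(const char*& p)
{
    std::uint64_t i = 0;
    unsigned n;
    while ((n = digit_value(*p)) <= 9) {
        i = 10 * i + n;
        ++p;
    }
    return i;
}

// Returns true if both digit tests agree for every possible char value.
inline bool digit_tests_agree()
{
    for (int v = 0; v < 256; ++v) {
        const char c = static_cast<char>(static_cast<unsigned char>(v));
        if (is_digit_two_compares(c) != (digit_value(c) <= 9))
            return false;
    }
    return true;
}
```

Whether this is actually faster would still need to be measured or checked in the generated assembly, since an optimizer may already turn the range check into a single subtract-and-compare.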
