I had some quick JSON parsing code and wanted to compare it with glaze. For some reason my code was over 2x faster. Then I tried using fast_float instead of my custom number parsing, and my code became about twice as slow. I then changed it to use `std::from_chars`, and with VS2022 on x64 it appears to be much faster than fast_float (though still slower than my custom code).
These are my measured timings:
4.031ms `processJsonArray_cpp`
5.084ms `processJsonArray_cpp_std_from_chars`
9.033ms `processJsonArray_cpp_fastfloat_from_chars`
10.333ms `processJsonArray_glz`
The test data is a large array of JSON objects that look like this one:
{"ev":"A","sym":"HOOD","v":789,"vw":14288,"o":14294,"c":14294,"h":14294,"l":14294,"z":17,"s":1762246800000}
I parse it all to this structure:
struct AggregateBar
{
std::string_view sym;
unsigned v; // volume
unsigned vw;
unsigned o, c, h, l; // open, close, high, low
unsigned z; // averageTradeSize
long long s;
};
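For reference, parsing one of the unsigned fields with `std::from_chars` can be sketched roughly like this. It is a minimal illustration and not the actual benchmarked code; `parse_unsigned` is a hypothetical helper name, and the caller is assumed to have already positioned `p` at the first digit of the value.

```cpp
#include <cassert>
#include <charconv>
#include <system_error>

// Hypothetical helper (not taken from glaze or from the benchmarked parser):
// parse an unsigned decimal value from [p, end) and advance p past the digits.
// Returns false and leaves p untouched if no valid number is found.
static bool parse_unsigned(const char*& p, const char* end, unsigned& out)
{
    assert(p <= end);
    auto [ptr, ec] = std::from_chars(p, end, out, 10);
    if (ec != std::errc{})
        return false; // not a number, or out of range for unsigned
    p = ptr;          // ptr points at the first character that was not consumed
    return true;
}
```

Given the fragment `14288,"o":14294`, a call would store 14288 in `out` and leave `p` at the comma, so the caller can continue with the next key.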
I saw that glaze uses fast_float, and I wanted to see if I could make it use the standard library instead and possibly get faster parsing results. But it appears that glaze uses both `std::from_chars` and `glz::from_chars` for parsing. I assumed that `glz::from_chars` uses fast_float and looked into the code.
I saw this pattern used all over the place:
while (glz::fast_float::is_integer(*p)) {
// a multiplication by 10 is cheaper than an arbitrary integer multiplication
i = 10 * i + uint64_t(*p - UC('0')); // might overflow, we will handle the overflow later
++p;
}
IMO, this could be rewritten like this:
static inline unsigned dv(char c) // get digit value
{
return unsigned(c - '0');
}
unsigned n;
while ((n = dv(*p)) <= 9)
{
i = 10 * i + n;
++p;
}
This removes one branch per parsed digit. Instead of checking each char with `!(c > '9' || c < '0')`, `unsigned(c - '0')` yields an unsigned `n` that is `<= 9` only for valid digits. I'm not sure whether compilers make this optimization automatically nowadays. Also, I think `uint64_t(*p - UC('0'))` might not be optimal for 32-bit CPUs.
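To make the equivalence concrete, here is a small self-contained sketch that checks both digit tests against every possible `char` value and wraps the single-comparison loop in a function. The names `is_digit_two_compares`, `is_digit_one_compare`, `digit_tests_agree` and `parse_digits` are made up for this illustration and are not glaze or fast_float APIs. Overflow is not handled, and the input is assumed to end with a non-digit character.

```cpp
#include <cassert>
#include <cstdint>

// Two-comparison digit test, written the way the check is described above.
static inline bool is_digit_two_compares(char c)
{
    return !(c > '9' || c < '0');
}

// Single-comparison digit test: anything below '0' wraps around to a large
// unsigned value, so one `<= 9` check rejects both sides of the range.
static inline bool is_digit_one_compare(char c)
{
    return unsigned(c - '0') <= 9u;
}

// Returns true if both tests give the same answer for all 256 char values.
static bool digit_tests_agree()
{
    for (int v = 0; v < 256; ++v) {
        char c = static_cast<char>(static_cast<unsigned char>(v));
        if (is_digit_two_compares(c) != is_digit_one_compare(c))
            return false;
    }
    return true;
}

// Parses a run of decimal digits starting at p and advances p to the first
// non-digit. The input must be terminated by a non-digit (e.g. ',' or '\0').
static std::uint64_t parse_digits(const char*& p)
{
    assert(p != nullptr);
    std::uint64_t i = 0;
    unsigned n;
    while ((n = unsigned(*p - '0')) <= 9u) {
        i = 10 * i + n; // may overflow on very long inputs; not handled here
        ++p;
    }
    return i;
}
```

Whether this is actually faster depends on the compiler; comparing the generated assembly for the two digit tests would show whether it already performs the transformation on its own.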