Has any thought been given to abstracting over the input type? I'm thinking specifically of binary inputs like Array[Byte], fs2.Chunk, scodec.bits.ByteVector, or java.nio.ByteBuffer. I'm struggling to compete with an HTTP/1 parser that works on Array[Byte].
The obvious answer is scodec. The old fs2-http parser built on it, while beautiful, is also much slower. I'm dreaming of a scodec with the cats-parse mutability trick.
I spiked on it a bit. Problems I encountered:
- To remain compatible, we have to do unspeakable things with the String and Char based parsers when the underlying type is binary. If we added binary parsers, we'd have to do unspeakable things when the underlying type is characters.
- Some desired types, like BitVector, are Long-indexed instead of Int. This ripples at least into State and Expectation.
- A typeclass for inputs seems the right way, but maintaining compatibility gets even harder.
- All of this could be overcome with a parallel BinParser and BinParser0, but the duplication is an awful shame. It might not even be the same library anymore.
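To make the typeclass idea concrete, here is a minimal sketch of what an input abstraction could look like. Everything in it (`InputLike`, `InputSketch`, `prefixLength`) is a hypothetical name for illustration and not part of cats-parse. Offsets are `Long` so that a BitVector-style input could fit, which is the same change that would ripple into State and Expectation.

```scala
// Hypothetical typeclass over input types; none of these names exist in cats-parse.
trait InputLike[I, E] {
  // Long rather than Int, so Long-indexed inputs (e.g. a BitVector) could have an instance.
  def length(input: I): Long
  def elemAt(input: I, offset: Long): E
}

object InputLike {
  // Character input, as parsers consume today.
  implicit val stringInput: InputLike[String, Char] =
    new InputLike[String, Char] {
      def length(input: String): Long = input.length.toLong
      def elemAt(input: String, offset: Long): Char = input.charAt(offset.toInt)
    }

  // Binary input.
  implicit val byteArrayInput: InputLike[Array[Byte], Byte] =
    new InputLike[Array[Byte], Byte] {
      def length(input: Array[Byte]): Long = input.length.toLong
      def elemAt(input: Array[Byte], offset: Long): Byte = input(offset.toInt)
    }
}

object InputSketch {
  // Counts how many leading elements satisfy `p`, using a mutable offset.
  // This only illustrates the shape of a primitive written once for any input type.
  def prefixLength[I, E](input: I)(p: E => Boolean)(implicit in: InputLike[I, E]): Long = {
    val len = in.length(input)
    var offset = 0L
    while (offset < len && p(in.elemAt(input, offset))) offset += 1
    offset
  }
}
```

The same `prefixLength` works on both `"GET /"` and its ASCII bytes, provided the predicate's parameter type is written out, e.g. `(c: Char) => c.isLetter`. Note that a generic element type like this would likely box every `Char` and `Byte` unless specialized, which works against the performance goal and is part of why compatibility gets harder.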
A more modest abstraction is to accept CharSequence as input, at which point we can wrap binary inputs with something like Netty's AsciiString. It's still abusive with respect to Char vs. Byte. It also doesn't help with HTTP/2, where we might benefit from a BitVector.
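As a rough illustration of the CharSequence route, the wrapper below views a byte array as characters without copying. `AsciiBytes` is a hypothetical stand-in written for this post, not Netty's AsciiString and not anything in cats-parse, and it would only help if the parse entry point accepted a CharSequence instead of a String.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical zero-copy view of bytes[start, end) as a CharSequence.
final class AsciiBytes(bytes: Array[Byte], start: Int, end: Int) extends CharSequence {
  def length(): Int = end - start

  // Each byte is widened to a Char; the mask keeps bytes >= 0x80 from turning negative.
  // This widening is exactly the Char vs. Byte abuse mentioned above.
  def charAt(index: Int): Char = {
    if (index < 0 || index >= length()) throw new IndexOutOfBoundsException(index.toString)
    (bytes(start + index) & 0xff).toChar
  }

  // Shares the underlying array instead of copying it.
  def subSequence(from: Int, to: Int): CharSequence = {
    if (from < 0 || to > length() || from > to)
      throw new IndexOutOfBoundsException(s"$from, $to")
    new AsciiBytes(bytes, start + from, start + to)
  }

  // ISO-8859-1 maps each byte to the same code point that charAt returns.
  override def toString: String =
    new String(bytes, start, end - start, StandardCharsets.ISO_8859_1)
}

object AsciiBytes {
  def apply(bytes: Array[Byte]): AsciiBytes = new AsciiBytes(bytes, 0, bytes.length)
}
```

With this, `AsciiBytes(requestBytes).subSequence(0, 3)` yields the method token of a request line without allocating a String until `toString` is called. It remains byte-granular and Int-indexed, so it does nothing for bit-level inputs.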
This is probably all a terrible idea, but I thought I'd ask.