
TurboQuant again! #7829

Open
connortsui20 wants to merge 11 commits into develop from ct/tq-pull-out

Conversation

@connortsui20
Contributor

@connortsui20 connortsui20 commented May 7, 2026

Summary

Tracking issue: #7830

Moves TurboQuant out of vortex-tensor into a new vortex-turboquant crate.

The first commit was mostly copying and pasting a bunch of code, as well as adding the unpack method to replace canonicalization. The second commit was cleaning up everything holistically.

A lot of the code in vortex-tensor was reviewed pretty lightly because we knew it was unstable. Now that we are more certain about the implementation (not necessarily the exact design, but the actual implementation of the TQ algorithms), I think it is worth reviewing everything as a whole.

Testing

These tests were mostly there before, but now there are more!

@connortsui20 connortsui20 changed the title Ct/tq pull out TurboQuant again! May 7, 2026
@connortsui20 connortsui20 added the changelog/feature A new feature label May 7, 2026
@connortsui20 connortsui20 mentioned this pull request May 7, 2026
7 tasks
@connortsui20 connortsui20 force-pushed the ct/tq-pull-out branch 2 times, most recently from 507349c to 51f347e on May 7, 2026 at 17:32
Comment thread vortex-turboquant/src/lib.rs Outdated
Comment thread vortex-turboquant/src/vector/normalize.rs Outdated
@connortsui20 connortsui20 force-pushed the ct/tq-pull-out branch 2 times, most recently from 25c7339 to f473276 on May 7, 2026 at 21:54
@codspeed-hq

codspeed-hq Bot commented May 7, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 1208 untouched benchmarks


Comparing ct/tq-pull-out (959e1ae) with develop (31840e2)

Open in CodSpeed

@connortsui20 connortsui20 marked this pull request as ready for review May 7, 2026 22:15
@connortsui20 connortsui20 requested a review from gatesn May 7, 2026 22:16
Comment on lines +158 to +177
impl ScalarFnArrayVTable for TQPack {
fn serialize(
&self,
_view: &ScalarFnArrayView<Self>,
_session: &VortexSession,
) -> VortexResult<Option<Vec<u8>>> {
Ok(None)
}

fn deserialize(
&self,
_dtype: &DType,
_len: usize,
_metadata: &[u8],
_children: &dyn ArrayChildren,
_session: &VortexSession,
) -> VortexResult<ScalarFnArrayParts<Self>> {
vortex_bail!("TQPack scalar-fn arrays are not serializable")
}
}
Contributor Author


Do we want to serialize a TQPack array? It doesn't really make a lot of sense (though it does for TQUnpack, since it means we can store the TQ compressed array and get back a Vector array)

@connortsui20
Contributor Author

codex has some findings that I will have to investigate tomorrow:

Details

[P1] Lazy TQPack still cannot be used as the write-time path. initialize() only registers the TQUnpack array plugin, and TQPack’s array serialization returns None / deserialization bails, so writing TQPack::try_new_array(...).into_array() will fail before pack_vector runs. The file test currently writes an already-executed TurboQuant array, so it does not cover the actual lazy write path. See lib.rs (line 81), pack.rs (line 158), and file.rs (line 30).

[P2] vortex_turboquant::initialize() is not self-contained for lazy TQUnpack file reads. TQUnpack deserialization requires the parent dtype to downcast as AnyVector, but initialize() no longer registers vortex_tensor::vector::Vector. The current file tests mask this by separately calling vortex_tensor::initialize(&session). A session that only calls the TurboQuant initializer can register TQUnpack but still fail to deserialize its Vector parent dtype. See lib.rs (line 76) and unpack.rs (line 167).

[P2] The public SORF dimension padding path uses unchecked next_power_of_two(). SorfMatrix::try_new and SorfTransform::return_dtype can panic in debug builds on oversized dimensions instead of returning VortexResult, and validate_sorf_options does not reject the bad dimension before serialized metadata reaches this path. Use checked_next_power_of_two() and reject zero/overflow in validation. See rotation.rs (line 79) and vtable.rs (line 96).
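The third finding can be illustrated with a small standalone sketch (not the actual SorfMatrix code; `padded_dimension` is a hypothetical stand-in for the validation path): `next_power_of_two()` panics in debug builds when the result would overflow, while `checked_next_power_of_two()` returns `None` and lets the caller surface an error instead.

```rust
/// Hypothetical stand-in for the SORF dimension validation suggested in
/// the finding: reject zero and dimensions whose next power of two would
/// overflow, instead of letting `next_power_of_two()` panic in debug builds.
fn padded_dimension(dim: usize) -> Result<usize, String> {
    if dim == 0 {
        return Err("dimension must be non-zero".to_string());
    }
    // `checked_next_power_of_two` returns None on overflow rather than
    // panicking, so we can turn it into a recoverable error.
    dim.checked_next_power_of_two()
        .ok_or_else(|| format!("dimension {dim} overflows when padded to a power of two"))
}

fn main() {
    assert_eq!(padded_dimension(5), Ok(8));
    assert!(padded_dimension(0).is_err());
    assert!(padded_dimension(usize::MAX).is_err());
}
```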

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@joseph-isaacs
Contributor

Please can we think a little about the reviewer: this is a 4k+ line PR with both a large move and a new feature

Contributor


are there any tests?

Contributor Author


There is an entire tests module

Contributor

@joseph-isaacs joseph-isaacs left a comment


please can this be split up

}
}

pub(super) fn serialize_config(config: &TurboQuantConfig) -> Vec<u8> {
Contributor


Seems a bit weird these aren't on the TQScalarFnMetadata impl? Don't mind that much 🤷 - you could also just inline them at the call site since they're so small

Contributor Author


I mean, we only use the struct as an intermediate type; these are not really methods

Comment thread vortex-turboquant/src/tests/malformed.rs Outdated
}

impl ScalarFnVTable for TQUnpack {
type Options = TurboQuantConfig;
Contributor


Do you need this? Isn't this entirely encapsulated by the child's ext dtype metadata?

You should model this as e.g. options the user must provide in their SQL query SELECT tq.unpack(..., <options>)

Contributor Author


We do actually need this in order to (lossily) reconstruct the original vectors:

/// Configuration for lossy TurboQuant packing.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct TurboQuantConfig {
    bit_width: u8,
    seed: u64,
    num_rounds: u8,
}

None of this information can be taken from the dtype
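A minimal sketch of why the config has to travel alongside the array rather than in the dtype: the three fields can be packed into bytes, roughly what a `serialize_config` helper would do. The byte layout below (1 byte bit_width, 8 bytes little-endian seed, 1 byte num_rounds) is illustrative only, not the crate's actual wire format.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct TurboQuantConfig {
    bit_width: u8,
    seed: u64,
    num_rounds: u8,
}

/// Illustrative byte layout: 1 byte bit_width, 8 bytes little-endian seed,
/// 1 byte num_rounds. The real crate's format may differ.
fn serialize_config(config: &TurboQuantConfig) -> Vec<u8> {
    let mut out = Vec::with_capacity(10);
    out.push(config.bit_width);
    out.extend_from_slice(&config.seed.to_le_bytes());
    out.push(config.num_rounds);
    out
}

/// Rebuild the config from the packed bytes, returning None on a bad length.
fn deserialize_config(bytes: &[u8]) -> Option<TurboQuantConfig> {
    if bytes.len() != 10 {
        return None;
    }
    Some(TurboQuantConfig {
        bit_width: bytes[0],
        seed: u64::from_le_bytes(bytes[1..9].try_into().ok()?),
        num_rounds: bytes[9],
    })
}

fn main() {
    let config = TurboQuantConfig { bit_width: 4, seed: 42, num_rounds: 3 };
    let bytes = serialize_config(&config);
    assert_eq!(deserialize_config(&bytes), Some(config));
}
```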

let row_values = &values[i * dimensions..][..dimensions];
let norm = norm_values[i];

if !mask.value(i) || norm == T::zero() {
Contributor


I think you probably want to iterate the mask, rather than call .value(i) on every iteration
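A rough sketch of the suggested change, in plain Rust with a `Vec<bool>` standing in for the real mask type (a bitmask would additionally let us iterate runs of set bits): zip the mask into the row loop once instead of doing an indexed `.value(i)` lookup per row, and skip masked-out or zero-norm rows early.

```rust
/// Normalize each row by its norm, leaving masked-out or zero-norm rows
/// zeroed. The `&[bool]` mask is a stand-in for the real mask type.
fn normalize_rows(values: &[f32], norms: &[f32], mask: &[bool], dimensions: usize) -> Vec<f32> {
    let mut out = vec![0.0; values.len()];
    // Iterate the mask alongside the norms rather than calling an indexed
    // lookup on every iteration, as suggested in the review.
    for (i, (&valid, &norm)) in mask.iter().zip(norms).enumerate() {
        if !valid || norm == 0.0 {
            continue; // leave the row zeroed
        }
        let row = &values[i * dimensions..][..dimensions];
        for (o, &v) in out[i * dimensions..][..dimensions].iter_mut().zip(row) {
            *o = v / norm;
        }
    }
    out
}

fn main() {
    // Row 0 is valid with norm 2; row 1 has a zero norm and stays zeroed.
    let out = normalize_rows(&[2.0, 4.0, 1.0, 3.0], &[2.0, 0.0], &[true, true], 2);
    assert_eq!(out, vec![1.0, 2.0, 0.0, 0.0]);
}
```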

Contributor


In fact it may even be worth a criterion benchmark and a quick skim of samply for any red flags

Contributor Author


Might as well optimize this now while we are here since I can imagine a sparse column of vectors will not be encoded super well

Comment thread vortex-turboquant/src/vector/quantize.rs
Comment thread vortex-turboquant/src/vector/storage.rs
Comment thread vortex-turboquant/src/vector/unpack.rs
Contributor

@robert3005 robert3005 left a comment


Can we use PRs as atomic units instead of commits, please. GitHub isn't made for reviewing individual commits

@connortsui20
Contributor Author

connortsui20 commented May 8, 2026

@joseph-isaacs how do you suggest I split this PR up?

The best alternative here that I can think of is that I split this into a) copy and paste a bunch of code from vortex-tensor, b) clean up, and c) add the scalar functions, which is frankly a waste of time if we know most of the code is going to change.

This is all due to the fact that we decided to change the design quite significantly but the implementation is still mostly the same.

Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
@connortsui20
Contributor Author

I will clean up my git history and we can figure out if it makes sense to split changes out of this or if everything together is required for context for reviewing

