Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion vortex-btrblocks/src/compressor/float/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -433,13 +433,14 @@ impl Scheme for DictScheme {
let has_all_values_referenced = dict.has_all_values_referenced();
let DictArrayParts { codes, values, .. } = dict.into_parts();

// TODO(connor): This should probably be extending the `excludes` list.
let compressed_codes = compressor.compress_canonical(
Canonical::Primitive(codes.to_primitive()),
ctx.descend(),
Excludes::int_only(&[IntCode::Dict, IntCode::Sequence]),
)?;

assert!(values.is_canonical());
// TODO(connor): This should probably be extending the `excludes` list.
let compressed_values = compressor.compress_canonical(
Canonical::Primitive(values.to_primitive()),
ctx.descend(),
Expand Down
25 changes: 18 additions & 7 deletions vortex-btrblocks/src/compressor/integer/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -703,21 +703,32 @@ impl Scheme for DictScheme {

let dict = dictionary_encode(stats);

// Cascade the codes child
// Don't allow SequenceArray as the codes child as it merely adds extra indirection without actually compressing data.
let mut new_excludes = vec![IntCode::Dict, IntCode::Sequence];
new_excludes.extend_from_slice(excludes);
// Cascade the codes child.
// Don't allow SequenceArray as the codes child as it merely adds extra indirection without
// actually compressing data.
let mut codes_excludes = vec![IntCode::Dict, IntCode::Sequence];
codes_excludes.extend_from_slice(excludes);
Comment on lines +708 to +710
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what goes wrong with sequence array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we want to have duplicates in dictionary codes by definition which means the codes can never be a sequence array

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, is that documented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment right above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Im working on something that will make this kind of reasoning a lot more obvious, if you are interested: #7018


let compressed_codes = compressor.compress_canonical(
Canonical::Primitive(dict.codes().to_primitive().narrow()?),
ctx.descend(),
Excludes::int_only(&new_excludes),
Excludes::int_only(&codes_excludes),
)?;

// Cascade the values child.
let mut values_excludes = vec![IntCode::Dict];
values_excludes.extend_from_slice(excludes);

let compressed_values = compressor.compress_canonical(
Canonical::Primitive(dict.values().to_primitive()),
ctx.descend(),
Excludes::int_only(&values_excludes),
)?;

// SAFETY: compressing codes does not change their values
// SAFETY: Compressing the arrays does not change their logical values.
unsafe {
Ok(
DictArray::new_unchecked(compressed_codes, dict.values().clone())
DictArray::new_unchecked(compressed_codes, compressed_values)
.set_all_values_referenced(dict.has_all_values_referenced())
.into_array(),
)
Expand Down
Loading