Skip to content

Add endian keyword argument to type reads#144

Open
Schamper wants to merge 1 commit into
add-copyfrom
add-endian-kwarg
Open

Add endian keyword argument to type reads#144
Schamper wants to merge 1 commit into
add-copyfrom
add-endian-kwarg

Conversation

@Schamper

Copy link
Copy Markdown
Member

Partially solves #143, but I'd still like the context manager approach to be in there too. But that depends on #140.

Adds the ability to override the endianness on single type reads. So for example:

In [1]: from dissect.cstruct import cstruct

In [2]: cs = cstruct(endian="<")

In [3]: hex(cs.uint32(b"\x01\x02\x03\x04"))
Out[3]: '0x4030201'

In [4]: hex(cs.uint32(b"\x01\x02\x03\x04", endian=">"))
Out[4]: '0x1020304'

This should work on any type, including structs, unions and pointers.

Because the function signature for type reading and writing changes because of this, I also added a check when adding a custom type to ensure it's signature is correct. If it's not, it will either throw a (descriptive) error or issue a warning. I also intentionally added a **kwargs to the function signature to allow for future expansion without breaking backwards compatibility again.

I'm not 100% if the behavior of pointers is as expected. I implemented it so that:

  • If you don't override endianness, it'll take it from the cstruct instance (like everything else)
  • If you override endianness, it'll be used for both the parsing of the pointer value and the dereferencing
  • You can additionally override on .dereference(endian=)
  • As with everything else, calling .dumps() will fall back to the cstruct instance endianness, unless overridden

@codecov

codecov Bot commented Feb 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 222 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (cc3738b) to head (fd1fe6d).

Files with missing lines Patch % Lines
dissect/cstruct/types/base.py 0.00% 40 Missing ⚠️
dissect/cstruct/types/pointer.py 0.00% 40 Missing ⚠️
dissect/cstruct/cstruct.py 0.00% 33 Missing ⚠️
dissect/cstruct/types/structure.py 0.00% 24 Missing ⚠️
dissect/cstruct/utils.py 0.00% 17 Missing ⚠️
dissect/cstruct/types/wchar.py 0.00% 14 Missing ⚠️
dissect/cstruct/types/enum.py 0.00% 12 Missing ⚠️
dissect/cstruct/types/packed.py 0.00% 10 Missing ⚠️
dissect/cstruct/types/char.py 0.00% 8 Missing ⚠️
dissect/cstruct/types/int.py 0.00% 7 Missing ⚠️
... and 4 more
Additional details and impacted files
@@           Coverage Diff            @@
##           add-copy    #144   +/-   ##
========================================
  Coverage      0.00%   0.00%           
========================================
  Files            21      21           
  Lines          2500    2551   +51     
========================================
- Misses         2500    2551   +51     
Flag Coverage Δ
unittests 0.00% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Schamper Schamper force-pushed the add-endian-kwarg branch 3 times, most recently from 15ab019 to 8a6b8b6 Compare February 13, 2026 14:11
@codspeed-hq

codspeed-hq Bot commented Jun 10, 2026

Copy link
Copy Markdown

Merging this PR will degrade performance by 11.41%

❌ 2 regressed benchmarks
✅ 10 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Benchmark BASE HEAD Efficiency
test_benchmark_union[compiled] 125.9 µs 142.8 µs -11.8%
test_benchmark_union[interpreted] 120 µs 134.9 µs -11.02%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing add-endian-kwarg (fd1fe6d) with add-copy (cc3738b)

Open in CodSpeed

@Schamper Schamper force-pushed the add-endian-kwarg branch 2 times, most recently from c8419f6 to df8bbb9 Compare June 11, 2026 11:07
@Schamper Schamper changed the base branch from main to add-copy June 11, 2026 11:07
@Schamper

Copy link
Copy Markdown
Member Author

I rebased on #140 and added the context manager.

@yunzheng yunzheng left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some docstring remarks, and a question about **kwargs.

Comment thread dissect/cstruct/types/base.py Outdated

def _read(cls, stream: BinaryIO, context: dict[str, Any] | None = None) -> Self: # type: ignore
def _read(
cls, stream: BinaryIO, *, context: dict[str, Any] | None = None, endian: Endianness | None = None, **kwargs

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vaguely remember our discussion about adding **kwargs for future proofing the API (or was it backwards compatibility?).

But I kinda forgot what the use case was. I'm also not sure if this is the best solution anymore, because using **kwargs can introduce mistyped keyword arguments, eg: if someone used endain='little' it would go unnoticed.

Please help me remember :)

@Schamper Schamper Jun 11, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was for future proofing, in case we want to add additional keyword arguments in the future, it would break custom type implementors. I do agree with your point, and on the one hand we've only added one new keyword argument in 8 years time. On the other, if we do want to add more, it'd probably have to be a major version bump each time. Unless you can think of another way.

I looked a little closer, but I wanted to keep my original message too. The kwargs is only present in the _read methods, which will only ever be used/implemented by (custom) type implementations. The "end user" of cstruct will always use the non-sunder-methods read(s), write, dumps, uint32(..., endian=). In fact, for the "call the type directly" method, this is already a risk. But due to how cstruct works, there's not really anything we can change in that. For all other "public" methods though, this is not a risk, because they do not use kwargs.

So, in short, the only time someone would be dealing with these kwargs is if they're implementing their own custom type. In which case, this would correctly act as "future proofing" it against any new keyword arguments.

Comment thread dissect/cstruct/types/base.py Outdated
Comment thread dissect/cstruct/types/base.py Outdated
Comment thread dissect/cstruct/types/base.py Outdated
Comment thread dissect/cstruct/types/base.py Outdated
Comment thread dissect/cstruct/types/base.py Outdated
Comment thread dissect/cstruct/types/base.py Outdated
@yunzheng

yunzheng commented Jun 11, 2026

Copy link
Copy Markdown
Member

Some docstring remarks, and a question about **kwargs.

Oh oops, you added it in the PR message:

I also intentionally added a **kwargs to the function signature to allow for future expansion without breaking backwards compatibility again.

anyway, my comment about the risks of using **kwargs still stand. In which case would it break backwards compatibility?

@Schamper

Copy link
Copy Markdown
Member Author

Improved the docstrings and replied to your comment 😄.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants