[Bug Report] Fix dense component initialization for Pythia hook shapes #1326
Open
Labels
TransformerBridge (Bug specific to the new TransformerBridge system)
bug (Something isn't working)
complexity-moderate (Moderately complicated issues for people who have intermediate experience with the code)
good first issue (Good for newcomers)
help wanted (Extra attention is needed)
Summary
test_transformer_bridge_hook_shapes is parameterized over multiple models, but unconditionally skips Pythia at the top of the test body due to a known initialization issue with the dense component.
Affected test
tests/integration/test_hook_shape_compatibility.py:127 has inline pytest.skip("Pythia architecture needs dense component initialization fix")
What the test verifies
That all hooks fire with the expected tensor shapes (matching HookedTransformer's contract) on bridge models.
Acceptance criteria
Where to start
JointQKVPositionEmbeddingsAttentionBridge (rotary + joint QKV; same family as GPT-NeoX). Look at transformer_lens/model_bridge/supported_architectures/pythia.py and the dense o component construction in the adapter's _get_blocks().
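For anyone picking this up, the check described under "What the test verifies" can be sketched as a plain comparison of two hook-name-to-shape mappings. This is only an illustration, not the code in test_hook_shape_compatibility.py: the helper find_shape_mismatches, the hook names, and the shapes below are all hypothetical, and a real test would collect the shapes from the two models rather than hard-code them.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    expected: Dict[str, Shape], actual: Dict[str, Shape]
) -> List[str]:
    """Return one message per hook that is missing or has the wrong shape.

    `expected` stands in for shapes recorded from the reference model and
    `actual` for shapes recorded from the bridge model (both hypothetical).
    """
    problems: List[str] = []
    for name, shape in expected.items():
        if name not in actual:
            # The hook never fired on the bridge model.
            problems.append(f"{name}: hook did not fire")
        elif actual[name] != shape:
            problems.append(f"{name}: expected {shape}, got {actual[name]}")
    return problems


# Illustrative values only: (batch, pos, n_heads, d_head) and (batch, pos, d_model).
expected_shapes = {
    "blocks.0.attn.hook_z": (1, 4, 8, 64),
    "blocks.0.hook_attn_out": (1, 4, 512),
}
actual_shapes = {
    "blocks.0.attn.hook_z": (1, 4, 512),  # flattened heads: a shape mismatch
}

mismatches = find_shape_mismatches(expected_shapes, actual_shapes)
for line in mismatches:
    print(line)
```

An empty list from a helper like this would mean every expected hook fired with the expected shape, which is the condition under which the inline skip for Pythia could be dropped.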