
fix: complete Falcon model support and add Pile dataset#450

Open
Medhatt21 wants to merge 1 commit into ModelTC:main from Medhatt21:fix/falcon-model-and-pile-dataset

Conversation


@Medhatt21 Medhatt21 commented Mar 3, 2026

Summary

  • Falcon model: The existing Falcon class in llmc/models/falcon.py was non-functional due to several issues. This PR fixes them so Falcon models (both old RWForCausalLM and new FalconForCausalLM architectures) work correctly with all quantization algorithms.
  • Pile dataset: Adds pile as a supported calibration and evaluation dataset, loading from mit-han-lab/pile-val-backup. Several quantization papers (SmoothQuant, LLM.int8()) use Pile for calibration, but it was not available as a dataset option.

Falcon Fixes

| Issue | Before | After |
| --- | --- | --- |
| `skip_layer_name()` | Missing (abstract); caused instantiation failure | Returns `['lm_head']` |
| `find_embed_layers()` | `self.model.model.rotary_emb` (wrong path) | `self.model.transformer.rotary_emb` |
| `get_layers_except_blocks()` | Missing `lm_head` | Includes `lm_head` |
| `has_bias()` | Hardcoded `False` | Reads from `model_config.bias` |
| Architecture detection | `block.config.architectures[0]` string comparison (fragile, could raise on unknown arch) | `model_config.new_decoder_architecture` with `getattr` fallback |
| Old-arch layernorms (non-parallel) | Only returned `post_attention_layernorm` | Returns both `input_layernorm` and `post_attention_layernorm` |
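The config-driven checks above can be sketched roughly as follows. This is a hypothetical, simplified rendering, not the actual class in `llmc/models/falcon.py`; only the attribute names `bias` and `new_decoder_architecture` come from the Hugging Face Falcon config.

```python
from types import SimpleNamespace

class FalconModelSketch:
    """Simplified sketch of the fixed accessors; not the real llmc class."""

    def __init__(self, model_config):
        self.model_config = model_config

    def skip_layer_name(self):
        # Previously missing (abstract), which made instantiation fail.
        return ['lm_head']

    def has_bias(self):
        # Read from the model config instead of hardcoding False.
        return getattr(self.model_config, 'bias', False)

    def is_new_arch(self):
        # Old RWForCausalLM configs may lack this attribute entirely, so
        # getattr with a default avoids the error a string comparison on
        # config.architectures[0] could trigger on an unknown architecture.
        return getattr(self.model_config, 'new_decoder_architecture', False)

# Old-arch-style config: the missing attribute falls back to False.
old_cfg = SimpleNamespace(bias=True)
model = FalconModelSketch(old_cfg)
print(model.is_new_arch())  # False
print(model.has_bias())     # True
```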

Pile Dataset

  • base_dataset.py: Added 'pile' to field map and build_calib_dataset() (loads mit-han-lab/pile-val-backup validation split)
  • specified_preproc.py: Added pile_gptq preprocessor (same pattern as wikitext2_gptq)
  • eval_base.py: Added 'pile' to supported dataset list, download logic, and tokenization
  • Also fixed a minor bug in eval_base.py error message: `self.dataset` -> `self.eval_dataset_name`
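A minimal sketch of the GPTQ-style sampling a pile_gptq preprocessor performs. Function and parameter names here are illustrative, not the exact llmc signatures; in a real run the texts would come from `load_dataset('mit-han-lab/pile-val-backup', split='validation')`.

```python
import random

def pile_gptq_preproc(texts, tokenize, n_samples, seq_len, seed=42):
    """Draw n_samples random seq_len-token windows from calibration texts,
    mirroring the wikitext2_gptq pattern. `tokenize` maps text -> token list."""
    rng = random.Random(seed)
    calib = []
    for _ in range(n_samples):
        # Resample documents until one is long enough to hold a full window.
        while True:
            ids = tokenize(texts[rng.randrange(len(texts))])
            if len(ids) > seq_len:
                break
        start = rng.randrange(len(ids) - seq_len)
        calib.append(ids[start:start + seq_len])
    return calib

# Toy usage with a whitespace "tokenizer"; a real run uses the model's tokenizer.
docs = ['the quick brown fox jumps over the lazy dog ' * 8, 'too short']
batch = pile_gptq_preproc(docs, str.split, n_samples=4, seq_len=16)
```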

Test plan

  • Verify tiiuae/falcon-7b (old arch, parallel_attn=True) loads and quantizes with RTN/GPTQ
  • Verify tiiuae/falcon-40b (new arch, new_decoder_architecture=True) loads and quantizes
  • Verify pile calibration dataset loads and preprocesses correctly
  • Verify pile evaluation dataset loads and computes perplexity
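The old-vs-new architecture cases in the test plan come down to which layernorms each block exposes. A hypothetical sketch of that selection (the `ln_attn`/`ln_mlp` and `input_layernorm`/`post_attention_layernorm` names follow the Hugging Face Falcon implementation; the function itself is illustrative):

```python
from types import SimpleNamespace

def block_layernorms(block, new_arch, parallel_attn):
    """Pick the per-block layernorms a quantization algorithm should see."""
    if new_arch:
        # FalconForCausalLM (e.g. falcon-40b) uses ln_attn / ln_mlp.
        return {'ln_attn': block.ln_attn, 'ln_mlp': block.ln_mlp}
    if parallel_attn:
        # Old parallel-attention blocks (e.g. falcon-7b) have one layernorm.
        return {'input_layernorm': block.input_layernorm}
    # Old non-parallel blocks: return BOTH layernorms (the pre-fix code
    # returned only post_attention_layernorm here).
    return {
        'input_layernorm': block.input_layernorm,
        'post_attention_layernorm': block.post_attention_layernorm,
    }

# Stand-in block with attribute names only; real blocks are nn.Modules.
blk = SimpleNamespace(input_layernorm='ln_in', post_attention_layernorm='ln_post',
                      ln_attn='ln_a', ln_mlp='ln_m')
print(sorted(block_layernorms(blk, new_arch=False, parallel_attn=False)))
# ['input_layernorm', 'post_attention_layernorm']
```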

Made with Cursor

Falcon model:
- Add missing skip_layer_name() (was abstract, caused instantiation failure)
- Fix rotary_emb path: model.model.rotary_emb -> model.transformer.rotary_emb
- Add lm_head to get_layers_except_blocks()
- Read has_bias() from model config instead of hardcoding False
- Use model_config.new_decoder_architecture instead of fragile architectures[0] check
- Return both layernorms for old arch non-parallel_attn case
- Use getattr with defaults for safer config attribute access

Pile dataset:
- Add 'pile' as calibration dataset (loads mit-han-lab/pile-val-backup)
- Add pile_gptq preprocessor for GPTQ-style calibration sampling
- Add 'pile' to eval_base supported datasets with data loading and encoding
- Fix eval_base error message: self.dataset -> self.eval_dataset_name

Tested with tiiuae/falcon-7b (old arch) and tiiuae/falcon-40b (new arch).

Made-with: Cursor
