Structural Pruning for LLaMA
Updated May 20, 2023 - Python
Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight)
Code for the paper "Accelerating Federated Learning for IoT in Big Data Analytics with Pruning, Quantization and Selective Updating"
A PyTorch implementation of structural pruning applied to neural networks during training
Multimodal-MoE-Slimming is a post-training compression framework for multimodal Mixture-of-Experts models. It analyzes how experts respond differently to visual and textual information, then uses this modality-aware behavior to reduce redundant expert capacity and build compact MoE checkpoints for more efficient deployment.
An independent reproduction of DepGraph (CVPR 2023) for ResNet-18 structural pruning. (Compression: 73.26% MACs, Accuracy: 91.69%)
Official ICML 2026 Spotlight implementation for structural MoE compression, including attribution-guided channel scoring, coverage-maximized pruning, compact checkpoint construction, and fine-tuning support.