
🌌 Lumenspark2: Next-Gen Lightweight Transformer

Lumenspark2 is the second-generation implementation of Lumenspark, a lightweight transformer for efficient large-scale language modeling. It integrates modern architectural components, Hugging Face’s training ecosystem, and parameter-efficient fine-tuning methods to provide a flexible, research-friendly framework.


🚀 Features

  • 🔥 Modern Transformer Architecture

    • Rotary Position Embeddings (RoPE)
    • RMSNorm normalization
    • SwiGLU feed-forward networks
    • Efficient scaled-dot-product attention (SDPA)
  • ⚙️ Training Framework

    • Hugging Face Trainer integration
    • Streaming dataset support (FineWeb-Edu)
    • Gradient accumulation & mixed precision (bf16)
    • Custom callback for loss plots and inline text generation
  • 🧩 Extensible & Modular

    • LoRA adapters for efficient fine-tuning (see the sketch after this list)
    • Dynamic sequence chunking collator
    • Configurable via LumensparkConfig
  • 📊 Evaluation & Monitoring

    • Live loss plotting (training_loss_plot.png)
    • Text generation evaluation during training
    • Parameter counting utility
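
As a quick sketch of the parameter-efficient fine-tuning path, LoRA adapters are controlled through the adapter_rank field of LumensparkConfig (listed under Model Architecture below). The snippet assumes that any rank above zero attaches the adapters; the rank of 8 is illustrative, not a project default.

from lumenspark_model import LumensparkModel, LumensparkConfig
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Assumption: adapter_rank > 0 enables LoRA adapters; 8 is an illustrative rank.
config = LumensparkConfig(adapter_rank=8)
model = LumensparkModel(config, tokenizer=tokenizer)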

📦 Installation

Clone and install dependencies:

git clone https://github.com/anto18671/lumenspark2.git
cd lumenspark2
pip install -r requirements.txt

Dependencies:

  • torch
  • transformers
  • datasets
  • safetensors
  • huggingface_hub
  • matplotlib

🏗️ Model Architecture

  • Config Parameters (LumensparkConfig), as sketched after this list:

    • seq_length: 1536
    • d_model: 1024
    • n_layers: 12
    • n_heads: 16
    • ffn_mult: 4.0
    • dropout: 0.1
    • rope_theta: 10,000
    • adapter_rank: 0 (LoRA disabled by default)
  • Core Components:

    • Token embeddings (tied with LM head)
    • Transformer blocks with RMSNorm + RoPE + SDPA
    • SwiGLU feed-forward networks
    • Causal LM head
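
A minimal sketch of instantiating the configuration with the defaults above, assuming the keyword arguments match the field names as listed:

from lumenspark_model import LumensparkConfig

config = LumensparkConfig(
    seq_length=1536,   # maximum sequence length
    d_model=1024,      # hidden size
    n_layers=12,       # number of transformer blocks
    n_heads=16,        # attention heads
    ffn_mult=4.0,      # SwiGLU expansion factor
    dropout=0.1,
    rope_theta=10000,  # RoPE base frequency
    adapter_rank=0,    # 0 keeps LoRA adapters disabled
)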

📝 Training

Run training with:

python train.py

Default hyperparameters (see the sketch after this list):

  • Batch size: 8
  • Gradient accumulation: 20
  • Learning rate: 1e-4
  • Weight decay: 1e-2
  • Dataset: FineWeb-Edu (streaming)
  • Steps: 10,000 (via MAX_STEPS)
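
The exact setup lives in train.py. As a rough sketch, the defaults above correspond to a Hugging Face TrainingArguments configuration like the one below, with FineWeb-Edu loaded in streaming mode; the dataset path and output directory are assumptions, not values copied from train.py.

from datasets import load_dataset
from transformers import TrainingArguments

# Assumed Hub path for FineWeb-Edu; streaming avoids downloading the full corpus.
dataset = load_dataset("HuggingFaceFW/fineweb-edu", split="train", streaming=True)

# Illustrative mirror of the default hyperparameters listed above.
args = TrainingArguments(
    output_dir="checkpoints",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=20,
    learning_rate=1e-4,
    weight_decay=1e-2,
    max_steps=10_000,
    bf16=True,
)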

Training outputs:

  • Loss curves saved to training_loss_plot.png
  • Generated samples printed at evaluation intervals

🔍 Generation

Lumenspark2 has a built-in .generate() method supporting top-k, top-p, temperature, and repetition penalty.

from lumenspark_model import LumensparkModel, LumensparkConfig
from transformers import GPT2Tokenizer

# Load the GPT-2 tokenizer and reuse the EOS token for padding.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Build the model from the default config.
config = LumensparkConfig()
model = LumensparkModel(config, tokenizer=tokenizer)

# Sample a continuation with nucleus sampling.
prompt = "The year is 2050, and humans have colonized Mars."
print(model.generate(prompt, max_length=64, top_p=0.9, temperature=0.7))
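
The sampling controls can be combined; for example, a lower temperature with top-k filtering and a repetition penalty yields more conservative output. The keyword names top_k and repetition_penalty are assumed to match the features listed above (only top_p and temperature appear in the example).

# Assumed keyword names for the remaining sampling controls.
print(model.generate(prompt, max_length=64, top_k=50, temperature=0.3, repetition_penalty=1.2))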

📊 Parameter Counting

from utils import count_parameters
count_parameters(model)

Outputs total, trainable, and non-trainable parameter counts.


📂 Project Structure

lumenspark2/
├── train.py              # Training loop with Hugging Face Trainer
├── lumenspark_model.py   # Transformer architecture, config, generate()
├── utils.py              # Helper functions (collator, parameter counting)
├── requirements.txt      # Dependencies
├── README.md             # Documentation
└── LICENSE               # MIT License

📜 License

MIT License – see LICENSE.


🙌 Acknowledgments

  • Hugging Face transformers & datasets
  • FineWeb-Edu dataset
  • OpenAI GPT-2 tokenizer
