[NeurIPS 2025] Flow x RL. "ReinFlow: Fine-tuning Flow Policy with Online Reinforcement Learning". Supports VLAs, e.g., Pi0, Pi0.5, GR00TN1.5. Fully open-sourced.
Updated Mar 21, 2026 - Python
Building a simple VLM: LlaMA-SmolLM2 implemented from scratch, paired with the SigLip2 vision model. KV caching is supported and also implemented from scratch.
AI-powered jewelry design studio using fine-tuned Stable Diffusion XL + ControlNet. Generate photorealistic rings, necklaces, earrings & bracelets from text prompts with a Streamlit interface.
A Python script to analyze images generated using a LoRA (Low-Rank Adaptation) model applied at various strength levels. This tool helps determine an optimal strength for a given LoRA by evaluating image quality and similarity to control images.
Fine-tuned the 3B-parameter PaliGemma2 vision model on Valorant object detection, improving IoU scores across all classes. Developed for research experimentation.
Building models from scratch and tuning pre-trained models to recognise different house cats
Multimodal Medical AI Fine-Tuned on Qwen-2.5-VL-7B with LoRA + Medical Distillation
Fine-tuning LiquidAI/LFM2-VL-1.6B in Colab (LoRA/4-bit) + dataset template + probe test.
Fine-tuning DINO object detection model on a COCO-annotated pedestrian dataset from IIT Delhi. Includes data prep, training, evaluation, and visualization scripts.
PyTorch-native fine-tuning of Gemma 3, Google's multimodal instruction-tuned model.
This repository contains a multi-tag (also known as multi-task or multi-output) image classifier trained on the Fashion Product Images dataset from Kaggle using EfficientNetB0, with high accuracy.
A YOLO11 model fine-tuned for up to 100 epochs. The yolo11s model is fine-tuned on a custom dataset for the downstream task of traffic signal detection in both images and videos. The model has also been exported to the ONNX format; you may export it to whichever serialization format you prefer.
A toolkit for training and fine-tuning diffusion model LoRAs.