$ whoami --verbose
I'm Muhammad Ahmed Ghani, Machine Learning Lead at ImagineArt (formerly Vyro), where I build the models behind one of the fastest-growing creative AI platforms in the world. I work across Speech, Vision, NLP, and Agentic AI, not as a timeline of past phases but as four threads I'm actively pulling on right now. The interesting problems sit at the seams between them, and that's where I spend most of my time. Research is the starting line, not the trophy. Anyone can make a model produce one good output; making it produce ten million is a different discipline. That's where I live.
role: ML Lead @ ImagineArt (formerly Vyro)
focus: GenAI · CV · Agents
location: Islamabad | Lahore 🇵🇰
education: BS CS, UCP
building: ImagineArt
coffee: ████████░░ 80%
$ cat ./domains.log
Speech: Recognition, separation, and synthesis pipelines. Turning messy audio into structured signal, and back again, at production latency.
Vision: Detection, segmentation, generation, and editing. From frame-perfect video pipelines to real-time inference on tight budgets.
NLP: Language understanding, retrieval, and generation. Grounding LLMs in real data and real constraints, not just vibes.
Agentic AI: Planning, tool-use, memory, and recovery. Systems that do the work, not just describe how they'd do it if asked nicely.
Creative AI platform spanning image and video generation, serving millions of creators worldwide. Led the ML stack behind the 2.0 release — unifying modalities, tightening consistency, and shipping at scale. |
Led the release of seven video-AI models at ImagineArt — pushing frame-to-frame consistency from "demo quality" to something creators can actually build with. |
Voice separation, video object removal pipelines, super-resolution ports, and inference tooling — quietly powering other people's projects on GitHub and Hugging Face. |
Production-grade GenAI deployments in regulated industries — real customers, real compliance, real stakes. The kind of work that teaches you what "production" actually means. |
$ ls -la ./arsenal
the full stack, unabbreviated

| Domain | Focus |
|---|---|
| Speech | ASR · TTS · separation · voice cloning · audio pipelines |
| Vision | detection · segmentation · generation · editing · super-resolution |
| NLP | LLMs · RAG · fine-tuning · evaluation · prompt engineering |
| Agentic AI | planning · tool-use · multi-agent orchestration · memory · MCP |
| Languages | Python · C · C++ · JavaScript · Bash |
| Frameworks | PyTorch · TensorFlow · 🤗 Transformers · FastAPI · Gradio |
| Infra | ONNX · Docker · NVIDIA stack · AWS · GCP · HF Inference Endpoints |
| Data | MongoDB · MySQL · SQLite · Pandas · NumPy |
$ cat ./philosophy.md
Models are the easy part. Systems are the job.
$ ./connect.sh
"Any sufficiently advanced neglect of monitoring is indistinguishable from magic, until it isn't."





