AI Engineer | LLM Systems | MLOps | Cloud-Native Architectures
AI Engineer focused on building scalable, production-grade ML systems and LLM-powered applications.
Strong experience across the full model lifecycle, from training and optimization to deployment as cloud-native microservices.
I enjoy solving real-world problems using AI systems that are reliable, measurable, and efficient.
Aarogya is an AI-driven healthcare assistant that helps users understand prescriptions and medical reports, discover generic medicine alternatives, and ask follow-up health questions — directly via WhatsApp.
- WhatsApp Business API (WABA): Scalable conversational healthcare workflows
- LLMs: Simplified prescription & medical report explanations
- Vision + OCR: Processes scanned prescriptions and lab reports using image-based LLM pipelines
- Multilingual Translation: Real-time language adaptation for regional accessibility
- Parallel Processing: Concurrent pipelines for image parsing, report analysis, and contextual Q&A
- Context Retention: Maintains session memory so users can ask open-ended follow-up questions about each report
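The concurrent-pipeline and session-memory pattern described above can be sketched with Python's asyncio. This is a minimal illustration, not Aarogya's actual implementation: the stage functions (`parse_image`, `analyze_report`) and the in-memory `sessions` store are hypothetical placeholders for the real OCR/LLM stages and persistence layer.

```python
import asyncio

# In-memory session store keyed by user ID: a stand-in for whatever
# persistence the real system uses for follow-up Q&A context.
sessions: dict[str, list[str]] = {}

async def parse_image(image: bytes) -> str:
    # Placeholder for the OCR / vision-LLM stage.
    await asyncio.sleep(0)  # simulate I/O-bound work
    return "extracted prescription text"

async def analyze_report(text: str) -> str:
    # Placeholder for the report-analysis LLM stage.
    await asyncio.sleep(0)
    return f"summary of: {text}"

async def handle_upload(user_id: str, image: bytes) -> str:
    # OCR must precede analysis for one upload, but uploads from
    # different users run concurrently because every stage is awaitable.
    text = await parse_image(image)
    summary = await analyze_report(text)
    # Retain context so later questions can reference this report.
    sessions.setdefault(user_id, []).append(summary)
    return summary

async def main() -> None:
    # Two users' uploads processed concurrently.
    results = await asyncio.gather(
        handle_upload("user-a", b"img1"),
        handle_upload("user-b", b"img2"),
    )
    print(results)

asyncio.run(main())
```

The same shape extends naturally to a third concurrent stage for contextual Q&A reading from `sessions`.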
- Reduced confusion around medical prescriptions and lab reports
- Improved affordability by surfacing generic medicine alternatives
- Delivered healthcare insights in minutes without requiring app downloads
Try Aarogya on WhatsApp:
https://wa.me/message/OMBNFBED6JCFJ1
concall-parser: a financial research utility that extracts management commentary and analyst Q&A from earnings call transcripts.
PyPI: https://pypi.org/project/concall-parser
Go · Python · SQL · JavaScript
PyTorch · TensorFlow · scikit-learn · Hugging Face Transformers
LangChain · LangGraph · LlamaIndex
RAG Systems · Vector Search · Prompt Engineering · Model Quantization (ONNX, INT8)
PostgreSQL · MongoDB · Pinecone · Qdrant
AWS · GCP · Docker · Kubernetes · CI/CD Pipelines · GitHub Actions
MLflow · DVC · Airflow · Model Evaluation Pipelines
FastAPI · Flask · REST Microservices · Async Processing
- Kaggle Bronze Medalist (Top 13%) – LEAP Atmospheric Physics Competition
- Built quantized models achieving sub-millisecond inference latency
- Designed high-precision retrieval systems (RAG + semantic caching)
- Developed production LLM agents for Text2SQL and document retrieval
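The Text2SQL pattern mentioned above reduces to a small loop: generate SQL from a natural-language question, guard it, then execute it. The sketch below is illustrative only; `generate_sql` is a hard-coded stub standing in for a real LLM call, and the table and guardrail are assumptions, not the production agent.

```python
import sqlite3

def generate_sql(question: str, schema: str) -> str:
    # Stub for the LLM call. A real agent would prompt a model with the
    # schema and the question; hard-coded here so the sketch runs.
    return "SELECT name FROM employees WHERE salary > 50000"

def run_text2sql(question: str, conn: sqlite3.Connection) -> list:
    # Collect table definitions to ground the model in the schema.
    schema = "\n".join(
        row[0]
        for row in conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table'"
        )
    )
    sql = generate_sql(question, schema)
    # Guardrail: only read-only statements reach the database.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"refusing non-SELECT statement: {sql}")
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 90000), ("Bob", 40000)],
)
print(run_text2sql("Who earns more than 50k?", conn))  # [('Ada',)]
```

Keeping generation and execution separated this way makes the guardrail and the model prompt independently testable.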
LinkedIn: https://www.linkedin.com/in/jay-shah
Email: jayshah0726@gmail.com

