Projects
RAG Document Intelligence Pipeline
Production-style retrieval-augmented generation (RAG) system for document Q&A. Chunks PDFs, embeds the chunks with OpenAI embeddings, stores the vectors in a FAISS index, and serves answers through a FastAPI endpoint with citation tracking and query reranking.
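The chunk-and-retrieve core of a pipeline like this can be sketched in plain Python. This is a minimal illustration, not the project's code: `chunk_text`, `cosine`, and `top_k` are hypothetical helpers, the character-based window sizes are assumed, and a brute-force cosine search stands in for the FAISS index so the example runs without external services. Returning chunk indices rather than text is what lets an answer cite the passages it came from.

```python
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into character windows of `size` that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, chunk_vecs, k=3):
    """Return the indices of the k chunks most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]
```

In a real deployment the vectors would come from an embedding model and the search from a vector index; the brute-force loop here only shows the ranking logic.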
LLM Fine-Tuning Pipeline with QLoRA
Fine-tuned Llama 2 on domain-specific instruction data using QLoRA and PEFT. Reduced GPU memory footprint by 60% while matching full fine-tuning performance. Tracked experiments with MLflow and deployed via SageMaker.
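Part of why adapter-based fine-tuning is cheap is simple arithmetic: a rank-r LoRA adapter on a weight matrix with `d_in` inputs and `d_out` outputs trains r × (d_in + d_out) parameters instead of d_in × d_out. The sketch below works that out; the function names, the 4096 × 4096 layer shape, and the rank of 16 are illustrative assumptions, not the project's actual configuration, and the calculation does not by itself reproduce the 60% memory figure.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters in one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Adapter parameters as a fraction of the frozen weight matrix they modify."""
    return lora_params(d_in, d_out, rank) / (d_in * d_out)


# Illustrative values only (assumed layer shape and rank).
adapter = lora_params(4096, 4096, 16)            # 131072
fraction = trainable_fraction(4096, 4096, 16)    # 0.0078125, i.e. under 1%
```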
ML Serving API with MLflow & Docker
End-to-end ML serving system with experiment tracking, model registry, and versioned REST API endpoints. Containerized with Docker, orchestrated on Kubernetes, with automated retraining triggers on data drift.
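A drift-based retraining trigger can be as small as one statistic compared against a threshold. The sketch below uses the Population Stability Index (PSI) as one common choice; the source does not say which drift metric the project uses, so `psi`, `should_retrain`, the 10-bin histogram, and the 0.2 threshold (a widely quoted rule of thumb) are all assumptions for illustration. Both inputs are expected to be non-empty lists of numbers.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(expected, actual, threshold=0.2):
    """Signal retraining when drift on a feature exceeds the threshold."""
    return psi(expected, actual) > threshold
```

A scheduler or pipeline step could call `should_retrain` per feature on a recent window of production data and start a training run when any feature trips it.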
NLP Entity Extraction API
BERT-based named entity recognition and sentiment analysis service deployed with FastAPI and Docker. Handles multi-label text classification, entity extraction, and aspect-based sentiment analysis across customer feedback data.
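Multi-label classification differs from single-label in its last step: each label gets an independent sigmoid score and a threshold, so one piece of feedback can carry several labels or none. A minimal sketch of that decoding step follows, assuming the model has already produced one logit per label; `predict_labels`, the label names, and the 0.5 threshold are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, labels, threshold=0.5):
    """Keep every label whose independent sigmoid score meets the threshold."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Hypothetical labels and logits for one customer comment.
tags = predict_labels([2.0, -1.0, 0.3], ["billing", "shipping", "support"])
```

Here `tags` contains both "billing" and "support", which a softmax-based single-label head could not return together.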