Projects
RAG Document Intelligence Pipeline
Production-style retrieval-augmented generation for document Q&A. Splits PDFs into chunks, embeds the chunks with OpenAI, stores the vectors in FAISS, and serves answers through a FastAPI endpoint with citation tracking and query reranking.
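As a rough illustration of the chunking and citation-tracking steps, here is a minimal sketch in plain Python. It is not the project's actual code: the `Chunk` dataclass, the `chunk_page` and `cite` helpers, and the character-based `size` and `overlap` defaults are all hypothetical, and the embedding, FAISS, and FastAPI layers are left out.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A slice of page text plus the metadata needed to cite it later."""
    doc_id: str
    page: int
    start: int  # character offset of the chunk within the page text
    text: str


def chunk_page(doc_id, page, text, size=200, overlap=50):
    """Split page text into fixed-size character windows that overlap.

    The size and overlap values are illustrative assumptions, not tuned settings.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, max(len(text), 1), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(Chunk(doc_id, page, start, piece))
        if start + size >= len(text):
            break  # this window already reaches the end of the page
    return chunks


def cite(chunk):
    """Format a human-readable citation for an answer built from this chunk."""
    return f"{chunk.doc_id}, p. {chunk.page}"
```

Keeping the document ID and page number on every chunk means that whichever chunks the retriever returns, the answer can point back to where the text came from.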
LLM Fine-Tuning Pipeline with QLoRA
Fine-tuned Llama 2 on domain-specific instruction data using QLoRA and PEFT. Reduced GPU memory footprint by 60% while matching full fine-tune performance. Experiments tracked with MLflow; model deployed on SageMaker.
ML Serving API with MLflow & Docker
End-to-end ML serving system with experiment tracking, model registry, and versioned REST endpoints. Containerized with Docker, orchestrated on Kubernetes, and retrained automatically when data drift is detected.
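One common way to decide when drift should trigger retraining is the Population Stability Index (PSI). The description above does not say which drift metric the project uses, so the sketch below is an assumption throughout: `psi`, `should_retrain`, the 10-bin histogram, and the 0.2 threshold (a widely quoted rule of thumb) are illustrative choices only.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and a live sample.

    Bin edges come from the reference (training-time) sample; live values that
    fall outside that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each fraction at a small epsilon so the logarithm stays finite.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(expected, actual, threshold=0.2):
    """Flag retraining when PSI exceeds the (assumed) threshold."""
    return psi(expected, actual) > threshold
```

In a setup like this, a scheduled job could compute the index per feature on recent request data and start a retraining run whenever `should_retrain` returns true.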
NLP Entity Extraction API
BERT-based NER and sentiment analysis service deployed with FastAPI and Docker. Handles multi-label classification, entity extraction, and aspect-based sentiment across customer feedback data.