Generative AI, LLM, RAG Systems
- System Reliability & Performance: Led a major initiative to stabilize and accelerate uGPT imports, achieving 3x faster processing and 2x load capacity. Reduced retrieval latency by up to 75% for small knowledge bases (2 KBs) and by up to 10x for larger ones (10-30 KBs) through optimizations including removal of cosine similarity operations, query optimization, and caching strategies. Responded to critical incidents and eliminated recurring import failures.
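The caching strategy mentioned above can be sketched as a simple in-process memoization layer; the `embed` stand-in and cache size below are illustrative, not the production code:

```python
from functools import lru_cache

def embed(query: str) -> tuple[float, ...]:
    # Stand-in for an expensive embedding / retrieval call.
    return tuple(float(ord(c)) for c in query)

@lru_cache(maxsize=1024)
def cached_embed(query: str) -> tuple[float, ...]:
    # Repeated queries are served from the in-process cache
    # instead of recomputing the embedding each time.
    return embed(query)
```

In practice the cache key and eviction policy would depend on the knowledge-base update frequency; `lru_cache` only shows the shape of the optimization.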
- Code Quality & Infrastructure: Introduced comprehensive code quality initiatives, including pre-commit hooks, PyTest, MyPy type checking, and linting/formatting checks in CI (adopted across teams). Optimized CI/CD workflows, reducing Docker image size from 4.1 GB to 370 MB and build times from 20-25 minutes to 3-6 minutes through migration to the uv package manager and workflow consolidation.
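A pre-commit setup of the kind described usually lives in a `.pre-commit-config.yaml`; the sketch below assumes Ruff for linting/formatting (the bullet does not name the linter) and the pinned revisions are illustrative:

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4  # illustrative pin
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0  # illustrative pin
    hooks:
      - id: mypy
```

Running the same hooks in CI (e.g. `pre-commit run --all-files`) keeps local and pipeline checks consistent.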
- Feature Development & Model Rollouts: Fixed critical chunking bugs, increasing the bot's understood rate and reducing "not understood" responses. Supported the custom instructions implementation and coordinated GPT-4o A/B testing with prompt migration. Led the Text-Embedding-3-Large rollout, resulting in a higher understood rate and a lower escalation rate. Added batching for embedding computation, achieving 2x faster imports.
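The embedding-batching change can be sketched as below; `embed_batch` is a hypothetical stand-in for the real embedding client, and the default batch size is illustrative:

```python
def batched(items, batch_size):
    """Yield successive fixed-size slices of a list."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_all(texts, embed_batch, batch_size=64):
    """Embed texts in batches: one round trip per batch
    instead of one request per text."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```

Fewer round trips to the embedding service is the main lever here; the speedup depends on per-request overhead versus batch size limits.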
- Monitoring & Observability: Built comprehensive monitoring infrastructure using Datadog, Sentry, and Grafana with dashboards, alerts, and Prometheus metrics for imports and latency. Refined Sentry alert rules to eliminate false positives and accelerate triage. Established monitoring as reference implementation for other teams.
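The latency instrumentation pattern can be sketched with a decorator; the in-memory `METRICS` dict below is a stand-in for a real Prometheus histogram, and the metric name and `run_import` function are hypothetical:

```python
import time
from collections import defaultdict

# Stand-in for a Prometheus histogram; in production this would be
# a prometheus_client Histogram scraped by the metrics backend.
METRICS = defaultdict(list)

def track_latency(metric_name):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                METRICS[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorator

@track_latency("import_duration_seconds")
def run_import(doc_count):
    # Placeholder for the actual import pipeline.
    return doc_count
```

Recording duration in a `finally` block ensures failed imports are measured too, which matters for alerting on latency regressions.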
- Infrastructure & Scalability: Drove ZOS migration (85% complete) and contributed to internal libraries (language-utils, db-utils, service-utils). Identified major scalability risks and designed mitigation strategy for adding OpenSearch cluster for new customers. Prevented excessive sharding through automated cleanup of orphaned indexes, reducing costs and improving cluster health. Removed secondary chunks and embeddings, reducing OpenSearch storage by 38% and improving import speed by 26%.
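The orphaned-index cleanup can be sketched as a set-difference check; the `kb-` naming convention and function name below are assumptions for illustration, not the production scheme:

```python
def find_orphaned_indexes(index_names, live_kb_ids, prefix="kb-"):
    """Return indexes whose knowledge-base ID no longer exists.

    index_names: names currently present in the OpenSearch cluster.
    live_kb_ids: IDs of knowledge bases still registered upstream.
    """
    orphans = []
    for name in index_names:
        if name.startswith(prefix) and name[len(prefix):] not in live_kb_ids:
            orphans.append(name)
    return orphans
```

A scheduled job that deletes the returned names keeps shard count and storage from growing with stale data.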
- Documentation & Knowledge Sharing: Authored comprehensive end-to-end documentation on indexing/chunking, staged releases, A/B testing setup, and import debugging guides. Updated AI/ML onboarding documentation. Delivered multiple knowledge transfer sessions to different teams on import processes, A/B testing, and other topics.
- Collaboration & Leadership: Resolved a high volume of support tickets, ensuring smooth operations. Evaluated 20+ coding assignments and led or assisted in 10+ technical interviews for hiring across internal teams. Worked closely with cross-functional teams, including research scientists, product managers, and engineers, to deliver production-ready AI solutions.