Henil Gajjar

AppSageAI 2.0

Privacy-First AI Resume Analyzer

GitHub

App Link

Technologies Used: RAG, LangChain, Gemini APIs, Explainable AI, Guardrails, CI/CD

Integrated Gemini 2.5 Pro (Vertex AI) with a RAG pipeline (FAISS + Hugging Face embeddings) to handle 20K+ daily chats with chat persistence
Architected a full-stack application (TypeScript + FastAPI) with SSE streaming, JWT auth, and automated CI/CD on Cloud Run
Encrypted user data with AES-256 and rotating keys (GDPR-compliant) in Firebase while maintaining <1s p99 TTFT for 600+ DAU

Freemoji

Apple Genmoji for WhatsApp

GitHub

Medium

Technologies Used: Distributed Systems/Training, CUDA, Fine-tuning, vLLM, Ollama

Fine-tuned FLUX.1-dev on 1,646 WhatsApp emojis using LoRA with DeepSpeed ZeRO-2 on 2 NVIDIA A100 GPUs, cutting model loss to 0.245
Designed Prompt Assist (2-shot prompting + Llama 3.3) to refine user inputs before they reach the fine-tuned model, improving emoji quality
Boosted performance by 40% via GPU-accelerated inference (vLLM/MLX + Diffusers/Mflux + Streamlit) and INT8 quantization
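
A minimal sketch of how a 2-shot prompt like the one behind Prompt Assist could be assembled. The instruction wording, the two example pairs, and the function name `build_two_shot_prompt` are hypothetical placeholders, not the project's actual prompt; the resulting string would then be sent to the LLM.

```python
from typing import List, Tuple

# Hypothetical few-shot pairs: (raw user request, refined image prompt).
EXAMPLES: List[Tuple[str, str]] = [
    ("happy cat", "a smiling orange cat emoji, flat style, white background"),
    ("tired coffee", "a sleepy coffee cup emoji with droopy eyes, flat style, white background"),
]


def build_two_shot_prompt(user_input: str, examples: List[Tuple[str, str]] = EXAMPLES) -> str:
    """Assemble an instruction, exactly two worked examples, and the new request."""
    if len(examples) != 2:
        raise ValueError("2-shot prompting expects exactly two examples")
    lines = ["Rewrite the request as a detailed emoji image prompt.", ""]
    for raw, refined in examples:
        lines += [f"Request: {raw}", f"Prompt: {refined}", ""]
    # The trailing "Prompt:" leaves the completion slot open for the model.
    lines += [f"Request: {user_input.strip()}", "Prompt:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_two_shot_prompt("dancing taco"))
```

Keeping the examples in a plain list makes it easy to swap in better pairs without touching the assembly logic.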

Verta

A Personal Shopping Copilot

GitHub

Technologies Used: HuggingFace, OpenAI APIs, Microservices, LLM Optimization, BigData

Architected a chatbot with an end-to-end multi-agent RAG pipeline leveraging 4 LLMs, built with LangGraph, Guardrails, FAISS, and BigQuery
Built the backend with FastAPI, deployed on Google Cloud Run (Docker + GitHub Actions CI/CD), capable of handling 20k queries/sec
Evaluated with DeepEval, achieving Correctness 0.81 and Answer Relevancy 0.87 for high-quality, relevant responses
Tracked experiments with MLflow and monitored traces with LangSmith, improving model performance via human-feedback loops
Optimized state management via layered Redis caching, enabling 3× faster responses (1.5 s p95 E2E and 250 ms p95 TTFT) under 6× load
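
The layered-caching idea can be illustrated with a small read-through cache: a per-process LRU layer in front of a shared store. This is a sketch under assumptions, not the project's implementation; the class name `LayeredCache` is hypothetical, and a plain dict stands in for Redis so the example runs without a server.

```python
from collections import OrderedDict
from typing import Callable, Dict, Optional


class LayeredCache:
    """Two-layer read-through cache: small in-process LRU in front of a shared store."""

    def __init__(self, shared_store: Dict[str, str], local_capacity: int = 2) -> None:
        self.local: "OrderedDict[str, str]" = OrderedDict()  # layer 1: per-process LRU
        self.shared = shared_store  # layer 2: dict standing in for Redis (assumption)
        self.local_capacity = local_capacity

    def _remember_locally(self, key: str, value: str) -> None:
        self.local[key] = value
        self.local.move_to_end(key)
        if len(self.local) > self.local_capacity:
            self.local.popitem(last=False)  # evict the least recently used entry

    def get(self, key: str, compute: Callable[[str], str]) -> str:
        # Layer 1 hit: no network round trip at all.
        if key in self.local:
            self.local.move_to_end(key)
            return self.local[key]
        # Layer 2 hit: cheaper than recomputing, then promoted to layer 1.
        value: Optional[str] = self.shared.get(key)
        if value is None:
            # Miss in both layers: compute once and populate the shared store.
            value = compute(key)
            self.shared[key] = value
        self._remember_locally(key, value)
        return value


if __name__ == "__main__":
    cache = LayeredCache(shared_store={}, local_capacity=1)
    print(cache.get("session:42", lambda k: f"state for {k}"))
```

In a real deployment the shared layer would also need expiry and invalidation, which this sketch leaves out.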

End-to-End Kidney Tumor Classification

GitHub

Technologies Used: CNN, Clinical Data, Compute Engine, Docker, Airflow, CI/CD

Built an end-to-end image classification pipeline with a modified VGG16 CNN in PyTorch, achieving an F1 score of 0.96
Developed interpretable ML workflows with feature importance analysis, SHAP-based insights, and calibration for bias reduction
Automated data ingestion, model training, and evaluation pipelines using Airflow, MLflow, and DVC for reproducibility and tracking
Containerized and deployed the trained model as a Flask web app on AWS EC2 via Docker and GitHub Actions for real-time inference

Self-Driving Solar Vehicle

Technologies Used: TensorFlow, Image Processing, OpenCV, Computer Vision, CNN

Built a track-following autonomous solar vehicle by identifying 2 types of cones using OpenCV and a custom CNN model
Improved model mAP to 0.93 by using data augmentation (daylight shifts and motion blurs) and K-Means for anchor-box selection
Deployed on-edge to an NVIDIA Jetson Nano and integrated an STM32 microcontroller via custom low-level drivers (Embedded C), achieving 24 FPS real-time steering with a patented steering system
Optimized end-to-end inference via pruning and quantization, achieving 120 ms latency and 96.7% autonomous navigation accuracy
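
K-Means anchor-box selection is commonly done on (width, height) pairs with IoU as the similarity measure. The sketch below shows that general technique in pure Python; the function names, the sample box sizes, and the mean-based centroid update are illustrative assumptions rather than the project's actual code.

```python
import random
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a labeled bounding box


def iou_wh(a: Box, b: Box) -> float:
    """IoU of two boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes: List[Box], k: int, iters: int = 50, seed: int = 0) -> List[Box]:
    """Cluster box shapes into k anchors, assigning each box to its highest-IoU centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters: List[List[Box]] = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids: List[Box] = []
        for i, members in enumerate(clusters):
            if not members:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centroids.append((w, h))
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Hypothetical cone box sizes in pixels: three near cones, three far cones.
    sample = [(10, 20), (12, 22), (11, 21), (50, 40), (52, 42), (48, 38)]
    print(kmeans_anchors(sample, k=2))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the usual motivation for this variant.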

Fantasy Team Recommendation for IPL 2024

GitHub

Technologies Used: Fine-tuning, QLoRA, Prompt Engineering, NLTK, Transformers

Fine-tuned Gemma 2 using QLoRA on a cricket dataset, resulting in a 10% improvement in ROUGE score for cricket-specific text generation tasks
Developed an NLTK-based algorithm to extract structured data from unstructured IPL historical data, including player stats and match scorecards
Leveraged 2-shot prompt engineering to improve prediction accuracy, reaching 85% accurate team predictions for IPL matches

Rent the Runway Fashion Recommender System

GitHub

Technologies Used: Collaborative filtering, SVD, Cosine Matrix, Web Scraping

Scraped product data and user reviews via BeautifulSoup, creating structured datasets with detailed attributes and feedback metrics
Engineered preprocessing pipelines for text cleaning, tokenization, and feature extraction to enhance review relevance
Developed a hybrid recommender using matrix factorization for collaborative filtering and cosine similarity for content-based filtering
Incorporated incremental SVD updates to address cold-start users and products, cutting system update time by 40%
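
The content-based half of such a hybrid can be sketched as a cosine-similarity ranking over item feature vectors. The catalog, feature values, and function names below are made up for illustration and are not taken from the project.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    target: str, catalog: Dict[str, Sequence[float]], top_n: int = 2
) -> List[Tuple[str, float]]:
    """Rank every other item in the catalog by similarity to the target item."""
    scores = [
        (name, cosine_similarity(catalog[target], vec))
        for name, vec in catalog.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical feature vectors: [formal, casual, red].
    catalog = {
        "red_gown": [1.0, 0.0, 1.0],
        "burgundy_gown": [1.0, 0.0, 0.9],
        "denim_jacket": [0.0, 1.0, 0.0],
    }
    print(most_similar("red_gown", catalog))
```

A hybrid system would then blend these scores with the collaborative-filtering predictions, for example through a weighted sum.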

Advanced Performance Metrics for Ultimate Frisbee Athletes

Technologies Used: Predictive Modeling, Bayesian analysis, ML Pipelines

Engineered player rating models by combining XGBoost and mixed-effects linear modeling for context-aware performance evaluation
Designed on/off plus-minus models with Bayesian modeling, increasing predictive accuracy by 20% for offensive and defensive impact
Developed composite scoring metrics integrating multiple player features, improving ranking robustness by 15%
Built reproducible pipelines for model training and validation using DVC and MLflow, ensuring scalability and consistent evaluation

End-to-End SQL Database Chatbot

GitHub

Technologies Used: Agent, ORM, SQL, LangChain, Groq

Developed an AI agent using LangChain, SQL, and Gemma 2, allowing users to query databases in natural language
Leveraged SQLAlchemy ORM and custom connection handling to support both SQLite and MySQL databases dynamically
Created a SQL Database Toolkit with Gemma 2 zero-shot question answering, enhancing the chatbot's ability to handle complex SQL queries accurately
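
One way to switch between SQLite and MySQL at runtime is to build the SQLAlchemy database URL from user-supplied settings and pass it to `create_engine`. The helper below is a sketch: the function name `build_db_url` and the choice of the `pymysql` driver are assumptions, not details confirmed by the project.

```python
from urllib.parse import quote_plus


def build_db_url(
    kind: str,
    *,
    path: str = "",
    user: str = "",
    password: str = "",
    host: str = "localhost",
    port: int = 3306,
    database: str = "",
) -> str:
    """Return a SQLAlchemy-style database URL for the selected backend."""
    if kind == "sqlite":
        # Three slashes followed by a relative file path.
        return f"sqlite:///{path}"
    if kind == "mysql":
        # Credentials are URL-escaped so characters like "@" do not break parsing.
        # The pymysql driver is an assumption; other MySQL drivers use a different prefix.
        return (
            f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{database}"
        )
    raise ValueError(f"unsupported database kind: {kind!r}")


if __name__ == "__main__":
    print(build_db_url("sqlite", path="student.db"))
    print(build_db_url("mysql", user="bot", password="p@ss", host="db.local", database="shop"))
```

The returned string can be handed to `sqlalchemy.create_engine(url)`, which keeps the rest of the agent code independent of the backend.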
