Bhooyas Kapadia

Part-Time Engineer 🛠️, Full-Time Experimenter 🖥️

About Me

Hey, I’m Bhooyas — a curious human who makes models think and pixels behave. Whether I’m training transformers on a tight GPU budget or squeezing magic out of minimal resources, I blend code and creativity to build things that (usually) work and (sometimes) impress. I’m a passionate technologist who loves building smart, impactful solutions — from AI models and weird little tools to clean, intuitive web interfaces. When I’m not coding, I’m probably overengineering a side project, diving headfirst into new frameworks, or automating something that didn’t really need automating. Basically, if it involves logic, layers, or late-night debugging, I’m in.

Projects

Fine-tuned TinyLlama (1.1B parameters), a lightweight language model, using LoRA (Low-Rank Adaptation) on the Databricks Dolly 15k dataset. The implementation focused on improving the model’s ability to understand and respond to context-based instructions while staying computationally efficient. The result was better performance on instruction-following tasks with minimal resource requirements.
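For readers unfamiliar with LoRA, here is a small, dependency-free sketch of the core idea: the pretrained weight W stays frozen, and only two small matrices A and B are trained, so the effective weight is W + (alpha / r) * B·A. This is not the training code from the project; the matrix sizes, rank `r`, and scaling `alpha` below are made-up illustration values.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the weight a LoRA layer effectively applies."""
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in the low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Hypothetical sizes, chosen only to keep the example readable.
d_out, d_in, r, alpha = 4, 6, 2, 4
rng = random.Random(0)

w = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]      # trainable
b = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero at start

w_eff = lora_effective_weight(w, a, b, alpha, r)
full_params = d_out * d_in
lora_params = trainable_params(d_out, d_in, r)
```

Because B starts at zero, the adapted layer initially behaves exactly like the frozen one. At this toy size the saving is small (20 trainable values instead of 24), but it grows quickly as the layer dimensions get large relative to `r`.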

Designed and implemented a custom transformer architecture from scratch in PyTorch, resulting in StoryNet, a specialized neural network for creative text generation. Trained the model to generate coherent short stories, demonstrating expertise in transformer architecture, sequence modeling, and natural language generation.

Implemented a Conditional Variational Autoencoder (ConditionalVAE) that generates handwritten digits based on specified class inputs, trained on the MNIST dataset. Demonstrated the ability to control the generation process through conditional parameters, highlighting expertise in deep generative modeling, latent space manipulation, and complex neural architecture design for controlled image synthesis.
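As a rough illustration of how class conditioning can work at generation time, the sketch below builds a decoder input by concatenating a random latent vector with a one-hot digit label. Concatenation is one common way to condition a VAE and is an assumption here, not necessarily what this project does; `LATENT_DIM` and the helper names are hypothetical, and the trained decoder network itself is left out.

```python
import random

NUM_CLASSES = 10  # MNIST digits 0-9
LATENT_DIM = 16   # hypothetical latent size, for illustration only


def one_hot(label, num_classes=NUM_CLASSES):
    """Encode a class label as a one-hot list of floats."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label must be in [0, {num_classes}), got {label}")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def sample_latent(dim=LATENT_DIM, rng=None):
    """Draw z from a standard normal prior, one value per latent dimension."""
    rng = rng or random
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def decoder_input(label, rng=None):
    """Concatenate z with the one-hot label; a trained decoder would map this to an image."""
    return sample_latent(rng=rng) + one_hot(label)


# Ask for a "7": the label part of the input is fixed, the latent part is random.
x = decoder_input(7, rng=random.Random(0))
```

Sampling a new latent vector while holding the label fixed would give a different rendering of the same digit, which is what makes the generation controllable.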

Engineered AvatarGAN, an advanced image generation pipeline combining Deep Convolutional GAN (DCGAN) with Super-Resolution GAN (SRGAN) to create high-quality game character avatars. The system generates characters at multiple resolutions, with DCGAN handling initial character creation and SRGAN enhancing image quality through upscaling.

Experience
  • Developed an automated benchmarking script for evaluating model performance across FastAPI, MLServer, and Triton, supporting multiple frameworks such as PyTorch, TensorFlow, and scikit-learn.
  • Developing an Argo workflow template to automate batch model deployment and streamline the deployment process.
  • Containerized ML applications and deployed them on Kubernetes using Helm charts, across multiple cloud providers as well as on-prem environments.
  • Deployed an LLM on Triton using NeMo Inference Microservice, reducing latency by approximately 20%. Also collaborated closely on deploying a RAG pipeline that suggests drinks from a menu.
  • Worked closely in multiple workshops to optimize training time using technologies like Distributed Data Parallelism, Model Parallelism, Slurm, and Enroot. These efforts reduced training time by 30% to 50%, depending on the use case.
  • Led the setup of a DGX cluster comprising multiple DGX nodes, with a Supermicro server as the head node, using Base Command Manager.
  • Converted models to TensorRT engines and deployed them on Triton Inference Server to serve 10K users per day with a latency of 5 seconds per query.
  • Collaborated closely to mitigate VM vulnerabilities and keep GCP costs under control.
  • Used Prometheus and Grafana to monitor Docker utilization, and wrote a Bash script that reduced effort by 30%.
  • Set up and managed multiple Kubernetes clusters with NGINX Ingress Controller to streamline external access to services, enhancing scalability and load balancing. Implemented Horizontal Pod Autoscaling (HPA) to support applications serving up to 100K users per day.
  • Scoped new projects, defined deliverables, and established timelines to ensure alignment with business objectives and resource availability.
Skills
GCP, AWS, Azure, OCI, On-Prem (DGX).
Docker, Kubernetes, Helm, Slurm, Base Command Manager, Python, Keras, PyTorch, ONNX, LoRA, GenAI, Distributed Data Parallelism, Triton, TensorRT, Large Language Models, TensorRT-LLM, GitHub Workflow, Azure DevOps, Netron, Prometheus, Grafana, Terraform, MLflow.
Bash, HTML, CSS, JavaScript, Flask, MySQL, Pandas.
Reach Out

You can connect with me at any of the following: