Person Re-Identification framework — large-scale computer vision project focused on identity matching across cameras.
Full-stack task management system with Java backend and frontend.
Experiments applying diffusion models to text generation tasks.
Custom logging integration for Hugging Face training workflows.
Bike rental application with OpenStreetMap integration, FastAPI backend, DynamoDB, and Svelte frontend.
Daily coding challenge to develop CUDA kernels from scratch.
Code for my library to create simple LLMs for research purposes.
A modular library focused on generative AI experimentation and development.
Created a simple MLP neural network using NumPy.
Workflow and methodology for YOLO models.
A project mostly for learning, in which I implement different RWKV architectures.
Energy-based models that I implemented based on various papers, courses, and talks by different researchers.
Developed several RNNs in PyTorch as a learning project, focusing on translating the mathematical formulas into model code.
Built several ConvNets as foundational components, which was a useful learning experience across various architectures.
Developed an image2latex model for extracting LaTeX formulas from images and an LLM for explaining mathematical concepts.
Built the Llama 2 architecture from the ground up. Through this project, I learned about KV caching and RoPE.
GPT-2-based model trained on selected books from open libraries, starting from the existing GPT-2 weights for better language understanding and improved output quality.
Implemented the Stable Diffusion architecture in PyTorch and ran inference with the version 1.5 weights from Hugging Face, which provided key insights into building complex models.
Text-to-image pipeline based on the CLIP architecture, using pretrained weights.
Built a Transformer architecture for English-German text translation, fine-tuned with open weights from Hugging Face.
Implemented various generative architectures for different image generation tasks, including WGAN, DCGAN, Linear GAN, C-GAN, CycleGAN, Pix2Pix, and a VAE.
Used pretrained VGG-16 feature maps to create artworks through style transfer.
Fine-tuned YOLOv8 and built a custom multi-object detection model for specialized tasks.
PyTorch CNN with ResNet50 backbone for accurate multi-class classification.
Developed an FR-CNN model for robust object detection, distinguishing buses from trucks.
Implemented U-Net with a VGG-16 backbone for semantic segmentation of street scenes.
Achieved exceptional accuracy in tumor detection using diverse backbone architectures (VGG-16, ResNet50, MobileNetV3, Inception).
[Chart: topic activity across quarters (projects + blog posts), Q3'23 to Q1'26, for Computer Vision, Generative AI, LLM / NLP, CUDA / Kernels, Infrastructure, and Research.]