UC Berkeley · B.A. Data Science & CS · May 2027

Arin Kadakia


Building intelligent systems where hard problems have real stakes

Who I Am

Focused on building things that matter — from research pipelines to production systems.

I build AI systems and full-stack platforms spanning multi-agent pipelines, real-time data infrastructure, and production-grade machine learning. I care deeply about writing software that is reliable, scalable, and actually deployed, not just prototyped.

I'm drawn to domains where software can compress decades of progress into years — medicine, finance, and scientific research are where I think that potential is greatest. Whether that means automating complex workflows, surfacing insights from messy data, or building tools that let experts move faster, I want to be the engineer who makes it work end-to-end.

I'm at my best when the technical bar is high and the problem is worth solving. That combination is what I look for in the work I take on.

Data Structures · SICP · Advanced CS · Data Engineering · Foundations of Data Science · Linear Algebra · P&T of Data Science · Operating Systems
🎓
UC Berkeley
B.A. Data Science & Computer Science · Class of 2027
🤖
AI Research
Multi-agent systems, LangGraph, evaluation pipelines
🚀
Impactful Work
$14M+ fraud mitigated · 10K+ concurrent users supported

Where I've Worked

A journey through research labs and high-growth companies, shipping real products.

UC Berkeley Sky Computing Lab
Agentic AI Researcher
Jan 2026 – Present
  • Architected modular multi-agent AI system using LangGraph for role-based reasoning and autonomous coordination
  • Built self-optimizing evaluation pipeline using OpenEvolve and MAST with closed-loop failure classification
  • Designed performance monitoring that tracks verification pass rates, latency, and behavioral drift
Autodesk
Data Science Intern
Aug 2025 – Dec 2025
  • Designed ML-based fraud detection system mitigating $14M+ in quarterly losses using behavioral signals and real-time feature pipelines
  • Engineered ensemble classification framework combining XGBoost, LightGBM, and rule-based heuristics, running at sub-second latency
  • Containerized microservices for dynamic risk mitigation (CAPTCHA, redirect flows, multi-step verification)
Artera AI
Software Engineering Intern
Jun 2025 – Aug 2025
  • Built React + GraphQL clinical administration dashboards with 85% test coverage
  • Implemented RBAC, tabbed navigation, and real-time sync, reducing task completion time by 30%+
  • Contributed to CI/CD pipelines with Docker and GitHub Actions
Securities Quote Xchange
Software Engineering Intern
Jan 2025 – May 2025
  • Built data management and visualization platform with React and Python for alternative investment custodians
  • Implemented async upload pipelines using AWS S3, FastAPI, and PostgreSQL ensuring 90% data accuracy
Nidus Technologies
Software Engineering Intern
Aug 2024 – Dec 2024
  • Delivered AI-driven full-stack platform for ParentMe supporting 10,000+ concurrent users
  • Built counseling session booking with Google Calendar API sync and Twilio SMS/email notifications

Things I've Built

Selected projects spanning ML, NLP, computer vision, and full-stack development.

💊
2025
Breaking Good

End-to-end neuropharmacology drug discovery platform for designing, simulating, and analyzing novel ADHD therapeutics. Integrates AI-powered molecular design via Claude (Anthropic), real-time 3D visualization with 3Dmol.js, molecular docking & dynamics simulations, ADMET prediction, PubMed/ChEMBL database search, and team-based collaboration tools with regulatory pathway analysis.

LLM Integration · Node.js · RDKit · 3Dmol.js · Drug Discovery · Neuropharmacology
🧬
2025
BioOS

Agentic operating system for biology researchers that orchestrates AI bioscience models (AlphaFold 3, ESMFold, RFdiffusion, DiffDock) into autonomous multi-agent pipelines. Researchers input a target sequence and BioOS handles everything: protein folding → binding site prediction → ligand docking → ADMET screening → FDA-grade documentation. Every step is reproducible, auditable, and compliance-ready.

LLM Integration · Next.js 14 · FastAPI · AlphaFold 3 · Multi-Agent · PostgreSQL
📚
Jun – Jul 2025
LLM-Based Book Recommendation System

Transformer-based recommendation engine with semantic vector search and LangChain orchestration. Combines dense embeddings with collaborative signals to deliver personalized results at scale with 90%+ recommendation precision, served through a Gradio dashboard.

LangChain · Transformers · Vector Search · Gradio · NLP
🏥
May 2025
Generative AI Medical Chatbot

RAG-based chatbot grounded in the Gale Encyclopedia of Medicine corpus. Combines Flask for serving, Pinecone for semantic vector retrieval, and a large language model backbone to deliver accurate, real-time medical Q&A with source attribution.

RAG · Flask · Pinecone · LLM · Healthcare AI
🔬
Feb – Aug 2023
Early Malaria Detection via CNN

U-Net convolutional neural network for automated malaria detection from microscopic blood cell images. Achieved high sensitivity for parasitized cell identification. Research presented at the Society of Robotic Surgery Conference, demonstrating real clinical potential.

CNN · U-Net · PyTorch · Medical Imaging · Computer Vision

Tech Stack

Tools and technologies I use to bring ideas to life.

Languages & Frameworks
Python · JavaScript · React · Java · C · C++ · SQL · R · Node.js · GraphQL
AI / ML
TensorFlow · PyTorch · scikit-learn · LangChain · LangGraph · NumPy · XGBoost · LightGBM
Cloud & Tools
AWS · Azure · Docker · GitHub Actions · FastAPI · PostgreSQL · Pinecone · Git
Concepts
Deep Neural Networks · RAG · Multi-Agent Systems · Bayesian Statistics · Hyperparameter Tuning

Let's Connect

Whether you have a question, opportunity, or just want to chat about AI — my inbox is always open.