Shaunak Rahul Mahajan

Python · SQL · TensorFlow · PyTorch · scikit-learn

Turning data into models and decisions

Stony Brook, NY · Email · LinkedIn

About

Early-career ML professional with experience across the data/ML lifecycle: problem scoping, data wrangling (Python/SQL), exploratory analysis, feature engineering, and modeling with scikit-learn, TensorFlow, and PyTorch. Comfortable designing metrics and experiments, turning insights into dashboards or APIs, and collaborating with engineers and stakeholders to ship measurable improvements. Looking to grow in roles where I can build reliable data pipelines, optimize models for latency/throughput, and communicate results clearly.

Open to:
Machine Learning Engineer · AI/ML Engineer · Data Scientist I · Data Analyst I

Technical Skills

ML Frameworks
TensorFlow · PyTorch · scikit-learn · XGBoost
Languages
Python · SQL · R · Java · HTML · CSS · JavaScript
MLOps & Deployment
Model Serving · QAT · Docker · APIs · Streaming Inference
Tools
Git · Jupyter · MongoDB · Firebase · Angular · Android Studio · VS Code
Core Concepts
Feature Engineering · Optimization · Inference Pipelines · Computer Vision · Real-Time Systems

Experience

AI Software Engineer
Splashgain Technologies
Jul 2023 – Mar 2024 · Pune, India
Stack: TensorFlow, Python, Angular, HTML/CSS/JS
  • Hardened a production CBT proctoring pipeline (post-processing, alert routing, reviewer UX).
  • Optimized TensorFlow serving (input pipeline tuning, batching, graph tweaks) for ~40% lower prediction latency and ~15% faster end-to-end processing across 5k+ active users.
  • Integrated flags into the Eklavvya admin portal to improve reviewer throughput.
Software Intern
Splashgain Technologies
Jul 2022 – Jun 2023 · Pune, India
Stack: TensorFlow, EfficientDet, Python
  • Developed and deployed EfficientDet-D0 object detection for CBT cheating cues (~92% test accuracy; processed 50k+ frames/day).
  • Built streaming inference with quantization-aware training and real-time alert generation; automated flagging reduced manual invigilation by ~20 hrs/week.
  • Added telemetry and error handling for stable rollout and faster triage.
Data Analytics Intern
Mahindra & Mahindra
Aug 2021 – Sep 2021 · Nashik, India
Stack: Python, MongoDB
  • Analyzed 10M+ CNC logs; segmentation and anomaly scoring reached ≈95% accuracy.
  • Used MongoDB aggregations to cut detection time by ~35%, enabling faster maintenance response.

Projects

Self-Refinement Loop for LLMs
Aug 2025
LLMs · Evaluation · Automation
  • Iterative self-critique with multi-critic scoring (clarity, completeness, alignment).
  • Reached ~95% near-expert quality within 3 cycles (<5 s latency).
  • 98.8% precision via threshold tuning, class balance, and ensemble critics.
Mid-Day Meal Analysis using ML (Publication)
Jan 2023
Public Policy · Regression · Data Analysis
  • Built a state-wise data pipeline; cleaned, normalized, and imputed missing nutrition fields.
  • Ran logistic/linear regression to quantify disparities and highlight high-variance states.
  • Produced policy-facing visual summaries and age-group recommendations.
Alzheimer’s Disease Severity Detection (Publication)
Aug 2022
Healthcare · Random Forest · Feature Engineering
  • Engineered features from clinical assessments; compared tree ensembles vs baselines.
  • Tuned Random Forest hyperparameters; reported stratified cross-validation metrics.
  • Explained top drivers (e.g., cognitive scores) with importance analysis.
Brain Tumor Classification (Publication)
Dec 2021
Medical Imaging · KNN · ML Research
  • Extracted MRI features; optimized KNN (k, distance metric) for multi-class tumors.
  • Achieved ~97.6–99% accuracy across glioma, meningioma, and pituitary classes.
  • Presented at World Conf. on Computational Intelligence; reproducible notebooks.
Automatic Rain-Shade Simulation (Arduino + Proteus)
Jun 2020
Embedded · Simulation · Automation
  • Simulated an UNO + rain sensor controlling dual servos for a retractable roof.
  • Implemented control logic with hysteresis to avoid flicker on intermittent rain.
  • Documented circuit, algorithm, and test cases for home-automation use.
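
The hysteresis idea in the rain-shade project can be illustrated with a minimal Python sketch. This is not the original Arduino code: the function name `update_roof`, the normalized sensor reading in [0, 1], and the two thresholds `wet_on` and `dry_off` are assumptions chosen for illustration.

```python
def update_roof(closed: bool, reading: float,
                wet_on: float = 0.6, dry_off: float = 0.3) -> bool:
    """Return the new roof state (True = closed) for one sensor reading.

    Hypothetical thresholds: the roof closes only when the reading rises
    to wet_on or above, and reopens only when it falls to dry_off or below.
    Readings between the two thresholds leave the state unchanged, which
    is what prevents rapid toggling during intermittent rain.
    """
    if not closed and reading >= wet_on:
        return True
    if closed and reading <= dry_off:
        return False
    return closed


# Simulated (made-up) sensor readings that hover around the thresholds.
readings = [0.1, 0.5, 0.7, 0.5, 0.4, 0.2, 0.5]
state = False
states = []
for r in readings:
    state = update_roof(state, r)
    states.append(state)

print(states)
```

With a single threshold at 0.5, these readings would flip the roof several times; with two thresholds the roof closes once and reopens once.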

Education

Stony Brook University
M.S. in Data Science
Aug 2024 – May 2026 (Expected) · New York, USA
Relevant Coursework
Introduction to Probability · Data Analysis I · Natural Language Processing · Statistical Learning (R) · Statistical Computing (R) · Large Language Models · Data Management
Vishwakarma Institute of Technology
B.Tech in Information Technology (CGPA: 8.97/10)
Aug 2019 – May 2023 · Pune, India

Leadership & Volunteering

Highlights
Mentorship · Code Reviews · Workshop Facilitation · Community Outreach

I enjoy leading with clarity and empathy—codifying processes, unblocking people quickly, and turning workshops into practical takeaways.

Microsoft Learn Student Club
Android Development Lead · Aug 2021 – Aug 2022
  • Led the Android dev pod; ran stand-ups and reviews.
  • Hosted hands-on workshops and mentored juniors.
Uttkarsh – Social Initiative
Volunteer · Oct 2021
  • Made short videos teaching the 'Symmetry' topic and submitted them to the program.

Contact

Open to roles in Data Science, AI/ML Engineering, and related areas. Feel free to reach out.