Machine Learning Research Scientist / Data Scientist

20 years of research experience in machine learning and physics.

20 Years of Research
€1.3M+ Competitive Funding
50k+ Technical Impressions
Suchita Kulkarni

About

Hands-on machine learning research scientist specializing in surrogate modeling and physics-inspired machine learning for high-dimensional, sparse-data problems.

Built the core algorithm of SModelS (adopted by 20+ international research groups).

Debugged a field-wide error in dark matter simulations, and designed adaptive MCMC pipelines that cut runtime by 10x.

Delivered 30+ international keynote talks and published 60+ particle physics papers in top-tier journals.

20 years spanning ML and physics
€1.3M+ competitive funding managed across EU/international research programs
50+ researchers coordinated for the US decadal physics planning process report
75+ new-physics searches covered in the SModelS v2 codebase

Experience

2025 – Present
Machine Learning Research Scientist / Data Scientist
Graz, Austria
  • Developed an unsupervised anomaly classification method using a physics-informed LSTM autoencoder
  • Published a 10-part public technical series reaching 50k+ impressions
  • Regular invited speaker at ML meetups and international workshops
2011 – Present
Postdoctoral Researcher — Particle Physics & ML
Austria · France · Germany
  • Lead architect of SModelS v2, a widely adopted Python toolkit covering 75+ LHC new-physics searches
  • Managed €1.3M+ in competitive EU research funding across interdisciplinary teams
  • Coordinated 50+ researchers for the Snowmass 2021 Dark Showers community report
  • Worked on long-lived particle searches at the LHC, influencing experimental parameter choices
2007 – 2011
PhD — Particle Physics
Europe
  • Computational high-energy physics: Python-based simulation, analysis, and visualization
  • Foundational training in statistical modeling, Bayesian inference, and large-scale dataset analysis

Tech Stack

ML & Deep Learning
PyTorch · scikit-learn · XGBoost · Optuna · HDBSCAN
Data & Scientific
pandas · NumPy · SciPy · SModelS
Deployment & Viz
Streamlit · Matplotlib · GitHub Pages

Selected Projects

2D diagnostic landscape: physics-informed vs standard LSTM
Physics-Aware LSTM for Anomaly Classification
77% detection rate · 61% classification accuracy · ARI 0.42 vs 0.31

2D diagnostic landscape combining reconstruction and physics loss to classify 9 anomaly types without labeled anomalies. The physics-informed kNN outperforms the standard baseline by 21 percentage points on detection rate. The decision framework is documented in a 10-part public series reaching 50k+ impressions.

PyTorch · scikit-learn · NumPy · Streamlit
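
The project summary does not say how the kNN step is applied, so the following is only a minimal sketch of one common way to detect anomalies in a 2D (reconstruction error, physics loss) plane without anomaly labels: score each point by its distance to the k-th nearest normal point and flag it when that distance exceeds a threshold fitted on normal data alone. The function names, the value of k, the 95% quantile, and the toy coordinates are all illustrative assumptions, not the project's actual code.

```python
import math


def knn_score(point, normal_points, k=3):
    """Distance from `point` to its k-th nearest neighbour among normal points."""
    dists = sorted(math.dist(point, q) for q in normal_points)
    return dists[k - 1]


def fit_threshold(normal_points, k=3, quantile=0.95):
    """Leave-one-out kNN scores on normal data; return an empirical quantile."""
    scores = []
    for i, p in enumerate(normal_points):
        others = normal_points[:i] + normal_points[i + 1:]
        scores.append(knn_score(p, others, k))
    scores.sort()
    idx = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[idx]


def is_anomaly(point, normal_points, threshold, k=3):
    """Flag a point whose kNN distance exceeds the fitted threshold."""
    return knn_score(point, normal_points, k) > threshold


# Hypothetical (reconstruction error, physics loss) pairs from normal windows.
normal = [
    (0.10, 0.05), (0.12, 0.04), (0.09, 0.06), (0.11, 0.05),
    (0.10, 0.07), (0.13, 0.06), (0.08, 0.04), (0.11, 0.03),
]
threshold = fit_threshold(normal, k=3)

print(is_anomaly((0.105, 0.05), normal, threshold))  # point inside the normal cluster
print(is_anomaly((0.90, 0.80), normal, threshold))   # point far from the normal cluster
```

Because the threshold is fitted on normal data only, no anomaly labels are needed for detection; assigning one of several anomaly types would need an additional step that this sketch does not cover.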
RUL prediction business impact summary
Remaining Useful Life Prediction — NASA Turbofan
RMSE 14–16 (down from 18–20) · >50% maintenance cost reduction

Physics-grounded feature engineering — condition-normalized sensors (KMeans, 6 clusters), rolling statistics, monotone RUL constraint — combined with XGBoost + Optuna tuning. Uncertainty via split conformal prediction (90% coverage). Evaluated on cost-based metrics across all 4 CMAPSS datasets.

XGBoost · Optuna · scikit-learn · pandas · Streamlit
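
The split conformal step mentioned above can be sketched in a few lines. This is a generic illustration of the technique with absolute-residual scores, not the project's code: the function names, the zero-clipping of the lower bound, and the toy calibration values are assumptions. Any fitted point predictor (XGBoost or otherwise) can supply `pred_cal`.

```python
import math


def split_conformal_halfwidth(y_cal, pred_cal, alpha=0.1):
    """Half-width q so that [pred - q, pred + q] targets 1 - alpha coverage."""
    # Absolute residuals on a held-out calibration set (not used for training).
    residuals = sorted(abs(y - p) for y, p in zip(y_cal, pred_cal))
    n = len(residuals)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return residuals[rank - 1]


def prediction_interval(pred, halfwidth):
    """Symmetric interval around a point prediction, clipped at zero cycles."""
    return max(0.0, pred - halfwidth), pred + halfwidth


# Toy calibration data (hypothetical values, in engine cycles).
y_cal = [112, 98, 75, 60, 44, 30, 21, 15, 9, 5, 130, 88, 52, 37, 26, 18, 11, 7, 3]
pred_cal = [105, 104, 70, 66, 40, 35, 25, 12, 14, 2, 118, 95, 49, 41, 20, 22, 10, 9, 6]

q = split_conformal_halfwidth(y_cal, pred_cal, alpha=0.1)
low, high = prediction_interval(48.0, q)
print(f"90% interval for a predicted RUL of 48 cycles: [{low:.1f}, {high:.1f}]")
```

The coverage guarantee is marginal (averaged over test points) and relies on calibration and test data being exchangeable, so the calibration engines should be held out from training.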
Ramachandran compliance: physics VAE vs standard VAE
Ramachandran Physics-Informed VAE
0.82 Lovell compliance · Cohen's d = 0.905 · All 3 islands recovered

Standard VAE collapses to 1D on imbalanced protein data; a differentiable PyTorch GMM penalty recovers all three Ramachandran islands. Latent perturbation across 6 sigma levels confirms 100% win rate in phi stability. Dataset: 1,333–3,335 samples from 5 structurally diverse PDB proteins.

PyTorch · Biopython · scikit-learn · Streamlit
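
One plausible form of the GMM penalty described above, written out as a sketch: the project summary does not give the exact loss, so the weight \lambda, the choice of K = 3 components (one per Ramachandran island), and the use of fixed mixture parameters (\pi_k, \mu_k, \Sigma_k) are all assumptions made for illustration.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{VAE}} + \lambda \, \mathcal{L}_{\text{GMM}},
\qquad
\mathcal{L}_{\text{GMM}}
  = -\frac{1}{N} \sum_{i=1}^{N}
    \log \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left( \hat{x}_i \,\middle|\, \mu_k, \Sigma_k \right),
\qquad
\hat{x}_i = (\hat{\phi}_i, \hat{\psi}_i)
```

Here \hat{x}_i is the decoded pair of backbone angles for sample i and \mathcal{L}_{\text{VAE}} is the usual reconstruction-plus-KL objective. The penalty is the negative log-likelihood of the decoded angles under the mixture, so it is differentiable in the decoder outputs; in PyTorch the inner log-sum is typically evaluated with `torch.logsumexp` for numerical stability.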

Talks & Community

Writing

Technical Series · 50k+ impressions

End-to-end technical walkthrough of building a physics-informed LSTM for anomaly classification: architecture choices, the 2D diagnostic loss landscape, and evaluation across 9 anomaly types.

Read on LinkedIn
PM Perspective

What the anomaly detection project looks like from a product angle: the decisions made, what the performance improvement actually means operationally, and how physics constraints change the conversation about model trust.

Read on LinkedIn
Creative ML

Behind the scenes of creating an AI-generated theme song for TEDxGraz: the process, the tools, and what building something creative with ML teaches about its actual capabilities and limits.

Read on LinkedIn

Selected Publications

Python-based simulation and visualization for search optimization. Influenced parameter choices and experimental search strategies.
Coordinated 50+ researchers across theory and experiment. Shaped field-level synthesis and research directions.
Lead architect of Python codebase covering 100+ new-physics searches. Accelerated interpretation of collider constraints on dark matter models.

Get in Touch

Open to research leadership roles in scientific ML, physics-informed AI, and computational physical systems. Also available for collaborations and speaking invitations.