Srihari Unnikrishnan
Machine Learning Researcher · Deep Learning · NLP · Systems for AI
About
I am a machine learning researcher working on the theoretical and empirical
foundations of deep learning systems, with a particular focus on transformer
architectures, interpretability, and reliable deployment of large language models.
My work spans mechanistic analysis of attention, agentic systems, and data-centric
AI, with applications in cloud systems and structured data.
Research Interests
- Transformer architectures and attention mechanisms
- Interpretability and mechanistic analysis of LLMs
- Agentic systems and tool-augmented language models
- Numerical stability, optimization, and efficient training
- Robustness, hallucination detection, and model evaluation
Experience
- AI Researcher, Microsoft Research India
- AI Research Scientist, Sony Research India
- Data Scientist, PricewaterhouseCoopers India
- Freelance Research Engineer (ML systems & applied AI)
Selected Research & Projects
- GPT From Scratch:
A from-first-principles implementation of a GPT-style transformer to study
attention, causal masking, and training dynamics at a mechanistic level.
Emphasizes architectural constraints, numerical stability, and GPU efficiency.
Project Page · Code
- TokenBurn:
A library for LLM confidence estimation and hallucination detection using
log-probabilities, perplexity, entropy, and KL divergence.
Evaluated on the HaluEval benchmark.
Code
- TabGuard:
An agentic framework for adaptive tabular anomaly detection via dynamic
validator selection and programmatic execution.
Developed at Microsoft Research; outperforms existing baselines.
- FixItFlow:
Automated troubleshooting guide generation from cloud incident logs.
Extracts diagnostic patterns from historical engineer actions and synthesizes
validated remediation guides, achieving significant reductions in mitigation time.
- Liquidity-Aware Bond Yield Prediction:
A hybrid CausalGAN and reinforcement learning approach for synthetic bond yield
generation using macroeconomic variables.
arXiv
Education
B.Tech in Computer Science and Engineering (Artificial Intelligence & Machine Learning)
Vellore Institute of Technology, Chennai