Speculative Decoding Tutorial

A tutorial on implementing speculative decoding, an inference optimization technique for LLMs, using PyTorch and Hugging Face Transformers.

Dec 1, 2025 LLM, AI, Inference Optimization, ML

The Math Behind Online Softmax

Understanding the mathematical principles behind online softmax, an optimization technique used in Flash Attention to efficiently compute softmax in chunks.

Nov 24, 2025 LLM, AI, Kernels, GPU, ML

The One Big Beautiful Blog on Group Relative Policy Optimization (GRPO)

A step-by-step tutorial to code up your own GRPO Trainer.

Jun 4, 2025 LLM, AI, RLHF, Reasoning Models, GRPO, PPO

Do LLMs recognize Medical Definitions?

There’s been a never-ending debate about whether LLMs understand the data they process, so much so that people have started to debate what understanding...

May 12, 2025 LLM, AI, Explainable AI, Factual Understanding, Medical AI

Mechanistic Interpretability: What's Superposition?

Mechanistic interpretability is an emerging area of research in AI focused on understanding the inner workings of neural networks. LLMs and Diffusion models have taken the...

Nov 30, 2024 LLM, AI, Explainable AI, Mechanistic Interpretability, Superposition