Archives

2026
25 Mar Gated Delta Net Attention: A Deep Dive into the Linear Attention Mechanism Powering Qwen3.5

2025
01 Dec Speculative Decoding Tutorial
24 Nov The Math Behind Online Softmax
04 Jun The One Big Beautiful Blog on Group Relative Policy Optimization (GRPO)
12 May Do LLMs recognize Medical Definitions?

2024
30 Nov Mechanistic Interpretability: What's Superposition?