Transformer 5
The MathemaTricks behind FlashAttention, Apr 12, 2026
The lore behind LoRA, Mar 23, 2026
Exploring the Mixture of Experts, Feb 24, 2026
Understanding multi GPU Parallelism paradigms, Jul 6, 2025
Attention and Transformer Imagined, Jun 14, 2025