Transformer (5)
The MathemaTricks behind FlashAttention (Apr 12, 2026)
The lore behind LoRA (Mar 23, 2026)
Exploring the Mixture of Experts (Feb 24, 2026)
Understanding multi-GPU Parallelism paradigms (Jul 6, 2025)
Attention and Transformer Imagined (Jun 14, 2025)