<feed xmlns="http://www.w3.org/2005/Atom">
  <id>https://datta0.github.io/</id>
  <title>Datta's Blog</title>
  <subtitle>Technical writing on transformers, attention, LoRA, mixture of experts, and GPU parallelism.</subtitle>
  <updated>2026-04-14T11:24:48+05:30</updated>
  <author>
    <name>Datta Nimmaturi</name>
    <uri>https://datta0.github.io/</uri>
  </author>
  <link rel="self" type="application/atom+xml" href="https://datta0.github.io/feed.xml"/>
  <link rel="alternate" type="text/html" hreflang="en" href="https://datta0.github.io/"/>
  <generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator>
  <rights>© 2026 Datta Nimmaturi</rights>
  <icon>/assets/img/favicons/favicon.ico</icon>
  <logo>/assets/img/favicons/favicon-96x96.png</logo>
  <entry>
    <title>The MathemaTricks behind FlashAttention</title>
    <link href="https://datta0.github.io/posts/flash-attention/" rel="alternate" type="text/html" title="The MathemaTricks behind FlashAttention"/>
    <published>2026-04-12T14:30:00+05:30</published>
    <updated>2026-04-14T11:24:23+05:30</updated>
    <id>https://datta0.github.io/posts/flash-attention/</id>
    <content type="text/html" src="https://datta0.github.io/posts/flash-attention/"/>
    <author>
      <name>datta0</name>
    </author>
    <category term="Transformer"/>
    <category term="Attention"/>
    <category term="GPU"/>
    <category term="Kernels"/>
    <category term="Training"/>
    <category term="Finetuning"/>
    <category term="Math"/>
    <summary>Fast and memory-efficient exact attention</summary>
  </entry>
  <entry>
    <title>The lore behind LoRA</title>
    <link href="https://datta0.github.io/posts/the-lore-behind-lora/" rel="alternate" type="text/html" title="The lore behind LoRA"/>
    <published>2026-03-23T14:30:00+05:30</published>
    <updated>2026-04-13T12:40:34+05:30</updated>
    <id>https://datta0.github.io/posts/the-lore-behind-lora/</id>
    <content type="text/html" src="https://datta0.github.io/posts/the-lore-behind-lora/"/>
    <author>
      <name>datta0</name>
    </author>
    <category term="LoRA"/>
    <category term="Transformer"/>
    <category term="Training"/>
    <category term="Finetuning"/>
    <category term="Math"/>
    <summary>LoRA imagined from the ground up</summary>
  </entry>
  <entry>
    <title>Exploring the Mixture of Experts</title>
    <link href="https://datta0.github.io/posts/exploring-the-moe/" rel="alternate" type="text/html" title="Exploring the Mixture of Experts"/>
    <published>2026-02-24T14:30:00+05:30</published>
    <updated>2026-04-13T12:40:34+05:30</updated>
    <id>https://datta0.github.io/posts/exploring-the-moe/</id>
    <content type="text/html" src="https://datta0.github.io/posts/exploring-the-moe/"/>
    <author>
      <name>datta0</name>
    </author>
    <category term="Mixture of Experts"/>
    <category term="Transformer"/>
    <category term="FFNN"/>
    <category term="Math"/>
    <summary>An intuitive build-up to Mixture of Experts</summary>
  </entry>
  <entry>
    <title>Understanding multi GPU Parallelism paradigms</title>
    <link href="https://datta0.github.io/posts/understanding-multi-gpu-parallelism-paradigms/" rel="alternate" type="text/html" title="Understanding multi GPU Parallelism paradigms"/>
    <published>2025-07-06T16:33:31+05:30</published>
    <updated>2026-02-24T21:37:53+05:30</updated>
    <id>https://datta0.github.io/posts/understanding-multi-gpu-parallelism-paradigms/</id>
    <content type="text/html" src="https://datta0.github.io/posts/understanding-multi-gpu-parallelism-paradigms/"/>
    <author>
      <name>datta0</name>
    </author>
    <category term="Attention"/>
    <category term="Transformer"/>
    <category term="FFNN"/>
    <category term="GPU"/>
    <category term="Parallelism"/>
    <category term="vLLM"/>
    <category term="Inference"/>
    <summary>We’ve been talking about Transformers all this while. But how do we get the most out of our hardware? There are two different paradigms we can talk about here. One case is where your model happily fits on one GPU, but you have many GPUs at your disposal and want to save time by distributing the workload across them. Another case is where your workload doesn’t even fit entirely on ...</summary>
  </entry>
  <entry>
    <title>Attention and Transformer Imagined</title>
    <link href="https://datta0.github.io/posts/transformer-imagined/" rel="alternate" type="text/html" title="Attention and Transformer Imagined"/>
    <published>2025-06-14T14:30:00+05:30</published>
    <updated>2025-06-15T20:19:27+05:30</updated>
    <id>https://datta0.github.io/posts/transformer-imagined/</id>
    <content type="text/html" src="https://datta0.github.io/posts/transformer-imagined/"/>
    <author>
      <name>datta0</name>
    </author>
    <category term="Attention"/>
    <category term="Transformer"/>
    <category term="FFNN"/>
    <category term="Math"/>
    <summary>An intuitive build-up to Attention and Transformer</summary>
  </entry>
</feed>
