The Context Window

I learn AI the messy way: papers, YouTube lectures, blog posts, Twitter threads. I go through all of it, then try to write what I actually understood: from scratch, in code, with interactive demos. That's The Context Window. Not a course, not a textbook, just the clearest explanation I could write for every concept I've worked hard to understand. Neural networks, transformers, reinforcement learning, reasoning models, search systems, production Python. Written from first principles. A YouTube series covering these topics is coming soon.

Things here are not perfect; this is simply how I understand them right now. I use AI to help draft and structure the content, so there may be occasional mistakes. If you spot something off or have suggestions, I would genuinely love to hear them.

This blog is for my own learning. I am not claiming that the underlying ideas, papers, or source material are mine. I read and watch material elsewhere (YouTube, blogs, papers, threads) and then rewrite it here in the way I understand it best, often with AI assistance. That is never meant to discredit anyone. I will cite references throughout each series where I can (videos, posts, papers). Something may occasionally be missed; if you notice a missing attribution, please tell me and I will fix it. If this organized blog helps someone else along the way, that is a welcome extra, but the main goal is to have one place to keep what I am learning.

Abhishek Bisht

Applied AI Research Engineer. Building and writing about the things that matter.

Tokens, Transformers & Truth

A 15-part series covering language models, transformers, attention, RLHF, reasoning, agents, security, and more.

The Transformer Deep Cuts

A 26-part deep-dive series — transformer internals, LLaMA/Mistral/Mamba architectures, Flash Attention, LoRA, quantization, RLHF/DPO, diffusion models, CLIP, multimodal VLMs, and more.

26 tutorials · 5 arcs

Neurons All the Way Down

A 9-part interactive series building neural networks from scratch — from a single neuron and backpropagation all the way to reproducing GPT-2, with code walkthroughs, interactive demos, and hands-on playgrounds.

9 tutorials · micrograd → GPT-2

Finding Needles

A 26-part journey from classical information retrieval to modern agentic search — BM25, embeddings, vector databases, hybrid search, RAG, learning to rank, click models, personalization, and autonomous search agents.

26 tutorials · 7 arcs · AI-Powered Search book deep-dive

Python Under the Hood

14 chapters from data types and control flow to OOP, generators, gotchas, and interview coding challenges.

The Sandbox

Write and run Python code right in your browser. No setup needed — just open and start coding.

Pyodide · instant

The Evolving Transformer

Build landmark transformer architectures from scratch — one codebase, paper by paper. Transformer → GPT-2 → LLaMA → MoE → linear attention → multimodal → DeepSeek V4. Chapters drop sequentially; YouTube videos planned per chapter.

16 architectures · 55+ deltas · code evolution

AI Researcher Course

End-to-end preparation for Applied AI Research Engineer roles — 10 phases covering architectures, pretraining, alignment, RL for reasoning, multimodal, and interview prep. Includes a full syllabus and progress tracker.

10 phases · 6 months · syllabus + tracker

The Gauntlet

75 LLM interview questions — from foundational concepts to senior-level traps drawn from Google DeepMind, OpenAI, Meta, and Anthropic. Tap to reveal answers, track your progress, and test yourself in random quiz mode.

75 questions · 16 topics · self-test

Reasoning from Scratch

A 12-part series on building reasoning LLMs — from chain-of-thought prompting through reinforcement learning foundations (bandits, MDPs, policy gradients, PPO) to training your own reasoning model with GRPO on GSM8K.

12 tutorials · 4 arcs · CoT → RL → GRPO → R1-Zero

Decoding DeepSeek

A 12-part series building every piece of the DeepSeek architecture from scratch — attention, KV cache, MLA, RoPE, Mixture of Experts, multi-token prediction, FP8 quantization, and more. Theory, math, and code for every module.

12 tutorials · 6 arcs · from transcript to blueprint

Beyond the Forward Pass

Deep dives into LLM inference engines, serving systems, and production optimization — paged attention, continuous batching, speculative decoding, prefix caching, GPU VRAM management, and more.

1 tutorial · vLLM deep-dive

The Training Ground

Deep dives into the pre-training forward pass and transformer architecture internals — tokenization, embedding, RMSNorm, GeGLU MLP, attention mechanisms, YaRN positional embeddings, hybrid masking, and the math of FLOPs and cluster sizing.

1 tutorial · dense transformer deep-dive

The Paper Trail

Every research paper referenced across The Context Window — curated with brief descriptions, direct links, and tagged by blog series. Search, filter, and browse the papers behind the concepts.

92 papers · 9 blog series · searchable & paginated

The Archives

76 posts from 2018–2024 — ML fundamentals, NLP, PyTorch, backpropagation, LoRA, speculative decoding, and more. The full early writing history, organised year by year.

76 posts · 2018 – 2024 · year-wise

LlamaChess

Sharpen your tactical vision on a beautiful interactive board — mates, combinations, endgames, and more. Drag pieces, find the winning move, and pick up right where you left off. Free, instant, no account required.

Note: For educational purposes only.

Interactive board · curated puzzle collections · progress saved locally