Interactive Study Guide

Softmax Academy

A practitioner's deep-dive into language models — from tokenization fundamentals through production deployment. Each chapter is a self-contained, interactive explainer built for engineers who learn by doing.

15 Chapters

Chapters

2. Tokens, Tokenization & Context Windows
How LLMs convert raw text into computable units — and why every engineering decision downstream traces back to the token.

3. Embeddings & Semantic Representations
How LLMs convert meaning into geometry — and why the quality of that geometry determines retrieval, clustering, and recommendation quality.

4. Transformer Architecture, Attention & Positional Reasoning
How attention replaced recurrence, why every token can consult every other token, and the engineering consequences of that power.

5. Pretraining Objectives & Model Families
Why the training objective determines what a model becomes good at — and how to compare model families by design, not brand name.

6. Classification with LLMs
How to use LLMs as classifiers — and when a smaller, dedicated model is the smarter production choice.

7. Topic Modeling, Clustering & Theme Discovery
How to move from raw text to interpretable themes — and why cluster discovery is only the beginning.

8. Retrieval Foundations for LLM Systems
How to find the right evidence before the model generates — because a good LLM answer begins as a good retrieval problem.

9. Production RAG & Grounded Answering
Moving beyond naive retrieve-and-generate demos to build RAG systems that are grounded, attributable, and production-ready.

10. Prompting, In-Context Learning & Orchestration
Treating prompts as system design — not clever wording — to build reliable, maintainable, and evaluatable LLM applications.

11. Multimodal Large Language Models
How LLMs extend beyond text to process images, audio, and video — and why alignment between modalities is the central engineering challenge.

12. Custom Embeddings & Retrieval Optimization
When off-the-shelf embeddings plateau — how to diagnose gaps, adapt representations, and systematically improve retrieval quality.

13. Fine-Tuning, PEFT & Adaptation Strategies
When to fine-tune, what method to choose, and how to avoid turning a strong foundation model into an expensive liability.

14. Optimization & Math Foundations
The mathematical tools that make transformers trainable — softmax, cross-entropy, gradients, and the structures that keep deep networks stable.

15. Text Generation, Decoding & Serving at Scale
How decoding strategies, serving infrastructure, and safety controls combine into a single operational system that determines what users actually experience.

16. Architectures, Extensions & Practical Deployment
From sparse expert routing to deployment governance — the decisions that separate a solid prototype from a durable production system.