Interactive Study Guide

Softmax Academy

A practitioner's deep-dive into language models — from tokenization fundamentals through production deployment. Each chapter is a self-contained, interactive explainer built for engineers who learn by doing.

Chapters

Tokens, Tokenization & Context Windows

How LLMs convert raw text into computable units — and why every engineering decision downstream traces back to the token.

Embeddings & Semantic Representations

How LLMs convert meaning into geometry, and why the quality of that geometry determines how well retrieval, clustering, and recommendation work.

Transformer Architecture, Attention & Positional Reasoning

How attention replaced recurrence, why every token can consult every other token, and the engineering consequences of that power.

Pretraining Objectives & Model Families

Why the training objective determines what a model becomes good at — and how to compare model families by design, not brand name.

Classification with LLMs

How to use LLMs as classifiers — and when a smaller, dedicated model is the smarter production choice.

Topic Modeling, Clustering & Theme Discovery

How to move from raw text to interpretable themes — and why cluster discovery is only the beginning.

Retrieval Foundations for LLM Systems

How to find the right evidence before the model generates — because a good LLM answer begins as a good retrieval problem.

Production RAG & Grounded Answering

Moving beyond naive retrieve-and-generate demos to build RAG systems that are grounded, attributable, and production-ready.

Prompting, In-Context Learning & Orchestration

Treating prompts as system design, not clever wording, to build reliable, maintainable, and evaluable LLM applications.

Multimodal Large Language Models

How LLMs extend beyond text to process images, audio, and video — and why alignment between modalities is the central engineering challenge.

Custom Embeddings & Retrieval Optimization

When off-the-shelf embeddings plateau — how to diagnose gaps, adapt representations, and systematically improve retrieval quality.

Fine-Tuning, PEFT & Adaptation Strategies

When to fine-tune, what method to choose, and how to avoid turning a strong foundation model into an expensive liability.

Optimization & Math Foundations

The mathematical tools that make transformers trainable — softmax, cross-entropy, gradients, and the structures that keep deep networks stable.

Text Generation, Decoding & Serving at Scale

How decoding strategies, serving infrastructure, and safety controls combine into a single operational system that determines what users actually experience.

Architectures, Extensions & Practical Deployment

From sparse expert routing to deployment governance — the decisions that separate a solid prototype from a durable production system.
