| Date | Topics | Video duration |
|---|---|---|
| September 26th, 2025 | [slides] Lecture 1: Transformer • Background on NLP and tasks • Tokenization • Embeddings • Word2vec, RNN, LSTM • Attention mechanism • Transformer architecture | 1:41:58 |
| October 3rd, 2025 | [slides] Lecture 2: Transformer-based models & tricks • Attention approximation • MHA, MQA, GQA • Position embeddings (regular, learned) • RoPE and applications • Transformer-based architectures • BERT and its derivatives | 1:47:19 |
| October 10th, 2025 | [slides] Lecture 3: Large Language Models • Definition and architecture • Mixture of experts • Context length, temperature • Sampling strategies • Prompting, in-context learning • Chain of thought • Self-consistency | 1:48:44 |
| October 17th, 2025 | [slides] Lecture 4: LLM training • Pretraining • Quantization • Hardware optimization • Supervised finetuning (SFT) • Parameter-efficient finetuning (LoRA) | 1:47:27 |
| October 24th, 2025 | Midterm [exam] [solutions] | |
| October 31st, 2025 | [slides] Lecture 5: LLM tuning • Preference tuning • RLHF overview • Reward modeling • RL approaches (PPO and variants) • DPO | 1:47:42 |
| November 7th, 2025 | [slides] Lecture 6: LLM reasoning • Reasoning models • RL for reasoning • GRPO • Scaling | 1:47:10 |
| November 14th, 2025 | [slides] Lecture 7: Agentic LLMs • Retrieval-augmented generation • Advanced RAG techniques • Function calling • Agents • ReAct framework | 1:49:23 |
| November 21st, 2025 | [slides] Lecture 8: LLM evaluation • LLM-as-a-judge overview • Best practices and benefits • Biases and pitfalls | 1:49:25 |
| December 5th, 2025 | [slides] Lecture 9: Current trends • Recap • Trending topics • Closing thoughts | 1:51:31 |
| December 10th, 2025 | Final [exam] [solutions] | |
CME 295