Introduction to Large Language Models


A comprehensive deep dive into the Large Language Model (LLM) technology that powers ChatGPT and related products. This course covers the full training stack behind these models, mental models for thinking about their "psychology", and how to get the most out of them in practical applications. Taught by Andrej Karpathy, founding member of OpenAI and former Sr. Director of AI at Tesla.
Instructor

Andrej Karpathy
AI researcher, educator, and former Sr. Director of AI at Tesla. Known for his influential work in deep learning and neural networks, and for his educational content, including the popular "Neural Networks: Zero to Hero" course series.
Course details
3 hours 21 minutes
video
Not included
Free
What you'll learn
Understand how LLMs are pretrained on internet data
Learn about tokenization and neural network architecture
Explore GPT and Llama model internals
Understand post-training and supervised finetuning
Prerequisites
Basic understanding of programming concepts
Familiarity with Python (helpful but not required)
Interest in AI and machine learning
Curriculum
Intro to the growing LLM ecosystem
ChatGPT interaction under the hood
Basic LLM interaction examples
Be aware of the model you're using, pricing tiers
Thinking models and when to use them
Tool use: internet search
Tool use: deep research
File uploads, adding documents to context
Tool use: Python interpreter, messiness of the ecosystem
ChatGPT Advanced Data Analysis, figures, plots
Claude Artifacts, apps, diagrams
Cursor: Composer, writing code
Audio (Speech) Input/Output
Advanced Voice Mode aka true audio inside the model
NotebookLM, podcast generation
Image input, OCR
Image output, DALL-E, Ideogram, etc.
Video input, point and talk on app
Video output, Sora, Veo 2, etc.
ChatGPT memory, custom instructions
Custom GPTs
Summary