Notes for Stanford CS229 Machine Learning

Stanford CS229 | Machine Learning | Building Large Language Models (LLMs)

What matters when training LLMs?

  • Architecture
  • Training algorithm/loss
  • Data
  • Evaluation
  • Systems
All LLMs are neural networks, so the first question is what architecture you are using. The training algorithm and loss determine how you actually train these models. Data is what you train the models on. Evaluation is how you know whether you are actually making progress toward the goal of LLMs. The systems component is how you actually make these models run efficiently on modern hardware.



This lecture will not focus much on the architecture or the training algorithm/loss.
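The five components above can be mapped onto a generic training loop. The sketch below is purely illustrative (the names `ToyModel`, `evaluate`, and the bigram count model are assumptions, not from the lecture): a toy bigram model stands in for the architecture, a count update stands in for the training algorithm, a token list is the data, and held-out negative log-likelihood is the evaluation. The systems component (running efficiently on real hardware) has no analogue in a toy this small.

```python
import math
import random

random.seed(0)

# 1) Architecture: a bigram count table standing in for a neural network.
class ToyModel:
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        # Laplace-smoothed counts so every transition has nonzero probability.
        self.counts = [[1] * vocab_size for _ in range(vocab_size)]

    def prob(self, prev, nxt):
        row = self.counts[prev]
        return row[nxt] / sum(row)

    # 2) Training algorithm/loss: here just a count update per observed pair.
    def update(self, prev, nxt):
        self.counts[prev][nxt] += 1

# 3) Data: a toy token stream over a 3-token vocabulary.
data = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]

# 4) Evaluation: average negative log-likelihood over consecutive pairs.
def evaluate(model, tokens):
    nll = [-math.log(model.prob(a, b)) for a, b in zip(tokens, tokens[1:])]
    return sum(nll) / len(nll)

model = ToyModel(vocab_size=3)
before = evaluate(model, data)
for a, b in zip(data, data[1:]):  # one training pass over the data
    model.update(a, b)
after = evaluate(model, data)

print(after < before)  # → True: training lowers the loss on this toy data
```

The point is only the decomposition: swapping the count table for a transformer, the count update for gradient descent on a loss, and the token list for web-scale text gives the real pipeline the lecture describes.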


Overview of the LLM

  • Pre-training
  • Post-training
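The pre-training stage above is typically next-token prediction: minimizing the cross-entropy of the model's predicted distribution against the actual next token at each position. A minimal sketch of that loss, with a hypothetical hand-picked distribution (the numbers in `probs` are made up for illustration):

```python
import math

def next_token_loss(probs, targets):
    """Average cross-entropy: -log p(target) averaged over positions."""
    return sum(-math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

# Two positions over a 3-token vocabulary; targets are the true next tokens.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
targets = [0, 1]

print(round(next_token_loss(probs, targets), 4))  # → 0.2899
```

A model that puts more probability mass on the correct next tokens gets a lower loss, which is exactly what pre-training optimizes at scale.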