Teaching Instructions
After pre-training, the model can complete text but doesn't yet know how to be a helpful assistant. Supervised Fine-Tuning (SFT) teaches it to follow instructions by training on high-quality (instruction, response) pairs.
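To make the training target concrete, here is a minimal sketch of how one (instruction, response) pair can become a training example, assuming a Hugging Face-style tokenizer. The role markers and the label-masking convention (-100 for ignored positions) are illustrative choices, not the only way to do it.

```python
# Minimal sketch: turning one (instruction, response) pair into an SFT example.
# Assumes a Hugging Face-style tokenizer with .encode() and .eos_token.

def build_sft_example(tokenizer, instruction, response):
    prompt_ids = tokenizer.encode(f"<|user|>\n{instruction}\n<|assistant|>\n")
    response_ids = tokenizer.encode(response + tokenizer.eos_token)

    input_ids = prompt_ids + response_ids
    # Prompt tokens are masked with -100 so the loss is computed only on the
    # response: the model learns to answer, not to echo the prompt.
    labels = [-100] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}
```

Only the response tokens contribute to the loss, which is what nudges the model toward answering rather than merely continuing the text.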
From Reader to Teacher
Pre-training is like reading every book in the library. SFT is like learning how to be a tutor — understanding questions, giving clear explanations, and being helpful rather than just reciting facts.
- SFT examples: 10K-100K
- Training: ~1-5 epochs
- Duration: hours to days
- Key insight: quality > quantity
Chat Format
<|system|>
You are a helpful assistant.
<|user|>
What is the capital of France?
<|assistant|>
The capital of France is Paris.
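In practice this formatting is usually produced by the model's chat template rather than hand-written strings. The sketch below uses the Hugging Face transformers API; the model name is only an example, and the exact special tokens the template emits differ from model to model.

```python
# Sketch: rendering the same conversation with a tokenizer's chat template.
from transformers import AutoTokenizer

# Example model; any instruct-tuned checkpoint with a chat template works.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
]

# The template inserts the model's own role markers, so training and
# inference see the same token layout.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```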
LoRA: Efficient Fine-Tuning
LoRA (Low-Rank Adaptation) trains only small adapter matrices instead of updating all model weights. This shrinks the set of trainable parameters to a fraction of a percent of the model, cutting gradient and optimizer memory by orders of magnitude and making fine-tuning feasible on consumer hardware.
Full Fine-Tuning
- Update all 70B parameters
- Requires 8×H100 GPUs
- ~500GB memory
- Expensive, slow

LoRA
- Train ~0.1% of parameters
- Single GPU possible
- ~16GB memory
- Fast, affordable
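As a rough illustration of the LoRA column, the sketch below attaches adapters with the peft library; the base model, rank, and target modules are example choices, not a recommendation.

```python
# Sketch: adding LoRA adapters to a causal LM with the peft library.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Example base model; a larger model would follow the same pattern.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

Only the adapter weights receive gradients, which is why the memory footprint drops so sharply compared with full fine-tuning.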
Key Takeaways
- SFT teaches models to follow instructions
- Quality of training data matters more than quantity
- LoRA enables efficient fine-tuning with minimal resources
- Chat templates standardize conversation format