# Fine-Tuning Using PEFT

## Overview
Practical fine-tuning scripts using HuggingFace PEFT with QLoRA and BitsAndBytes 4-bit quantisation. Covers both decoder-only (causal LM) and encoder-type models. Useful as a reference for parameter-efficient adaptation of large pre-trained models on consumer hardware.
## What’s Included
- QLoRA + BitsAndBytes 4-bit quantisation — load large models on limited VRAM
- PEFT LoRA adapter injection — targets the Q/V attention projections, with configurable rank and alpha
- Separate script for encoder-type models — e.g. BERT-class architectures
## Key Concepts
- 4-bit NormalFloat (NF4) quantisation for base weights
- Double quantisation to further reduce memory
- LoRA adapters trained in full precision on top of the quantised base weights