Fine-Tuning using PEFT


Overview

Practical fine-tuning scripts using Hugging Face PEFT with QLoRA and bitsandbytes 4-bit quantisation. Covers both decoder-only (causal LM) and encoder-only models. Useful as a reference for parameter-efficient adaptation of large pre-trained models on consumer hardware.
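The setup described above typically revolves around two config objects. The sketch below is illustrative, not the repo's actual code; the hyperparameter values (rank 16, alpha 32, dropout 0.05) are assumptions:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# QLoRA quantisation config: 4-bit NF4 base weights with double quantisation.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter config: low-rank adapters on the attention Q/V projections.
# r, lora_alpha, and lora_dropout here are illustrative defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Usage (requires a CUDA GPU with bitsandbytes installed):
# model = AutoModelForCausalLM.from_pretrained(name, quantization_config=bnb_config)
# model = get_peft_model(prepare_model_for_kbit_training(model), lora_config)
```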

What’s Included

  • QLoRA + bitsandbytes 4-bit quantisation — load large models on limited VRAM
  • PEFT LoRA adapter injection — target Q/V projections, configurable rank and alpha
  • Separate script for encoder-only models — e.g. BERT-class architectures
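Adapter injection on the Q/V projections can be illustrated from scratch in plain PyTorch. This is a minimal sketch of the LoRA mechanism, not the PEFT library's internals; the layer sizes and init scale are arbitrary:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update: y = Wx + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        self.scale = alpha / r                       # configurable rank/alpha scaling

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Wrap a stand-in Q projection, analogous to target_modules=["q_proj", "v_proj"] in PEFT.
q_proj = LoRALinear(nn.Linear(64, 64, bias=False), r=8, alpha=16)
y = q_proj(torch.randn(2, 64))
trainable = sum(p.numel() for p in q_proj.parameters() if p.requires_grad)
print(tuple(y.shape), trainable)  # adapters add 2 * 64 * 8 = 1024 trainable params
```

Because B is zero-initialised, the wrapped layer starts out computing exactly the frozen base projection; training moves only the 1,024 adapter parameters rather than the 4,096 frozen ones.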

Key Concepts

  • 4-bit NormalFloat (NF4) quantisation for base weights
  • Double quantisation to further reduce memory
  • LoRA adapters kept in higher precision (e.g. bf16) and trained on top of the frozen quantised base

Papers

  • LoRA — Hu et al., 2022
  • QLoRA — Dettmers et al., 2023