Fine-Tuning using PEFT


Overview

Practical fine-tuning scripts using Hugging Face PEFT with QLoRA and bitsandbytes 4-bit quantisation. Covers both decoder-only (causal LM) and encoder-only models. Useful as a reference for parameter-efficient adaptation of large pre-trained models on consumer hardware.
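The setup described above typically revolves around two config objects. The sketch below is illustrative, not the repo's actual code; the hyperparameter values (rank 16, alpha 32, dropout 0.05) are assumptions:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# QLoRA quantisation config: 4-bit NF4 base weights with double quantisation.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter config: low-rank adapters on the attention Q/V projections.
# r, lora_alpha, and lora_dropout here are illustrative defaults.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Usage (requires a CUDA GPU with bitsandbytes installed):
# model = AutoModelForCausalLM.from_pretrained(name, quantization_config=bnb_config)
# model = get_peft_model(prepare_model_for_kbit_training(model), lora_config)
```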

What’s Included

  • QLoRA + bitsandbytes 4-bit quantisation — load large models on limited VRAM
  • PEFT LoRA adapter injection — target Q/V projections, configurable rank and alpha
  • Separate script for encoder-only models — e.g. BERT-class architectures
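Adapter injection on the Q/V projections can be illustrated from scratch in plain PyTorch. This is a minimal sketch of the LoRA mechanism, not the PEFT library's internals; the layer sizes and init scale are arbitrary:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update: y = Wx + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)      # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no-op at start
        self.scale = alpha / r                       # configurable rank/alpha scaling

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Wrap a stand-in Q projection, analogous to target_modules=["q_proj", "v_proj"] in PEFT.
q_proj = LoRALinear(nn.Linear(64, 64, bias=False), r=8, alpha=16)
y = q_proj(torch.randn(2, 64))
trainable = sum(p.numel() for p in q_proj.parameters() if p.requires_grad)
print(tuple(y.shape), trainable)  # adapters add 2 * 64 * 8 = 1024 trainable params
```

Because B is zero-initialised, the wrapped layer starts out computing exactly the frozen base projection; training moves only the 1,024 adapter parameters rather than the 4,096 frozen ones.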

Key Concepts

  • 4-bit NormalFloat (NF4) quantisation for base weights
  • Double quantisation to further reduce memory
  • LoRA adapters kept in higher precision (e.g. bf16) and trained on top of the frozen quantised base

Papers

  • LoRA — Hu et al., 2022
  • QLoRA — Dettmers et al., 2023