## Overview
LoRA (Low-Rank Adaptation) fine-tunes large models by freezing the pretrained weights and training only a small number of additional low-rank parameters. QLoRA combines LoRA with 4-bit quantization of the frozen base model to reduce memory requirements even further.
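Concretely, LoRA keeps a pretrained weight matrix `W` frozen and learns a low-rank update `ΔW = B·A` that is added to it, scaled by `alpha / r`. A minimal sketch of the idea in plain PyTorch (hypothetical shapes, not the `peft` implementation):

```python
import torch

d_out, d_in, r, alpha = 4096, 4096, 16, 32          # hypothetical layer / LoRA sizes
W = torch.randn(d_out, d_in)                        # frozen pretrained weight (no gradient)
A = (torch.randn(r, d_in) * 0.01).requires_grad_()  # trainable, small random init
B = torch.zeros(d_out, r, requires_grad=True)       # trainable, zero init so the update starts at 0

x = torch.randn(d_in)
h = W @ x + (alpha / r) * (B @ (A @ x))             # forward pass with the low-rank update applied
```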
## Requirements

```bash
pip install transformers peft bitsandbytes accelerate datasets
```
## Loading a Model with QLoRA

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit quantization settings: NF4 weights, bf16 compute, nested (double) quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
```
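One practical note: Mistral's tokenizer does not define a padding token. If your training pipeline pads batches (as the dataset sketch further down does), reusing the EOS token is a common workaround. Continuing from the block above:

```python
# Reuse the end-of-sequence token for padding so batched tokenization works.
tokenizer.pad_token = tokenizer.eos_token
```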
## Configuring LoRA

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Freezes the base weights, upcasts norms to fp32, and prepares the model for k-bit training
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,           # Rank of the low-rank update matrices
    lora_alpha=32,  # Scaling factor (effective scale is alpha / r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# e.g. trainable params: 13,631,488 || all params: 3,765,702,656 || trainable%: 0.36
```
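Module names differ across architectures, so if you are unsure what to pass as `target_modules`, it can help to list the linear layers of the loaded model first. A small sketch (run it on the base model before calling `get_peft_model`, otherwise the LoRA wrapper modules show up as well; bitsandbytes' 4-bit layers subclass `nn.Linear`, so they are included):

```python
import torch.nn as nn

# Print the distinct linear-layer names, e.g. q_proj, k_proj, gate_proj, lm_head, ...
print(sorted({name.split(".")[-1]
              for name, module in model.named_modules()
              if isinstance(module, nn.Linear)}))
```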
## Training
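The `Trainer` below expects `train_dataset` to already be tokenized and to carry a `labels` field. A minimal sketch of how one might build it (the file name `train.jsonl` and the `"text"` column are placeholders; reuses `tokenizer` from above, with the pad token set as noted earlier):

```python
from datasets import load_dataset

# Hypothetical JSONL file with a "text" field; swap in your own data source.
raw = load_dataset("json", data_files="train.jsonl", split="train")

def tokenize(batch):
    tokens = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)
    # For causal LM training the labels are the input ids themselves.
    # (For simplicity, loss is also computed on padding tokens here; masking
    # them with -100 is a common refinement.)
    tokens["labels"] = [ids.copy() for ids in tokens["input_ids"]]
    return tokens

train_dataset = raw.map(tokenize, batched=True, remove_columns=raw.column_names)
```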
```python
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="./lora-mistral",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size of 16
    learning_rate=2e-4,
    bf16=True,                      # match bnb_4bit_compute_dtype=torch.bfloat16
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```
## Merging and Saving

```python
# Save only the LoRA adapter weights (a small file, separate from the base model)
model.save_pretrained("./lora-adapter")

# Merge the adapter into the base weights for standalone inference.
# Note: merging into a 4-bit quantized model is lossy; see the pattern below.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged-model")
```
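Because the base model here is 4-bit quantized, merging in place dequantizes the weights and can cost some accuracy. A common alternative (a sketch, assuming the adapter saved to `./lora-adapter` above) is to reload the base model in bf16 and merge the adapter into that copy:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Reload the base model in bf16 (needs enough memory for ~14 GB of full-precision weights)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the trained adapter, fold it into the base weights, and save a standalone model
merged = PeftModel.from_pretrained(base, "./lora-adapter").merge_and_unload()
merged.save_pretrained("./merged-model")
```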
## Memory Requirements

Approximate GPU memory needed to fine-tune each model size:

| Model Size | Full Fine-tune | LoRA | QLoRA |
|---|---|---|---|
| 7B | 56 GB | 16 GB | 6 GB |
| 13B | 104 GB | 32 GB | 10 GB |
| 70B | 560 GB | 160 GB | 48 GB |
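As a rough sanity check on the QLoRA column: 4-bit weights take about 0.5 bytes per parameter, so a 7B model needs roughly 3.5 GB just for weights (70B ≈ 35 GB); the rest of each estimate covers quantization constants, the LoRA adapter with its optimizer states, and activations, which grow with batch size and sequence length.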