Low-Rank Adaptation (LoRA)
LoRA is a technique for efficiently fine-tuning (i.e., adapting to specific tasks) pre-trained large language models (LLMs).
Instead of updating all the parameters of a large model during fine-tuning, LoRA freezes the original weights and introduces small low-rank matrices whose product is added to them. Because only these low-rank matrices are trained, large models can be adapted to new tasks with far fewer trainable parameters, fewer computational resources, and faster training times.
The key idea behind LoRA is to decompose the weight update into a product of two low-rank matrices (ΔW = BA), which can capture the necessary adaptation at a small fraction of the parameter count of the full weight matrix.
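The decomposition above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the layer sizes, rank, and scaling value below are arbitrary choices made for the example, and the variable names (W, A, B, alpha) are just illustrative labels for the frozen weight, the two low-rank factors, and the scaling hyperparameter.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 64, 64, 4  # example layer sizes; rank r is much smaller than d

# Frozen pretrained weight: never updated during fine-tuning.
W = rng.standard_normal((d_in, d_out))

# Trainable low-rank factors. Following the common LoRA initialization,
# one factor starts random and the other starts at zero, so the adapted
# model is initially identical to the pretrained model.
A = rng.standard_normal((d_in, r)) * 0.01  # down-projection, random init
B = np.zeros((r, d_out))                   # up-projection, zero init

alpha = 8  # scaling hyperparameter; the effective update is (alpha / r) * A @ B

def forward(x):
    # Base path (frozen weights) plus the low-rank update path.
    return x @ W + (alpha / r) * (x @ A @ B)

x = rng.standard_normal((2, d_in))

# With B = 0 the update path contributes nothing, so the output
# matches the pretrained model exactly.
assert np.allclose(forward(x), x @ W)

# Parameter comparison: updating W directly would train d_in * d_out
# values, while the low-rank factors together hold only r * (d_in + d_out).
print(W.size)           # 4096 parameters in the full weight matrix
print(A.size + B.size)  # 512 trainable parameters in the LoRA factors
```

During training only A and B receive gradients; at deployment time the product (alpha / r) * A @ B can be merged into W, so the adapted model incurs no extra inference cost.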