LoRA: Low-Rank Adaptation of Large Language Models - Summary
The paper proposes Low-Rank Adaptation (LoRA) as an approach to reduce the number of trainable parameters for downstream tasks in natural language processing. LoRA freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters compared to full fine-tuning.
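To make the mechanism concrete, here is a minimal PyTorch sketch of the idea: a frozen weight matrix W augmented with a trainable low-rank update scaled by alpha/r, so the effective weight is W + (alpha/r) * B A. This is an illustrative assumption, not the paper's reference implementation; the class name LoRALinear and the hyperparameter values are made up for the example.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W + (alpha/r) * B @ A."""
    def __init__(self, in_features: int, out_features: int, r: int = 4, alpha: int = 8):
        super().__init__()
        # Pre-trained weight stays frozen; only the low-rank factors are trained.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Following the paper: A starts as a small random Gaussian, B as zeros,
        # so the low-rank update is zero at the start of training.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = x W^T + scaling * x (B A)^T
        return x @ self.weight.T + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Only the two small factors contribute trainable parameters.
layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 768 = 12,288, versus 768 * 768 = 589,824 for the full matrix

For a 768x768 projection with rank r = 8, the trainable parameter count drops from about 590K to about 12K per layer, which is the source of the large savings the paper reports.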