The Power of Scale for Parameter-Efficient Prompt Tuning - Summary
Arxiv URL: https://arxiv.org/abs/2104.08691
Authors: Brian Lester, Rami Al-Rfou, Noah Constant
Summary:
The paper explores prompt tuning, a mechanism for learning "soft prompts" through backpropagation to condition a frozen pretrained language model to perform specific downstream tasks. The approach outperforms GPT-3's few-shot learning and becomes more competitive with scale. Prompt tuning also confers benefits in robustness to domain transfer and enables efficient prompt ensembling.
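The core mechanism can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (which builds on T5); all names, sizes, and the stand-in embedding table are illustrative. The idea: a small matrix of soft-prompt embeddings is the only trainable parameter, and it is prepended to the frozen model's input embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 16    # embedding size of the (frozen) language model
PROMPT_LEN = 5    # number of tunable soft-prompt tokens
VOCAB_SIZE = 100

# Frozen piece: the pretrained embedding table stands in for the model here.
frozen_embedding_table = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

# The only trainable parameters: one learned vector per soft-prompt position.
# In training, gradients flow into this matrix alone; model weights stay fixed.
soft_prompt = rng.normal(size=(PROMPT_LEN, EMBED_DIM)) * 0.01

def embed_with_prompt(token_ids):
    """Prepend the soft prompt to the embedded input tokens."""
    token_embeds = frozen_embedding_table[token_ids]        # (seq, dim)
    return np.concatenate([soft_prompt, token_embeds], axis=0)

token_ids = np.array([3, 17, 42])
inputs = embed_with_prompt(token_ids)
print(inputs.shape)  # (PROMPT_LEN + 3, EMBED_DIM) -> (8, 16)
```

Because the soft prompt lives in embedding space rather than vocabulary space, it need not correspond to any discrete token sequence, which is what distinguishes it from manually written ("hard") prompts.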
Key Insights & Learnings:
- Prompt tuning outperforms GPT-3's few-shot learning by a large margin.
- Prompt tuning becomes more competitive with scale.
- Prompt tuning confers benefits in robustness to domain transfer.
- Prompt tuning enables efficient prompt ensembling.
- Prompt tuning can be seen as a simplification of the recently proposed prefix tuning: freezing the entire pretrained model and tuning only the input prompt is sufficient to be competitive with full model tuning.
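The ensembling insight above follows from the parameter efficiency: several separately tuned prompts can share one frozen model, so an ensemble costs one model copy plus a few small prompt matrices. A hedged NumPy sketch (the pooling "model" and all names are illustrative stand-ins, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)
EMBED_DIM, PROMPT_LEN, NUM_CLASSES = 16, 5, 3

# Four independently tuned soft prompts for the same task.
prompts = [rng.normal(size=(PROMPT_LEN, EMBED_DIM)) for _ in range(4)]

# Frozen stand-in for the shared model: mean-pool, then a fixed projection.
W_frozen = rng.normal(size=(EMBED_DIM, NUM_CLASSES))

def frozen_model(prompted_input):
    pooled = prompted_input.mean(axis=0)
    return pooled @ W_frozen

# One input, run once per prompt; predictions are averaged into an ensemble.
token_embeds = rng.normal(size=(7, EMBED_DIM))
logits = [frozen_model(np.concatenate([p, token_embeds])) for p in prompts]
ensemble_logits = np.mean(logits, axis=0)
prediction = int(np.argmax(ensemble_logits))
```

In practice the per-prompt forward passes can even share one batch through the frozen model, since only the prepended prompt rows differ between ensemble members.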
Terms Mentioned: prompt tuning, soft prompts, downstream tasks, backpropagation, model tuning, pretrained models, ELMo, GPT, BERT, priming, SuperGLUE, prefix tuning, masked language model
Technologies / Libraries Mentioned: T5