The Power of Scale for Parameter-Efficient Prompt Tuning - Summary

arXiv URL: https://arxiv.org/abs/2104.08691

Authors: Brian Lester, Rami Al-Rfou, Noah Constant

Summary:

The paper explores prompt tuning, a mechanism for learning soft prompts through backpropagation to condition frozen language models for specific downstream tasks. The approach outperforms GPT-3's few-shot learning and, as model size grows, becomes competitive with full model tuning. Prompt tuning also improves robustness to domain transfer and enables efficient prompt ensembling.
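
For intuition, the sketch below shows the mechanism in rough form: a small matrix of learnable "soft prompt" embeddings is prepended to the embedded input of a frozen T5 model, and only those prompt parameters receive gradients during backpropagation. This is not the authors' code; the Hugging Face transformers usage, the t5-small checkpoint, the prompt length, and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch of soft prompt tuning: every pretrained weight stays frozen,
# and only soft_prompt (prompt_length x embed_dim parameters) is trained.
# Hyperparameters and the random prompt initialization are assumptions.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"      # the paper studies T5 models up to XXL
prompt_length = 20           # number of soft prompt tokens

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Freeze every pretrained parameter.
for param in model.parameters():
    param.requires_grad = False

embed_dim = model.config.d_model
soft_prompt = torch.nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.5)
optimizer = torch.optim.Adam([soft_prompt], lr=0.3)

def training_step(input_text: str, target_text: str) -> float:
    enc = tokenizer(input_text, return_tensors="pt")
    labels = tokenizer(target_text, return_tensors="pt").input_ids

    # Embed the input tokens, then prepend the learnable prompt vectors.
    token_embeds = model.get_input_embeddings()(enc.input_ids)  # (1, seq, d)
    inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)

    # Extend the attention mask to cover the prompt positions.
    prompt_mask = torch.ones(1, prompt_length, dtype=enc.attention_mask.dtype)
    attention_mask = torch.cat([prompt_mask, enc.attention_mask], dim=1)

    loss = model(inputs_embeds=inputs_embeds,
                 attention_mask=attention_mask,
                 labels=labels).loss
    loss.backward()          # gradients reach only soft_prompt
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The paper additionally finds that initializing the prompt from vocabulary or class-label embeddings and using an LM-adapted T5 checkpoint help; the sketch omits those details and keeps only the core idea that the trainable parameters are a tiny fraction of the frozen model.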

Key Insights & Learnings:

  • Prompt tuning outperforms GPT-3's few-shot learning by a large margin on SuperGLUE.
  • Prompt tuning becomes more competitive with scale: as the frozen model grows, its quality approaches that of full model tuning.
  • Prompt tuning improves robustness to domain transfer relative to full model tuning.
  • Prompt tuning enables efficient prompt ensembling, since many prompts can share a single frozen model (see the sketch after this list).
  • Prompt tuning is a simplification of the recently proposed prefix tuning, and this simpler method alone is sufficient to be competitive with model tuning.
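
Because each trained prompt adds only a few thousand parameters, many prompts for the same task can share one frozen model, which makes ensembling cheap: replicate an example once per prompt in a single batch and combine the predictions. The sketch below illustrates that batching idea with a majority vote; it reuses the assumed setup from the previous example, and the number of prompts, the decoding settings, and the voting rule are illustrative rather than the paper's exact procedure.

```python
# Sketch of prompt ensembling: several soft prompts (each trained separately)
# share a single frozen T5 model. One example is replicated once per prompt
# in a batch, and the decoded predictions are combined by majority vote.
# Shapes, names, and the random stand-in prompts are assumptions.
from collections import Counter

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"
prompt_length = 20
num_prompts = 5

tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).eval()
embed_dim = model.config.d_model

# Stand-ins for prompts that would each have been trained as in the sketch above.
soft_prompts = torch.randn(num_prompts, prompt_length, embed_dim)

def ensemble_predict(input_text: str) -> str:
    enc = tokenizer(input_text, return_tensors="pt")
    token_embeds = model.get_input_embeddings()(enc.input_ids)   # (1, seq, d)

    # One batch row per prompt: [prompt_i ; shared input embeddings].
    inputs_embeds = torch.cat(
        [soft_prompts, token_embeds.expand(num_prompts, -1, -1)], dim=1)
    prompt_mask = torch.ones(num_prompts, prompt_length, dtype=torch.long)
    attention_mask = torch.cat(
        [prompt_mask, enc.attention_mask.expand(num_prompts, -1)], dim=1)

    with torch.no_grad():
        encoder_outputs = model.get_encoder()(
            inputs_embeds=inputs_embeds, attention_mask=attention_mask)
        out_ids = model.generate(encoder_outputs=encoder_outputs,
                                 attention_mask=attention_mask,
                                 max_new_tokens=8)
    predictions = tokenizer.batch_decode(out_ids, skip_special_tokens=True)

    # Majority vote across the prompt-specific predictions.
    return Counter(predictions).most_common(1)[0][0]
```

The key point is that the whole ensemble fits in one forward pass over a single frozen model, rather than requiring a separate fine-tuned copy of the model per ensemble member.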


Terms Mentioned: prompt tuning, soft prompts, downstream tasks, backpropagation, model tuning, pretrained models, ELMo, GPT, BERT, priming, SuperGLUE, prefix tuning, masked language model

Technologies / Libraries Mentioned: T5