SLiC-HF: Sequence Likelihood Calibration with Human Feedback - Summary
The paper presents a new approach called SLiC-HF that uses Sequence Likelihood Calibration with Human Feedback to improve language models. The approach is shown to be effective on the TL;DR summarization task and is a simpler and more computationally efficient alternative to Reinforcement Learning