Reinforcement Learning

SLiC-HF: Sequence Likelihood Calibration with Human Feedback - Summary

The paper presents SLiC-HF, an approach that uses Sequence Likelihood Calibration with Human Feedback to improve language models. It is shown to be effective on the TL;DR summarization task and is a simpler, more computationally efficient alternative to Reinforcement Learning

CAMEL: Communicative Agents for "Mind" Exploration of LLMs - Summary

The paper proposes a communicative agent framework named role-playing, which facilitates autonomous cooperation among agents and provides insight into their “cognitive” processes. The approach uses inception prompting to guide chat agents toward task completion while maintai