Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding - Summary

Arxiv URL: https://arxiv.org/abs/2307.15337

Authors: Xuefei Ning, Zinan Lin, Zixuan Zhou, Huazhong Yang, Yu Wang

Summary:

This paper introduces Skeleton-of-Thought (SoT), a method for reducing the generation latency of large language models (LLMs). SoT first guides the LLM to generate a concise skeleton of the answer, then fills in the content of each skeleton point in parallel, via parallel API calls or batched decoding. The method achieves considerable speed-ups and also shows potential for improving answer quality.
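
To make the two-stage pipeline concrete, the sketch below shows the parallel-API-call variant in Python. The `complete` function is a hypothetical stand-in for any single-call LLM client, and the prompt wording is illustrative rather than the paper's exact templates; a sketch of the batched-decoding alternative for open-source models follows the key-insights list.

```python
from concurrent.futures import ThreadPoolExecutor


def complete(prompt: str) -> str:
    """Hypothetical single-call LLM client; swap in any real API wrapper."""
    raise NotImplementedError


def skeleton_of_thought(question: str, max_points: int = 5) -> str:
    # Stage 1: ask the model for a short skeleton of numbered points.
    skeleton = complete(
        f"Answer the question with a skeleton of at most {max_points} "
        f"concise numbered points (3-5 words each).\nQuestion: {question}"
    )
    points = [line.strip() for line in skeleton.splitlines() if line.strip()]

    # Stage 2: expand every point independently and concurrently, so the
    # wall-clock cost is roughly one API call rather than one per point.
    def expand(point: str) -> str:
        return complete(
            f"Question: {question}\nSkeleton:\n{skeleton}\n"
            f"Write 1-2 sentences expanding only this point: {point}"
        )

    with ThreadPoolExecutor(max_workers=max(1, len(points))) as pool:
        expansions = list(pool.map(expand, points))

    return "\n".join(expansions)
```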

Key Insights & Learnings:

  • The token-by-token sequential decoding used by state-of-the-art LLMs is a major cause of their high generation latency.
  • SoT accelerates the generation process by producing different parts of answers in parallel.
  • SoT can potentially improve answer quality in terms of diversity and relevance.
  • SoT is inspired by how humans think and write: outlining an answer first, then filling in the details of each point.
  • SoT opens up possibilities for further research on optimizing LLMs' thinking process.
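
For open-source models served locally, the paper's stage-2 alternative to parallel API calls is batched decoding: the expansion prompts for all skeleton points are decoded together in one batch, so each decoding step advances every point at once. Below is a minimal sketch, assuming the Hugging Face transformers library; the model name and prompts are placeholders, not the paper's setup.

```python
# Batched-decoding sketch (assumes Hugging Face transformers;
# "gpt2" is a placeholder model, prompts are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 defines no pad token
tokenizer.padding_side = "left"            # left-pad decoder-only models for generation

points = ["1. Define the problem.", "2. Outline the approach."]
prompts = [f"Expand this point in 1-2 sentences: {p}" for p in points]

# All expansion prompts share each decoding step as one batch, so total
# latency is close to that of generating a single (longest) expansion.
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
outputs = model.generate(
    **inputs, max_new_tokens=60, pad_token_id=tokenizer.eos_token_id
)
expansions = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print("\n".join(expansions))
```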


Terms Mentioned: large language models, generation latency, sequential decoding, Skeleton-of-Thought, API calls, batched decoding, answer quality, diversity, relevance, thinking process, human-inspired, optimization

Technologies / Libraries Mentioned: