
Parameters

Mixtral of Experts - Summary

The paper introduces Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model that outperforms existing models like Llama 2 70B and GPT-3.5 on various benchmarks. It uses a routing network to select two experts per token, giving the model access to 47B parameters while actively using only 13B per token, which keeps inference cost low.
The Quill 09 Jan 2024
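
For intuition, here is a minimal sketch of the top-2 routing idea described in the summary, assuming a toy PyTorch configuration (4 experts, small dimensions). It is illustrative only, not the paper's implementation.

```python
# Minimal sketch of top-2 expert routing (toy config, not Mixtral's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Top2MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Routing network: one linear layer producing a score per expert.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Route each token to its top-k experts.
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # (tokens, top_k)
        weights = F.softmax(weights, dim=-1)            # normalize over the chosen experts

        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out


tokens = torch.randn(8, 64)            # 8 tokens, d_model = 64
layer = Top2MoELayer(d_model=64, d_ff=256)
print(layer(tokens).shape)             # torch.Size([8, 64])
```

Only the selected experts run for each token, which is why the active parameter count stays far below the total parameter count.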

Language Models are Few-Shot Learners - Summary

The paper discusses the limitations of pre-trained language representations in NLP systems and the need for task-specific datasets and fine-tuning. The authors show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Rohit Agarwal 15 Apr 2023
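
To make "few-shot" concrete, here is a minimal sketch of specifying a task purely through in-context demonstrations, with no gradient updates or task-specific fine-tuning. The example reviews and labels are hypothetical.

```python
# Minimal sketch of few-shot prompting: labeled demonstrations in the prompt,
# followed by the unlabeled query. Examples below are made up for illustration.
few_shot_examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
    ("A perfectly fine, forgettable afternoon watch.", "neutral"),
]


def build_prompt(query: str) -> str:
    """Concatenate the demonstrations, then append the query for the model to complete."""
    lines = ["Classify the sentiment of each review."]
    for text, label in few_shot_examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)


print(build_prompt("The plot dragged, but the acting was superb."))
```

The model is expected to infer the task from the demonstrations alone, which is what the paper contrasts with fine-tuning on a task-specific dataset.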
