Fine-tuning

Prompt engineering vs. fine-tuning: What’s better for your use case?

Discover the key differences between prompt engineering and model fine-tuning. Learn when to use each approach, how to measure effectiveness, and which tools are best for optimizing LLM performance.

OpenAI - Fine-tune GPT-4o with images and text

OpenAI’s latest update marks a significant leap in AI capabilities by introducing vision to the fine-tuning API. This update enables developers to fine-tune models that can process and understand visual and textual data, opening up new possibilities for multimodal applications. With AI models now able to "

⭐️ Implementing FrugalGPT: Reducing LLM Costs & Improving Performance

FrugalGPT is a framework proposed by Lingjiao Chen, Matei Zaharia, and James Zou from Stanford University in their 2023 paper "FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance". The paper outlines strategies for more cost-effective and performant usage of large language model

Mixtral of Experts - Summary

The paper introduces Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model that outperforms existing models like Llama 2 70B and GPT-3.5 on various benchmarks. It uses a routing network to select two experts per token, allowing access to 47B parameters while actively using only 13B, enhan

Anyscale's OSS Models + Portkey's Ops Stack

The landscape of AI development is rapidly evolving, and open-source Large Language Models (LLMs) have emerged as a key foundation for building AI applications. Anyscale has been a game-changer here with their fast and cheap APIs for Llama2, Mistral, and more OSS models. But to harness the full

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes - Summary

The paper introduces a new mechanism called Distilling step-by-step that trains smaller models to outperform larger language models (LLMs) while using less training data and smaller model sizes. The mechanism extracts LLM rationales as additional supervision for small models within a multi-task tra

Just Tell Me: Prompt Engineering in Business Process Management - Summary

The paper discusses how prompt engineering can leverage pre-trained language models for business process management (BPM) tasks. It identifies the potential and the challenges of prompt engineering for BPM research.