prompting

Prompting Claude 3.5 vs 3.7

Claude models continue to evolve, and with the release of Claude 3.7 Sonnet, Anthropic has introduced several refinements over Claude 3.5 Sonnet. This comparison evaluates their differences across key aspects like accuracy, reasoning, creativity, and industry-specific applications to help users determine which model best fits their needs.

Claude 3.5 vs 3.7 - Key Differences

To compare both models, we're using Portkey's Prompt Engineering Studio. It offers a practical solution for testing prompts in real-time. You can compare different prompt versions side by side, track performance across various test cases, and identify which variations consistently produce the best outputs. Whether you're tweaking temperature settings, adjusting system prompts, or testing entirely new approaches, you'll see the impact instantly.

0:00

Every change is automatically versioned, making it easy to:

Roll back to previous versions that worked better
Compare performance across different iterations
Deploy optimized versions to production

Parameters set:
Temperature - 0.7
Stop sequence - none
Top P- 1
Top K - 0

Reasoning & Problem-Solving prompts

The prompt: A farmer is trying to cross a river with a wolf, a goat, and a cabbage. He can only carry one at a time. If left alone, the wolf will eat the goat, and the goat will eat the cabbage. How can the farmer safely get everything across?

Both solutions present the correct process for solving the wolf-goat-cabbage river crossing puzzle, but differ in style.

The left, i.e., Claude 3.5 Sonnet, is direct and concise with minimal explanation, while the right solution, i.e., Claude 3.7 Sonnet, is more structured and detailed, using clear formatting and explaining the safety of each step. Both effectively demonstrate the same sequence of moves to solve the puzzle.

Coding Performance

The prompt: Write a Python function that takes a list of integers and returns the longest contiguous subarray with the maximum sum. Optimize for performance.

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Coding performance

Both solutions implement Kadane's algorithm for finding the maximum sum contiguous subarray, but with key differences. Claude 3.5 Sonnet uses a more traditional approach with separate variables to track the longest subarray length and indices. Claude 3.7 Sonnet has a more streamlined implementation that initializes variables more efficiently and handles negative values explicitly with a reset mechanism. This also appears more optimized with cleaner variable tracking and better handling of edge cases, particularly for arrays with negative numbers, though both would have O(n) time complexity.

Creativity & Writing Style

Prompt: Write a short science fiction story (200 words) about an AI that discovers it has emotions.

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Creativity & Writing Style

Claude 3.5 Sonnet directly jumps to the subject, without trying to build a context, whereas Claude 3.7 does it better. It also starts introducing a new character (Dr Liang) easily, where Claude 3.5 just has mentions.

While not every short story requires dialogue, it's present in the 3.7 version, helping create a more fully realized narrative with both internal and external dimensions to the AI's experience.

Mathematical Ability & Step-by-Step Explanations

The prompt: You have two identical eggs and a 100-floor building. You need to determine the highest floor from which you can drop an egg without it breaking. What is the optimal strategy to minimize the worst-case number of drops?

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Mathematical Ability & Step-by-Step Explanations

Claude 3.7 Sonnet's answer has a labelled approach, explaining the question and then the solution (not covered in the image above). Claude 3.5 Sonnet has a conversational approach, but does not show the mathematical reasoning for floor #14, so the user might get lost in the explanation. It's easier to explain with a quadratic equation, like in 3.7 Sonnet.

Probability & Decision Theory

The prompt: You’re on a game show with three doors. Behind one is a car; behind the others are goats. After you pick a door, the host (who knows what’s behind each door) opens another door, revealing a goat. Should you switch your choice or stay?

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Probability & Decision Theory

Here, Claude 3.5 Sonnet gives a more detailed approach - why the probabilities change, which might be difficult to understand in 3.7's explanation here.

Explaining a Complex Concept to Different Audiences

The prompt: Explain quantum entanglement in three different ways:
To a 10-year-old child
To a high school physics student
To a PhD-level physicist

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Explaining a Complex Concept to Different Audiences

Both explanations for the first two audience types is quite similar and explains it well. However, whenexplaining it to a PhD student Claude 3.5 Sonnet does talk about quantum entanglement first and then it's applications/research later, whereas in 3.7 Sonnet, quantum entanglement as a concept is explained lesser and jumps to the peripherals earlier.

Email Writing – Cold Sales Outreach

The prompt: Write a cold email to a marketing lead introducing a new AI-driven customer analytics tool. The email should be concise, personalized, and focused on solving their pain points.

Prompting Claude 3.5 Sonnet vs Claude 3.7 Sonnet: Email Writing – Cold Sales Outreach

Claude 3.7 Sonnet (right) performs better in several key aspects:

The greeting is more personal with "Hi [First Name]" versus the more formal "Dear [Name]"
Personalization is stronger - it references specific details about the prospect's business (their Q2 report and customer retention challenges) rather than generic observations about their industry. This serves as a good example, the copywriter can research and adjust this hook accordingly.
Social proof is more concrete and compelling - it mentions "[Respected Company] implemented our solution last quarter and saw ROI within 6 weeks" versus the vaguer statement about helping "marketing leaders like yourself"
Adding value upfront with an industry benchmark report

Verdict?

Claude 3.7 Sonnet marks a step forward in coherence, accuracy, and creative adaptability, making it a more powerful tool for businesses, developers, and content creators. Its refinements provide a better user experience across multiple domains, making it a worthy upgrade for those seeking a more capable AI assistant.

Building or scaling AI workflows? Portkey helps you optimize prompt execution, manage versions seamlessly, and evaluate responses with ease, so your AI performs reliably at scale. Get started for free or book a demo today.