Updated June 25, 2024 to reflect the latest model releases.
When Anthropic announced Claude 3, then the newest model in the Claude family, speculation broke out in typical AI fashion over how Claude 3 Opus compares to GPT-4. A few months later, the newest Claude model is Claude 3.5 Sonnet, which competes with OpenAI's newest model, GPT-4o. Benchmarks show Claude to be a serious competitor; however, there are other factors to consider, such as price and functionality at both the service and model level.
Note: we've previously touched on Claude vs OpenAI in our Amazon Bedrock vs Azure OpenAI blog, but this post goes more in-depth on the Claude and OpenAI GPT models specifically.
Anthropic Claude
In 2023, Claude made waves as the first model to offer a 100K context window, at a time when other prominent models topped out around 32K. However, it drew flak for heavy-handed content moderation, an issue Anthropic prioritized for Claude 3. The response to Claude 3 has been extremely positive: users have noticed significant improvements over the earlier content moderation, and benchmarks show Claude 3.5 Sonnet to be one of the top-performing language models across a variety of tasks. Claude models particularly excel at writing, summarization, and coding.
One way to access the Claude models, and the one we focus on in this blog, is through Amazon Bedrock. Bedrock is a marketplace of API models from many different providers, including Anthropic's Claude, where Amazon layers on additional APIs and security.
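As a sketch of what a Bedrock call looks like, assuming the boto3 SDK, the US East region, and the Claude 3.5 Sonnet model ID current at the time of writing (check the Bedrock console for the model IDs available to your account):

```python
import json

# Model ID and region are assumptions; verify them in your Bedrock console.
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

def build_request(prompt: str, max_tokens: int = 512) -> str:
    """Build the Anthropic Messages API body that Bedrock expects."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def invoke(prompt: str) -> str:
    """Send a prompt to Claude via Bedrock (requires AWS credentials)."""
    import boto3
    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(modelId=MODEL_ID, body=build_request(prompt))
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```

Amazon charges per input and output token here just as with Anthropic's own API; the pricing tables later in this post apply to exactly these calls.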
Claude Models
Model | Stated Use Cases | Max Tokens |
---|---|---|
Claude Instant | Casual dialogue, document comprehension, summarization, and text analysis. | 100K |
Claude 2.0 | Thoughtful dialogue, content creation, complex reasoning, creativity, and coding. | 100K |
Claude 2.1 | Claude Instant and Claude 2.0 capabilities. Analysis, comparing documents, Q&A, summarization, and trend forecasting. | 200K |
Claude 3 Haiku | Quick and accurate support in live interactions, content moderation, inventory management, knowledge extraction, optimize logistics, and translations. | 200K |
Claude 3 Sonnet | RAG/search and retrieval. Code generation, forecasting, text from images, quality control, product recommendations, and targeted marketing. | 200K |
Claude 3 Opus | Advanced analysis of charts and graphs, brainstorming and hypothesis generation, coding, financials and market trends, forecasting, task automation, and research review. | 200K |
Claude 3.5 Sonnet | RAG/search and retrieval. Code generation, forecasting, text from images, quality control, product recommendations, and targeted marketing. | 200K |
OpenAI GPT
OpenAI's GPT models are widely used, and many users and benchmarks still consider them the best; however, as we saw in our Gemini vs OpenAI blog, the competition has been catching up quickly. Still, GPT-4o is widely regarded as one of the most advanced and capable language models available today.
Azure and OpenAI are closely related. Microsoft has a strategic partnership with OpenAI, which includes Azure being the exclusive cloud provider for OpenAI’s models and services. This close collaboration allows Azure to offer seamless integration and access to the latest OpenAI models (with additional features and security).
GPT Models
Model | Stated Use Cases | Max Tokens |
---|---|---|
GPT-3.5 Turbo | Advanced complex reasoning and chat, code understanding and generation, and traditional completions tasks. | 4k and 16k tokens |
GPT-4 | Advanced complex reasoning and chat, advanced problem solving, code understanding and generation, and traditional completions tasks. | 8k and 32k tokens |
GPT-4 Turbo | GPT-4 capabilities, instruction following, and parallel function calling. | 128k tokens |
GPT-4 Turbo with Vision | GPT-4 Turbo capabilities, image analysis, and Q&A. | 128k tokens |
GPT-4o | GPT-4 Turbo with Vision capabilities. Faster responses. | 128k tokens |
Claude vs OpenAI GPT Models Functionality
For a service-level comparison of Azure and Bedrock (documentation, community support, and no-code playgrounds), see our earlier blog. The model-level comparison is below:
- Max Tokens: As mentioned previously, Claude was the first to reach a 100k token limit. OpenAI then leapfrogged it with an impressive 128k maximum, and Claude has since retaken the lead with a 200k maximum, which corresponds to roughly 500 pages of text. Given rapid advancements such as Gemini's 1M-10M context window, we expect these limits to keep climbing.
- Supported Regions: This applies specifically to Bedrock and Azure. Availability can vary by model and feature, and many regions are not yet covered. Check the Bedrock and Azure documentation to see whether your region is supported.
- Supported Languages: The GPT models are optimized for English but usable in other languages. Claude supports multiple languages, including English, Spanish, and Japanese. Neither provider publishes a complete list of supported languages.
- Training Data Cutoff: According to Anthropic, the Claude 3 models were trained on data up to August 2023. GPT-3.5 Turbo and GPT-4 were trained on data up to September 2021, and the GPT-4 Turbo versions up to April 2023.
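The 200k-tokens-to-500-pages conversion above can be sanity-checked with back-of-the-envelope arithmetic; the words-per-token and words-per-page ratios below are common rough approximations, not official figures:

```python
# Rough sanity check of "200K tokens ~ 500 pages".
# Assumptions: ~0.75 English words per token, ~300 words per page.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300

def tokens_to_pages(tokens: int) -> float:
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

print(round(tokens_to_pages(200_000)))  # -> 500
```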
Claude Pricing
With Bedrock, you have two pricing options: On-Demand and Provisioned Throughput. Fine-tuning is not available. Prices below are for the US East region.
On-Demand
The non-committal, pay-by-usage option: you are charged per input token processed and per output token generated.
Model | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
---|---|---|
Claude Instant | $0.0008 | $0.0024 |
Claude 2.0 | $0.008 | $0.024 |
Claude 2.1 | $0.008 | $0.024 |
Claude 3 Haiku | $0.00025 | $0.00125 |
Claude 3 Sonnet | $0.003 | $0.015 |
Claude 3 Opus | $0.015 | $0.075 |
Claude 3.5 Sonnet | $0.003 | $0.015 |
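Given per-token pricing, estimating a per-request cost is simple arithmetic. Here is a quick sketch using the US East on-demand prices from the table above (treat them as a snapshot; check the Bedrock pricing page for current rates):

```python
# On-demand Bedrock prices as (input $/1K tokens, output $/1K tokens),
# copied from the table above at the time of writing.
CLAUDE_PRICES = {
    "claude-3-haiku": (0.00025, 0.00125),
    "claude-3-sonnet": (0.003, 0.015),
    "claude-3-opus": (0.015, 0.075),
    "claude-3.5-sonnet": (0.003, 0.015),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at on-demand rates."""
    in_price, out_price = CLAUDE_PRICES[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

# e.g. a 10K-token prompt with a 1K-token answer on Claude 3.5 Sonnet:
print(request_cost("claude-3.5-sonnet", 10_000, 1_000))  # ~$0.045
```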
Provisioned Throughput
You have the option to buy model units (specific throughput measured by the maximum number of input/output tokens processed per minute). Pricing is charged hourly and you can choose a one-month or six-month term. This pricing model is best suited for “large consistent inference workloads that need guaranteed throughput.”
Model | Price per Hour per Model Unit With No Commitment (Max One Custom Model Unit Inference) | Price per Hour per Model Unit With a One Month Commitment (Includes Inference) | Price per Hour per Model Unit With a Six Month Commitment (Includes Inference) |
---|---|---|---|
Claude Instant | $44.00 | $39.60 | $22.00 |
Claude | $70.00 | $63.00 | $35.00 |
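Because provisioned throughput is billed by the hour, a rough monthly estimate is just the hourly rate times the hours in a month. The 730-hour month below is a common estimating convention, not how AWS bills (AWS charges for actual hours in the term):

```python
HOURS_PER_MONTH = 730  # ~ 24 * 365 / 12, a common estimating approximation

def monthly_cost(hourly_rate: float, model_units: int = 1) -> float:
    """Rough monthly USD cost of provisioned throughput."""
    return hourly_rate * HOURS_PER_MONTH * model_units

# One Claude Instant model unit on a one-month commitment ($39.60/hr):
print(monthly_cost(39.60))  # ~$28,908 per month
```

At these rates, provisioned throughput only makes sense when your sustained usage would cost more than this on-demand, which is exactly the "large consistent inference workloads" case AWS describes.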
OpenAI GPT Pricing
Charges for GPT models through Azure are fairly simple: pay-as-you-go, with no commitment, plus additional charges for customization (fine-tuning). Prices vary by region and are shown for the US East and US East 2 regions (GPT-4 Turbo and fine-tuning are not available in US East).
Pay-As-You-Go
Charges vary for different model types and contexts.
Model | Context | Price per 1000 Input Tokens | Price per 1000 Output Tokens |
---|---|---|---|
GPT-3.5 Turbo Instruct | 4k | $0.0015 | $0.002 |
GPT-3.5 Turbo | 16k | $0.0005 | $0.0015 |
GPT-4 | 8k | $0.03 | $0.06 |
GPT-4 | 32k | $0.06 | $0.12 |
GPT-4 Turbo | 128k | $0.01 | $0.03 |
GPT-4 Turbo With Vision | 128k | $0.01 | $0.03 |
GPT-4o | 128k | $0.005 | $0.015 |
Fine-Tuning
Fine-tuning charges are based on training tokens and hosting time.
Model | Context | Training per 1000 tokens | Price for Hosting per Hour |
---|---|---|---|
GPT-3.5 Turbo | 4k | $0.008 | $3 |
GPT-3.5 Turbo | 16k | $0.008 | $3 |
Pricing Comparison Claude vs OpenAI
Based on benchmarks, user feedback, and use cases, the most comparable models are:
Claude 3 Sonnet vs GPT-4
Claude 3 Sonnet is much less expensive than GPT-4 while targeting similar use cases and posting similar benchmark results: its input tokens are priced 95% lower than GPT-4's and its output tokens 87.5% lower.
Model | 1000 Input Tokens | 1000 Output Tokens |
---|---|---|
Claude 3 Sonnet | $0.003 | $0.015 |
GPT-4 | $0.06 | $0.12 |
Claude 3 Opus vs GPT-4 Turbo
Claude 3 Opus is relatively expensive; its output tokens in particular cost 150% more than GPT-4 Turbo's. GPT-4 Turbo is the more economical choice, especially for use cases that generate a large number of output tokens.
Model | 1000 Input Tokens | 1000 Output Tokens |
---|---|---|
Claude 3 Opus | $0.015 | $0.075 |
GPT-4 Turbo | $0.01 | $0.03 |
Claude 3.5 Sonnet vs GPT-4o
The two most recent and advanced models, Claude 3.5 Sonnet and GPT-4o, are both more capable and more cost-effective than their predecessors. While Claude 3 Opus was notably more expensive than GPT-4 Turbo, the pricing for these newer models is much more competitive: the price per output token is the same, while Claude 3.5 Sonnet's input tokens cost 40% less.
Model | 1000 Input Tokens | 1000 Output Tokens |
---|---|---|
Claude 3.5 Sonnet | $0.003 | $0.015 |
GPT-4o | $0.005 | $0.015 |
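The percentage differences quoted in the comparisons above can be reproduced directly from the per-1K-token prices:

```python
def pct_cheaper(price: float, baseline: float) -> float:
    """How much cheaper `price` is than `baseline`, as a percentage."""
    return (1 - price / baseline) * 100

# Claude 3 Sonnet vs GPT-4: input ~95% lower, output ~87.5% lower
print(pct_cheaper(0.003, 0.06), pct_cheaper(0.015, 0.12))
# Claude 3 Opus output vs GPT-4 Turbo output: ~150% more expensive
print((0.075 / 0.03 - 1) * 100)
# Claude 3.5 Sonnet input vs GPT-4o input: ~40% lower
print(pct_cheaper(0.003, 0.005))
```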
Conclusion
Newer models competing on price while advancing in functionality is a good sign for end users. Claude 3.5 Sonnet and GPT-4o are both advanced, capable models, leaving the choice largely dependent on factors such as cost, availability, and context window.
Monitor your AWS & Azure costs.