by Emily Dunenfeld
Updated June 25, 2024 to reflect recent updates.
When Anthropic announced Claude 3, at the time the newest model in the Claude family, speculation broke out in typical AI fashion over how Claude 3 Opus compares to GPT-4. A few months later, the newest Claude model is Claude 3.5 Sonnet, which competes with OpenAI’s newest model, GPT-4o. Benchmarks show Claude to be a serious competitor; however, there are other factors to consider, such as price and functionality at both the service and model level.
Note: we’ve previously touched on Claude vs OpenAI in our Amazon Bedrock vs Azure OpenAI blog, but this blog will go more in-depth on the Claude and OpenAI GPT models specifically.
In 2023, Claude made waves by being the first to introduce a 100k context window, at a time when other prominent models were limited to around 32k. However, it received flak in the past over censorship issues, which were prioritized for improvement in Claude 3. The response to Claude 3 has been extremely positive, with users noticing significant improvements over the prior content moderation and benchmarks showing Claude 3.5 Sonnet to be one of the top-performing language models across a variety of tasks. Claude models particularly excel at writing, summarizing, and coding.
One way to access the Claude models (that we will be focusing on in this blog) is through Amazon Bedrock. Bedrock is a marketplace of many API models from different providers, including Anthropic’s Claude, where Amazon provides additional APIs and security.
Table of supported Claude models (as of 6/25/2024).
OpenAI’s GPT models are widely used, with many users and benchmarks still considering them the best; however, as we saw in our Gemini vs OpenAI blog, the competition has been quickly catching up. Still, GPT-4o is widely regarded as the most advanced and capable language model available today.
Azure and OpenAI are closely related. Microsoft has a strategic partnership with OpenAI, which includes Azure being the exclusive cloud provider for OpenAI’s models and services. This close collaboration allows Azure to offer seamless integration and access to the latest OpenAI models (with additional features and security).
Table of supported GPT models (as of 6/25/2024).
See here for a service-level comparison of Azure and Bedrock, covering documentation/community support and no-code playgrounds. The model-level comparison follows:
With Bedrock, you have two options: On-Demand and Provisioned Throughput. Fine-tuning is not available. Prices are shown for the US East region.
The non-committal, pay-by-usage option. Charges are per input token processed and output token generated.
Claude On-Demand pricing table. (Updated 6/25/24).
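Since On-Demand charges are simply tokens multiplied by per-token rates, the math is easy to sketch. The rates below are illustrative, using Claude 3.5 Sonnet’s US East on-demand prices at the time of writing ($3 per million input tokens, $15 per million output tokens); verify against the pricing table above before relying on them.

```python
# Estimate Bedrock On-Demand cost: charges accrue per input token
# processed and per output token generated.

# Illustrative US East rates (USD per 1M tokens) -- check the current
# pricing table before reusing these numbers.
CLAUDE_35_SONNET = {"input": 3.00, "output": 15.00}

def on_demand_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Return the USD cost for a single request or a usage total."""
    return (input_tokens / 1_000_000) * rates["input"] + \
           (output_tokens / 1_000_000) * rates["output"]

# Example: 2M input tokens and 500k output tokens in a month.
print(round(on_demand_cost(2_000_000, 500_000, CLAUDE_35_SONNET), 2))  # 13.5
```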
You have the option to buy model units (specific throughput measured by the maximum number of input/output tokens processed per minute). Pricing is charged hourly and you can choose a one-month or six-month term. This pricing model is best suited for “large consistent inference workloads that need guaranteed throughput.”
Claude Provisioned Throughput pricing table. (Updated 6/25/24).
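To see when Provisioned Throughput pays off against On-Demand, you can compare the committed hourly spend with what the same month’s usage would cost pay-by-usage. All figures in this sketch are hypothetical placeholders, not published model-unit prices; substitute rates from the pricing tables.

```python
# Compare Provisioned Throughput (billed hourly for the term) against
# On-Demand for a steady workload. All rates below are hypothetical.

HOURS_PER_MONTH = 730  # average hours in a month

def provisioned_monthly_cost(hourly_rate: float) -> float:
    """Model units are billed per hour whether or not they are busy."""
    return hourly_rate * HOURS_PER_MONTH

def on_demand_monthly_cost(input_tokens: float, output_tokens: float,
                           input_rate: float, output_rate: float) -> float:
    """Pay-by-usage cost for the same month (rates per 1M tokens)."""
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

# Hypothetical: one model unit at $40/hour vs. heavy on-demand usage.
committed = provisioned_monthly_cost(40.0)              # $29,200/month
usage = on_demand_monthly_cost(5e9, 1e9, 3.0, 15.0)     # $30,000/month
print(usage > committed)  # True -- provisioned wins at this sustained volume
```

This is why AWS recommends Provisioned Throughput only for large, consistent inference workloads: the hourly meter runs regardless of traffic.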
Charges for GPT models through Azure are fairly simple: pay-as-you-go, with no commitment. There are additional charges for customization. Prices vary by region and are shown for the US East and US East 2 regions (GPT-4 Turbo and fine-tuning are not available in US East).
Charges vary across model types and context lengths.
OpenAI GPT model pricing table (as of 6/25/2024).
Fine-tuning charges are based on training tokens and hosting time.
OpenAI GPT fine-tuning pricing table (as of 6/25/2024).
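The two fine-tuning components combine additively: a per-token charge for training plus an hourly charge while the tuned model is hosted. The rates in this sketch are hypothetical placeholders, not Azure’s published prices; plug in figures from the table above.

```python
# Azure OpenAI fine-tuning cost = training (per token) + hosting
# (per hour deployed). Rates below are hypothetical placeholders.

def fine_tuning_cost(training_tokens: int, hosting_hours: float,
                     training_rate_per_1m: float,
                     hosting_rate_per_hour: float) -> float:
    training = (training_tokens / 1_000_000) * training_rate_per_1m
    hosting = hosting_hours * hosting_rate_per_hour
    return training + hosting

# Hypothetical: 10M training tokens at $8/1M, hosted 730 hours at $1.70/hr.
print(round(fine_tuning_cost(10_000_000, 730, 8.0, 1.70), 2))
```

Note that hosting accrues even when the tuned model receives no traffic, so idle deployments still cost money.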
Based on benchmarks, user feedback, and use cases, the most comparable models are:
Claude 3 Sonnet is much less expensive than GPT-4 and has similar use cases and benchmarks. Its input tokens are priced 95% lower than GPT-4’s and its output tokens are priced 87.5% lower.
Claude 3 Opus is relatively expensive; its output tokens in particular are 150% more expensive than GPT-4 Turbo’s. GPT-4 Turbo is the more economical choice, especially for use cases requiring a large number of output tokens.
The two most recent and advanced models, Claude 3.5 Sonnet and GPT-4o, are both more advanced and more cost-effective than their predecessors. While Claude 3 Opus was notably more expensive than GPT-4 Turbo, the pricing structure for these newer models is much more competitive: the price per output token is the same, while Claude 3.5 Sonnet’s input tokens cost 40% less.
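These relative figures are easy to sanity-check. The per-1M-token rates below reflect published US prices as of mid-2024 (Claude 3.5 Sonnet at $3 in / $15 out, GPT-4o at $5 in / $15 out); confirm them against the pricing tables before reuse.

```python
# Sanity-check the relative pricing claims between the newest models.
# Per-1M-token USD rates as of mid-2024 -- verify before reusing.
claude_35_sonnet = {"input": 3.00, "output": 15.00}
gpt_4o = {"input": 5.00, "output": 15.00}

def pct_cheaper(a: float, b: float) -> float:
    """How much cheaper rate a is than rate b, as a percentage."""
    return (b - a) / b * 100

print(round(pct_cheaper(claude_35_sonnet["input"], gpt_4o["input"]), 1))  # 40.0
print(claude_35_sonnet["output"] == gpt_4o["output"])                     # True
```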
Newer models competing for the lowest price while advancing their capabilities is a good sign for end users. Claude 3.5 Sonnet and GPT-4o are both advanced and capable models, leaving the choice largely dependent on factors such as cost, availability, context window, and more.