Tokens, Characters and Usage Fees: Decoding the AI Price War

Everything has a cost, even — or especially — innovations like artificial intelligence (AI).

But as enterprises across sectors like healthcare, banking and finance and eCommerce look to leverage the powers of generative AI to streamline their legacy workflows, understanding the total cost of ownership (TCO) for integrating an AI system or LLM (large language model) into business-specific workflows is crucial.

That’s because not only is the generative AI marketplace growing relatively crowded from a capabilities standpoint, but the software pricing models and unit economics of the various AI systems — from providers like Amazon, Google, Microsoft, OpenAI, Anthropic, Meta and others — are also starting to jostle for position.

This week, Alphabet announced it was lowering the costs of its most advanced AI model, the recently launched Gemini from Google.

And as more enterprise AI offerings come to market, such as Amazon’s Q corporate chatbot, winning on cost means strategically tiering subscription and usage fees by model type and modality so that enterprise customers get what they want at a price they can afford.

After all, pricing strategies have been around for ages, and AI offers an attractive new ecosystem in which to test where the market’s willingness to pay ends and its reluctance begins.

What’s in an AI Pricing Model, Anyway?

A quick scan of the pricing pages from Google, OpenAI, Microsoft, Amazon and Anthropic reveals the array of services businesses can utilize, including text, chat, transcription, visual content and more.

The pricing structures also introduce a new vernacular: tokens (OpenAI, Microsoft, Amazon, Anthropic) or characters (Google).

Tokens and characters are foundational not just to the way that AI models are priced based on usage, but also to the way that AI systems work.

Language API costs for most major AI models (excluding Google) are based on the model selected — with the most recent and advanced models costing more than earlier iterations — and then the number of input and output tokens.

Natural language processing relies on tokenization to interpret text data by dividing the text into manageable chunks, or tokens. A token can be a whole word, a sub-word or a short run of letters.

As a rule of thumb, a token is roughly four characters, or about three-quarters of a word in English.
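That rule of thumb can be turned into a rough estimator. The sketch below is illustrative only: it assumes the four-characters and three-quarters-of-a-word averages quoted above, the function names are hypothetical, and a provider's actual tokenizer will return different counts for any given text.

```python
import math

# Averages quoted in the article for English text; real tokenizers vary.
CHARS_PER_TOKEN = 4
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token count based on character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Rough token count based on whitespace-separated words."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


if __name__ == "__main__":
    prompt = "Summarize the attached loan agreement in plain English."
    print(estimate_tokens_from_chars(prompt))
    print(estimate_tokens_from_words(prompt))
```

The two estimates will rarely agree exactly, which is a reminder that both are approximations meant for budgeting rather than billing.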

OpenAI’s most advanced model, GPT-4 Turbo, is priced at $0.01 per 1,000 input tokens and $0.03 per 1,000 output tokens.

Amazon’s pricing for Anthropic’s Claude model is slightly lower, at $0.008 per 1,000 input tokens and $0.024 per 1,000 output tokens.

Anthropic’s own pricing chart is identical to Amazon’s, only it is quoted per million tokens: $8 per one million tokens for prompting (input) and $24 per one million tokens for completion (output).

For scale, 1,000 tokens is about three pages of double-spaced English text, while a million tokens is around 3,000 pages.
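To see how these rates add up for a single request, here is a minimal sketch using only the per-1,000-token prices quoted above. The dictionary keys are hypothetical labels rather than real API model identifiers, and the rates are the ones reported in this article, which providers may change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Price in US dollars per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


# Rates as quoted in the article; the keys are illustrative labels only.
PRICES = {
    "gpt-4-turbo": TokenPrice(input_per_1k=0.01, output_per_1k=0.03),
    "claude-on-amazon": TokenPrice(input_per_1k=0.008, output_per_1k=0.024),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: input and output tokens are billed separately."""
    price = PRICES[model]
    return (input_tokens / 1000) * price.input_per_1k + (
        output_tokens / 1000
    ) * price.output_per_1k


if __name__ == "__main__":
    # A 3,000-token prompt (roughly nine double-spaced pages) with a 500-token reply.
    for name in PRICES:
        print(name, round(request_cost(name, 3000, 500), 4))
```

Because output tokens cost three times as much as input tokens at both quoted rates, a long answer can matter more to the bill than a long prompt.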

Google takes a slightly different tack with its pricing, breaking text data apart into characters and not tokens. Its pricing for Gemini is $0.00025 per 1,000 input characters and $0.0005 per 1,000 output characters.
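Character-based rates are hard to compare with token-based ones at a glance. The sketch below computes a request cost from the Gemini rates quoted above and converts a per-character rate into an approximate per-token equivalent. The conversion assumes an average of four characters per token for English text, so the result is an estimate and not a published price; the function names are hypothetical.

```python
# Gemini rates as quoted in the article, in US dollars per 1,000 characters.
GEMINI_INPUT_PER_1K_CHARS = 0.00025
GEMINI_OUTPUT_PER_1K_CHARS = 0.0005

# Assumption: about four characters per token in English.
CHARS_PER_TOKEN = 4


def char_cost(input_chars: int, output_chars: int) -> float:
    """Cost of one request billed by character count."""
    return (input_chars / 1000) * GEMINI_INPUT_PER_1K_CHARS + (
        output_chars / 1000
    ) * GEMINI_OUTPUT_PER_1K_CHARS


def per_1k_tokens_equivalent(price_per_1k_chars: float) -> float:
    """Approximate per-1,000-token price implied by a per-1,000-character price."""
    return price_per_1k_chars * CHARS_PER_TOKEN


if __name__ == "__main__":
    print(char_cost(input_chars=12_000, output_chars=2_000))
    print(per_1k_tokens_equivalent(GEMINI_INPUT_PER_1K_CHARS))
    print(per_1k_tokens_equivalent(GEMINI_OUTPUT_PER_1K_CHARS))
```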

The Sum Is Greater Than Its Parts

Under the hood, each token or character chunk of text data is mapped to a number, and those numbers are stored in vectors or matrices. The vectors and matrices are then fed into neural networks to build a deep learning model.
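A toy version of that pipeline is sketched below. It is a simplification under stated assumptions: text is split on whitespace instead of using a real sub-word tokenizer, the vocabulary is built from two sample sentences, and the vectors are random placeholders, whereas a trained model learns its vectors. All function names are hypothetical.

```python
import random


def build_vocab(texts: list[str]) -> dict[str, int]:
    """Assign each distinct token a number, in order of first appearance."""
    vocab: dict[str, int] = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Turn text into the list of numbers that represent its tokens."""
    return [vocab[token] for token in text.lower().split()]


def make_embedding_table(vocab_size: int, dim: int, seed: int = 0) -> list[list[float]]:
    """One vector per token ID; random here, learned in a real model."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(ids: list[int], table: list[list[float]]) -> list[list[float]]:
    """Look up each ID's vector, giving the matrix a neural network would consume."""
    return [table[i] for i in ids]


if __name__ == "__main__":
    vocab = build_vocab(["the cost of ai", "the price of tokens"])
    ids = encode("the price of ai", vocab)
    matrix = embed(ids, make_embedding_table(len(vocab), dim=3))
    print(ids)
    print(len(matrix), "rows x", len(matrix[0]), "columns")
```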

Most AI model pricing strategies are built around English-language use cases. Costs will vary in other languages, because the same content can break down into a different number of tokens or characters.

Image generation services, such as OpenAI’s DALL-E, are priced using a different model that relies on image resolution to establish the price per input and output, while audio models are priced based on the length of the audio being transcribed or generated.

“Knowing about AI will let people who use the tool understand how it works and do their job better,” Akli Adjaoute, founder and general partner at venture capital fund Exponion, told PYMNTS in an interview last month. “Just as with a car, if you show up to the shop without knowing anything, you might get taken advantage of by the mechanic.”

Understanding the impact of query length from a token or character perspective can help firms be more judicious in their use of AI as they fine-tune their prompt engineering.

It is important to note that pricing structures do not necessarily correlate with model quality or performance. As with any software solution, organizations should weigh their specific needs against their budget as they look to integrate AI into their workflows.

After all, AI models are much more than just their subscription and usage fees. AI system integration, software maintenance, staff training and engineering team salaries also need to be taken into account when it comes to the TCO of an AI system.
