AI Pricing Isn’t Broken. Your Context Might Be.

by Philip Miller Posted on April 13, 2026

Everyone is complaining about AI pricing right now. Anthropic is tightening usage limits. OpenAI launched a $100/month Pro tier. Third-party tooling economics keep shifting. Users feel like vendors are changing the rules midstream, right as their teams are finally building real workflows around these tools.

The frustration is valid. But most of the conversation is focused on the wrong problem.

The real question isn’t “why is AI getting more expensive?” It’s “why are we burning so many tokens in the first place?”

The Market Is Telling You Something

Here’s the uncomfortable math. Anthropic just hit $30 billion in annualized revenue, surpassing OpenAI’s $25 billion. Model providers are printing money. Meanwhile, Gartner reports that only 28% of enterprise AI use cases in infrastructure and operations fully deliver on their ROI expectations. Research from WRITER’s 2026 Enterprise AI Adoption survey found that 96% of organizations are deploying AI agents, but only 23% see significant ROI from them, while 79% face significant adoption challenges overall.

That asymmetry should stop every enterprise leader in their tracks. Capability investment is running far ahead of the context and governance infrastructure needed to turn that capability into production value. The models are ready. Most organizations’ data layers are not.

This is what the pricing debate is actually about, even if most people don’t realize it yet. AI pricing isn’t broken. It’s reflecting a market where usage is expanding faster than the architecture supporting it.

The Token Tax You’re Paying Without Realizing It

The largest line item in enterprise AI spend isn’t model access. It’s waste.

Every time you dump an unstructured 50-page document into a prompt and ask a model to “find the relevant parts,” you’re paying the model to do work your data architecture should have handled before the prompt was ever sent. You’re asking a frontier model to be your search engine, your classifier, your policy filter, and your answer generator, all in one expensive inference call.

Put real numbers on it. GPT-4o charges roughly $2.50 per million input tokens. Claude 3.5 Sonnet runs around $3.00. That sounds manageable until you consider what enterprise usage looks like: thousands of queries per day, each one stuffing context windows with raw, unstructured, un-enriched content because nothing upstream is curating what the model actually needs to see.

A single poorly architected RAG pipeline processing 1,000 queries a day with 8,000 tokens of padded context per query burns through 8 million input tokens daily. That’s $20–$24 per day on input alone, for one workflow. Scale that across dozens of AI-powered processes, and you’re looking at six figures annually in token costs that better architecture would have prevented.
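The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope calculation using the illustrative figures from this post (1,000 queries/day, 8,000 tokens of padded context, $2.50–$3.00 per million input tokens), not a quote from any vendor's current price list:

```python
# Back-of-the-envelope token cost for one padded RAG workflow.
QUERIES_PER_DAY = 1_000
TOKENS_PER_QUERY = 8_000            # raw, un-curated context per query
PRICES_PER_M_TOKENS = (2.50, 3.00)  # USD per million input tokens

daily_tokens = QUERIES_PER_DAY * TOKENS_PER_QUERY  # 8,000,000 tokens/day

for price in PRICES_PER_M_TOKENS:
    daily_cost = daily_tokens / 1_000_000 * price
    annual_cost = daily_cost * 365
    print(f"${price}/M tokens -> ${daily_cost:.0f}/day, ${annual_cost:,.0f}/year")

# Curating context down to ~3,000 relevant tokens per query would cut
# this workflow's input bill by 62.5% with no change in model or volume.
```

Multiply the annual figure by a few dozen workflows and the six-figure estimate in the text falls out directly.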

The fix isn’t a cheaper subscription tier. It’s better context.

Context Before Capability: The Economics

Better context consistently outperforms better models in enterprise environments. A smaller, cheaper model with precisely curated context will outperform an expensive frontier model drowning in irrelevant data, and cost a fraction as much. One pharmaceutical customer found that adding a semantic baseline to their retrieval increased correct answers by 73%, with further refinements adding another 102% improvement. Not by upgrading the model. By upgrading the context.

The enterprises getting AI economics right aren’t negotiating volume discounts on API calls. They’re investing in what happens before the API call: semantic enrichment, intelligent retrieval, policy-based filtering, and structured context delivery. They’re treating their context layer as the primary cost lever, because it is.

How the Progress Data Platform Reduces AI Spend

This is where the Progress Data Platform architecture earns its keep. The Progress Data Platform is a neurosymbolic architecture that combines the flexibility of neural AI with the precision and control of symbolic reasoning across every layer of the stack. The models bring probabilistic power. PDP brings the semantic meaning, deterministic rules, knowledge relationships, and governed workflows that keep that power grounded, efficient, and trustworthy.

Here’s how each layer directly reduces token waste and AI cost inflation:

  • Trusted context through multi-model data and knowledge graphs. PDP stores, indexes, and retrieves complex enterprise content with its semantic relationships intact. Instead of dumping raw documents into a vector store and hoping cosine similarity surfaces the right paragraphs, PDP delivers precisely the content a model needs, with provenance, relationships, and meaning preserved. Its knowledge graph capabilities provide the symbolic backbone: structured relationships and explicit semantic connections that neural models can reason over with far fewer tokens and far less ambiguity. Fewer tokens in, better answers out, defensible outputs every time.
  • Upstream semantic enrichment before inference. PDP automatically tags, classifies, and relates content using enterprise taxonomies and ontologies before any AI interaction occurs. This is symbolic reasoning at scale, applying explicit, human-defined meaning to enterprise data so models don’t have to infer it from raw text. When your data is enriched before it reaches the model, you stop paying the model to figure out what your data means. You tell it. That’s a direct reduction in both token consumption and hallucination risk. One pharmaceutical organization found their semantic ontology became a reusable enterprise asset across multiple domains and use cases far beyond the original AI project.
  • Deterministic decisions at zero token cost. Every decision that can be expressed as a rule (compliance checks, eligibility determinations, routing logic, validation, policy enforcement) is a decision that doesn’t need to go through an LLM. PDP’s rules engine handles these deterministically, with full auditability, at a fraction of the cost and latency. In a world where organizations are deploying AI agents at scale, policy-based guardrails keep those agents compliant without burning tokens on decisions that should never have been probabilistic in the first place.
  • Governed workflow orchestration. PDP ties context, enrichment, rules, and model invocation together into governed workflows, ensuring that models are called only when they add value, and that every upstream step has already narrowed the problem space before a single token is spent. The result: AI workflows that are not only cheaper to run, but more accurate and auditable.

This is the neurosymbolic advantage in practice. Neural gives power. Symbolic gives control. The Progress Data Platform gives you both, and the enterprise AI economics that follow.

Practical Steps to Reduce Your AI Spend

If you’re feeling the pressure of rising AI costs, here are concrete steps worth considering today.

  • Audit your context-to-value ratio. For every AI-powered workflow, measure how many tokens you’re sending versus how many are contributing to the answer. Most teams find 40–60% of their prompt content is noise that better retrieval would have filtered out.
  • Push deterministic decisions out of the model. Any business logic that can be expressed as rules should be handled by a rules engine, not an LLM. Rules are deterministic, auditable, and essentially free per execution compared to API calls. You’d be surprised how many “AI decisions” in enterprise workflows are policy lookups in disguise.
  • Enrich before you retrieve. Semantically tagged content retrieves better and requires less surrounding context to be useful. Investing in classification and enrichment pays for itself in reduced token consumption within weeks, and it makes your outputs more trustworthy, not just cheaper.
  • Right-size your model selection. Not every query needs a frontier model. A well-architected context layer lets you confidently route simpler queries to smaller, cheaper models while reserving expensive inference for tasks that genuinely require reasoning over complex content.
  • Persist knowledge instead of regenerating it. If your system is asking a model the same types of questions repeatedly, you’re paying the same token cost every time. Build knowledge graphs, cache structured outputs, and let your context engine serve answers that have already been derived.
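The last two steps (right-sizing model selection and persisting derived answers) can be combined in one routing layer. A minimal sketch, with the caveat that the model names, the characters-per-token estimate, and the complexity heuristic are all placeholders, not recommendations:

```python
import hashlib

CACHE: dict[str, str] = {}  # persisted answers: repeat questions cost zero tokens

def pick_model(query: str, context_tokens: int) -> str:
    # Crude heuristic: long context or analytic phrasing -> frontier model;
    # everything else goes to a smaller, cheaper model.
    if context_tokens > 4_000 or "compare" in query or "why" in query:
        return "frontier-model"
    return "small-model"

def answer(query: str, context: str) -> str:
    key = hashlib.sha256((query + "\x00" + context).encode()).hexdigest()
    if key in CACHE:
        return CACHE[key]                         # cache hit: no inference at all
    model = pick_model(query, len(context) // 4)  # ~4 chars per token, rough
    result = f"[{model}] answer to: {query}"      # stand-in for a real API call
    CACHE[key] = result
    return result
```

In a production system the cache key would incorporate content versioning and the cache would live in a shared store, but the cost logic is the same: pay for frontier inference only when the query warrants it, and never pay twice for the same derived answer.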

This Is What “Make AI Boring” Actually Means

The AI pricing debate reveals something the market needs to hear: most organizations are still treating AI like a novelty rather than infrastructure. They’re optimizing for model access when they should be optimizing for the architecture around the model.

Production-grade enterprise AI isn’t about chasing the latest model release or negotiating a better API rate. It’s about building a governed, context-aware data layer that makes every model interaction more efficient, more accurate, and more defensible. It’s about making AI boring: reliable, predictable, and economically sustainable at scale.

Gartner’s 28% ROI number isn’t a failure of AI capability. It’s a failure of context, governance, and integration. The 72% of projects that stall or fail aren’t using inferior models. They’re sending inferior context to perfectly capable ones.

The enterprises that will thrive as AI pricing matures aren’t the ones with the best subscription deal. They’re the ones who built the context architecture to make every token count.

The models are ready. Is your context layer?

Explore the Progress Data Platform to see how a governed context architecture can reduce your AI inference costs while improving the quality and trustworthiness of every output.

Read our Make AI Boring whitepaper to learn how governed context architecture helps reduce token waste, improve output quality, and make enterprise AI more reliable, explainable, and cost-effective.


Philip Miller

AI Strategist

Philip Miller serves as an AI Strategist at Progress. He oversees the messaging and strategy for data and AI-related initiatives. A passionate writer, Philip frequently contributes to blogs and lends a hand in presenting and moderating product and community webinars. He is dedicated to advocating for customers and aims to drive innovation and improvement within the Progress AI Platform. Outside of his professional life, Philip is a devoted father of two daughters, a dog enthusiast (with a mini dachshund) and a lifelong learner, always eager to discover something new.
