Generative AI pricing models shift toward usage-based billing across platforms

Facing high inference costs from generative AI models such as OpenAI’s o3-high, companies including Vercel and Replit are shifting to usage-based pricing, which aligns revenue with infrastructure costs but makes bills less predictable for customers.

As advanced generative AI models such as OpenAI’s o3-high incur substantial inference costs, reaching up to $3,500 per query, companies are rethinking their pricing strategies.

Platforms such as Vercel, Bolt.new, and Replit are moving away from traditional flat-rate pricing in favor of usage-based models. This shift aims to better align revenue with the underlying infrastructure and compute costs associated with running large-scale AI systems.

While usage-based pricing provides greater scalability and financial sustainability for providers, it also introduces cost unpredictability for customers, especially those with fluctuating workloads. The trend reflects a broader industry move toward more dynamic, consumption-driven monetization of AI services.
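
To make the trade-off concrete, the following minimal Python sketch compares six months of bills under a flat-rate plan and a usage-based plan. Every figure in it (the flat fee, the per-unit price, and the monthly usage) is a hypothetical value chosen for illustration and is not drawn from any provider's actual price list. The numbers are picked so that both plans cost the same on average, which isolates the difference in predictability.

```python
from statistics import mean, pstdev

# Hypothetical numbers for illustration only; not from any real price list.
FLAT_MONTHLY_FEE = 20.00   # flat-rate plan, dollars per month
PRICE_PER_UNIT = 0.002     # usage-based plan, dollars per unit of usage


def usage_bill(units: int, price_per_unit: float = PRICE_PER_UNIT) -> float:
    """Return one month's usage-based bill in dollars, rounded to cents."""
    return round(units * price_per_unit, 2)


# A fluctuating workload: units consumed in each of six months (assumed).
monthly_units = [4_000, 12_000, 3_000, 25_000, 6_000, 10_000]

usage_bills = [usage_bill(u) for u in monthly_units]
flat_bills = [FLAT_MONTHLY_FEE] * len(monthly_units)

for label, bills in (("flat-rate", flat_bills), ("usage-based", usage_bills)):
    # Mean shows the typical bill; standard deviation shows how much it swings.
    print(
        f"{label:12s} mean=${mean(bills):.2f} "
        f"min=${min(bills):.2f} max=${max(bills):.2f} "
        f"stdev=${pstdev(bills):.2f}"
    )
```

Under these assumed inputs the customer pays the same amount over six months on either plan, but the usage-based bill varies from month to month while the flat-rate bill does not. The provider's side is the mirror image: under the flat plan its revenue stays fixed even in months when the customer's compute consumption spikes.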
