To compete more aggressively with rival AI companies like Google, OpenAI is launching Flex processing, an API option that lowers the per-token price of its AI models in exchange for slower response times and "occasionally unavailable resources."
Available in beta for OpenAI's recently released o3 and o4-mini reasoning models, Flex processing is intended for lower-priority and "non-production" tasks such as model evaluations, data enrichment, and asynchronous workloads, OpenAI says.
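The article doesn't show how a developer actually opts into Flex. In OpenAI's API this kind of option is selected per request via the `service_tier` parameter (with `"flex"` as the value for Flex processing); the request shape below is an illustrative sketch built around that parameter, not code from the article, and the model name and prompt are made-up examples.

```python
# Sketch: opting a single request into Flex processing.
# service_tier="flex" is the documented switch for Flex; the rest of
# this payload is a minimal, illustrative chat-completions-style body.

def build_flex_request(model: str, prompt: str) -> dict:
    """Assemble a request payload that opts into the Flex tier.

    Flex requests may queue or occasionally fail with a resource-unavailable
    error, so callers should use generous client timeouts and retry logic.
    """
    return {
        "model": model,  # e.g. "o3" or "o4-mini" (the models Flex supports)
        "messages": [{"role": "user", "content": prompt}],
        "service_tier": "flex",  # half price, slower/occasionally unavailable
    }

request = build_flex_request("o3", "Summarize this evaluation run.")
print(request["service_tier"])  # → flex
```

Because the tier is chosen per request, the same application can route urgent traffic at the standard tier and batch-style work (evaluations, data enrichment) through Flex.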
Flex processing cuts API costs exactly in half. For o3, Flex is priced at $5 per million input tokens (roughly 750,000 words) and $20 per million output tokens, versus the standard $10 per million input tokens and $40 per million output tokens. For o4-mini, Flex lowers the price to $0.55 per million input tokens and $2.20 per million output tokens, down from $1.10 and $4.40 respectively.
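Since Flex is a flat 50% discount on both input and output tokens, the per-request savings are straightforward to compute. The o3 rates below come from the article; the o4-mini Flex rates follow from halving its standard rates, and the token counts in the example are hypothetical.

```python
# Per-million-token rates in USD: (input_rate, output_rate).
# o3 rates are from the article; o4-mini Flex rates are the stated
# standard rates halved, per the "exactly half" discount.
RATES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10, 4.40),   "flex": (0.55, 2.20)},
}

def request_cost(model: str, tier: str,
                 input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the given service tier."""
    in_rate, out_rate = RATES[model][tier]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Hypothetical batch job: 200k input tokens, 10k output tokens on o3.
standard = request_cost("o3", "standard", 200_000, 10_000)
flex = request_cost("o3", "flex", 200_000, 10_000)
print(standard, flex)  # → 2.4 1.2 (Flex is exactly half)
```

For workloads like evaluations or data enrichment that run at high volume, that halving compounds directly with job size, which is the economic argument the launch rests on.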
The launch of Flex processing comes as frontier AI prices continue to climb and rivals release cheaper, more efficient, budget-oriented models. On Thursday, Google rolled out Gemini 2.5 Flash, a reasoning model that matches or bests DeepSeek's R1 on performance at a lower input token cost.
In an email to customers announcing the launch of Flex processing, OpenAI indicated that developers in usage tiers 1-3 will need to complete the recently introduced ID verification process to access o3. (Tiers are determined by the amount spent on OpenAI services.) Reasoning summaries and streaming API support for o3 are also gated behind verification.
OpenAI has previously said that ID verification is intended to stop bad actors from violating its usage policies.