An extensive new study has found that open-source artificial intelligence models consume significantly more computing resources than their closed-source competitors when performing identical tasks, potentially undermining their cost advantages and reshaping how companies evaluate AI deployment strategies.
The research, conducted by AI firm Nous Research, found that open-weight models use between 1.5 and 4 times more tokens (the basic units of AI computation) than closed models such as those from OpenAI and Anthropic. For simple knowledge questions, the gap widened dramatically, with some open models using up to 10 times more tokens.
Measuring thinking efficiency in reasoning models: the missing benchmark https://t.co/B1E1RJX6VZ
We measured token usage across reasoning models: open models output 1.5-4x more tokens than closed models for identical tasks, but with huge variance depending on task type (until … pic.twitter.com/ly1083won8
– Nous Research (@NousResearch) August 14, 2025
“Open weight models use 1.5–4× more tokens than closed (up to 10× for simple knowledge questions), making them sometimes more expensive per query despite lower per-token costs,” the researchers wrote in their report, published on Wednesday.
The findings challenge a prevailing assumption in the AI industry that open-source models offer clear economic advantages over proprietary alternatives. While open-source models typically cost less per token, the study suggests this advantage can be “easily offset if they require more tokens to reason about a given problem.”
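The tradeoff is simple arithmetic: per-query cost is per-token price times tokens consumed. The sketch below uses purely hypothetical prices and token counts (not figures from the study) to show how a lower per-token price can be outweighed by higher token usage.

```python
# Illustrative cost comparison: per-query cost = price per token * tokens used.
# All numbers below are hypothetical, not from the Nous Research report.

def cost_per_query(price_per_million_tokens: float, tokens_per_query: int) -> float:
    """Dollar cost of a single query at a given per-token price."""
    return price_per_million_tokens * tokens_per_query / 1_000_000

# Hypothetical open-weight model: cheap per token, but verbose reasoning.
open_cost = cost_per_query(price_per_million_tokens=0.60, tokens_per_query=10_000)

# Hypothetical closed model: pricier per token, but ~4x fewer tokens per task.
closed_cost = cost_per_query(price_per_million_tokens=2.00, tokens_per_query=2_500)

print(f"open-weight: ${open_cost:.4f}/query")
print(f"closed:      ${closed_cost:.4f}/query")
```

Under these assumed numbers, the closed model ends up cheaper per query despite a per-token price more than three times higher.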
The actual costs of AI: why ‘cheaper’ models can break your budget
The study examined 19 different AI models across three categories of tasks: basic knowledge questions, mathematical problems, and logic puzzles. The team measured “token efficiency” (how many computational units models use relative to the complexity of their solutions), a metric that has received little systematic study despite its significant cost implications.
“Token efficiency is a critical metric for several practical reasons,” the researchers noted. “While hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem.”
The inefficiency is particularly pronounced for large reasoning models (LRMs), which use extended “chains of thought” to solve complex problems. These models, designed to think through problems step by step, can consume thousands of tokens pondering simple questions that should require minimal computation.
For basic knowledge questions such as “What is the capital of Australia?” the study found that reasoning models “spend hundreds of tokens pondering simple knowledge questions” that could be answered in a single word.
Which AI models actually deliver for your money
The research revealed major differences between model providers. OpenAI’s models, particularly its o4-mini and newly released open-source gpt-oss variants, demonstrated exceptional token efficiency, especially for mathematical problems. The study found OpenAI models “stand out for extreme token efficiency in math problems,” using up to three times fewer tokens than other commercial models.
Among open-source options, Nvidia’s llama-3.3-nemotron-super-49b-v1 emerged as “the most token-efficient open weight model across all domains,” while newer models such as Magistral showed “exceptionally high token usage” as outliers.
The efficiency gap varied considerably by task type. While open models used roughly twice as many tokens for mathematical and logic problems, the difference ballooned for simple knowledge questions where efficient reasoning should be unnecessary.

What enterprise leaders need to know about AI compute costs
The findings have immediate implications for enterprise AI adoption, where compute costs can scale rapidly with usage. Companies evaluating AI models often focus on accuracy benchmarks and per-token pricing, but may overlook the total computational requirements for real-world tasks.
“The better token efficiency of closed weight models often compensates for the higher API pricing of those models,” the researchers found when analyzing total inference costs.
The study also revealed that closed-source model providers appear to be actively optimizing for efficiency. “Closed weight models have been iteratively optimized to use fewer tokens to reduce inference cost,” while open-source models have “increased their token usage for newer versions, possibly reflecting a priority toward better reasoning performance.”

How researchers cracked the code on AI efficiency measurement
The research team faced unique challenges in measuring efficiency across different model architectures. Many closed-source models do not reveal their raw reasoning processes, instead providing compressed summaries of their internal computations to prevent competitors from copying their techniques.
To address this, the researchers used completion tokens (the total billed computational units for each query) as a proxy for reasoning effort. They found that “most recent closed source models will not share their raw reasoning traces” and instead “use smaller language models to transcribe the chain of thought into summaries or compressed representations.”
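A minimal sketch of that proxy, assuming the common API response shape where billed usage sits under a `usage` field with a `completion_tokens` count (the field names follow typical API conventions and are an assumption here, not the study’s actual harness):

```python
# Use billed completion tokens as a proxy for reasoning effort, since raw
# reasoning traces are often hidden. The mock response dicts mimic the usual
# API "usage" shape; the field names are an assumption for illustration.

from statistics import mean

def mean_reasoning_effort(responses: list[dict]) -> float:
    """Average billed completion tokens per query across a set of responses."""
    return mean(r["usage"]["completion_tokens"] for r in responses)

mock_responses = [
    {"usage": {"prompt_tokens": 42, "completion_tokens": 1_200}},
    {"usage": {"prompt_tokens": 39, "completion_tokens": 900}},
    {"usage": {"prompt_tokens": 45, "completion_tokens": 1_500}},
]

print(mean_reasoning_effort(mock_responses))
```

Averaging over many queries of the same task type is what makes the billed token count usable as an effort measure even when the underlying trace is summarized away.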
The study’s methodology included testing with modified versions of well-known problems to minimize the influence of memorized solutions, such as altering variables in math competition problems from the American Invitational Mathematics Examination (AIME).
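The memorization control can be sketched roughly as follows. The regex-based number swap below is a loose illustration of the idea (change the values in a known problem so a published answer cannot simply be recalled), not the study’s actual procedure, which would also re-derive the correct answer for each modified variant.

```python
# Perturb the integers in a known competition problem so a model cannot rely
# on a memorized answer. Illustrative only: a real pipeline must ensure the
# modified problem stays well-posed and must re-solve it from scratch.

import random
import re

def perturb_numbers(problem: str, rng: random.Random) -> str:
    """Replace every integer in the problem text with a nearby random value."""
    def swap(match: re.Match) -> str:
        n = int(match.group())
        return str(rng.randint(max(1, n - 5), n + 5))
    return re.sub(r"\d+", swap, problem)

rng = random.Random(0)
original = "Find the number of positive integers less than 1000 divisible by 7."
print(perturb_numbers(original, rng))
```

The perturbed text keeps the problem’s structure intact while changing its specifics, so a model must actually reason rather than retrieve.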

The future of AI efficiency: what comes next
The researchers suggest that token efficiency should become a primary optimization target alongside accuracy for future model development. “A more compact CoT will also allow for more efficient context use and may counter context degradation during challenging reasoning tasks,” they wrote.
The release of OpenAI’s open-source gpt-oss models, which demonstrate strong efficiency with freely accessible chains of thought, could serve as a reference point for optimizing other open-source models.
The full research dataset and evaluation code are available on GitHub, allowing other researchers to validate and extend the findings. As the AI industry races toward more powerful reasoning capabilities, this study suggests the real competition may not be about who can build the smartest AI, but who can build the most efficient one.
In a world where every token counts, the most wasteful models may ultimately be priced out of the market, regardless of how well they can think.



