Perplexity AI has been building AI-powered search engines since its founding in 2022. In January, the company announced the launch of Sonar, an AI-driven search model built on Meta’s Llama 3.3 70B. It is designed to deliver fast, accurate answers and is optimized specifically for Perplexity’s search platform. Perplexity also touts Sonar’s speed: the ability to process up to 1,200 tokens per second while providing high-quality, real-time responses drawn from trusted web sources.
Then on Tuesday, Perplexity took to social media to hype Sonar’s performance. In a post on X, the company claimed that Sonar “outperforms GPT-4o-mini and Claude 3.5 Haiku while matching or surpassing top models such as GPT-4o and Claude 3.5 Sonnet in user satisfaction.” Now, that may sound impressive on the surface, but the claim doesn’t tell the full story. In fact, without further context it feels a bit misleading.
We get it – startups like Perplexity are always trying to push boundaries. But a claim this bold has to stand up to scrutiny, and it raises some questions.
A closer look at Perplexity’s Sonar performance claims
At face value, Perplexity’s statement suggests that Sonar is not only fast but superior to some of the most advanced models available, including OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet. However, “user satisfaction” is a rather vague metric. Are they measuring speed, accuracy, response quality, or something else? Without details, it is hard to know what “outperform” actually means.
And when Sonar is compared against GPT-4o-mini and Claude 3.5 Haiku, the deck feels stacked. Both of those models are “lighter” versions of stronger counterparts, optimized for efficiency rather than peak performance. So, yes, Sonar might beat them, but that’s not a fair fight.
Built on Llama 3.3 70B, Perplexity Sonar outperforms GPT-4o-mini and Claude 3.5 Haiku, matching or exceeding top models GPT-4o and Claude 3.5 Sonnet in user satisfaction.
At 1200 tokens/sec, Sonar is optimized for answer quality and speed. pic.twitter.com/cnhb39pevv
– Perplexity (@perplexity_ai) February 11, 2025
Comparing a Porsche to a truck
A better way to see this is through a simple analogy. Large language models (LLMs) like GPT-4o and Claude 3.5 Sonnet are like heavy-duty trucks: they can carry a lot of reasoning, but they move at a steadier pace. Sonar, on the other hand, is like a Porsche: lightweight, fast, and optimized for a specific purpose, in this case quickly retrieving real-time web data.
Perplexity’s claim that Sonar is “faster” than models like GPT-4o is like saying, “Look, this Porsche is faster than that truck!” Sure, that’s true. But the truck wasn’t built for speed; it was built to haul weight. Comparing the two without mentioning their different purposes is misleading.
Another way to think about it is comparing a microwave to a chef. A microwave heats food quickly, but that doesn’t make it better than a trained chef who can cook a gourmet meal from scratch. Similarly, Sonar may fetch facts faster, but that doesn’t mean it can think, reason, or create at the same level as GPT-4o or Claude 3.5 Sonnet.
Speed isn’t everything
Perplexity claims that Sonar can process 1,200 tokens per second, which is fast. But speed alone does not make a model better. Quick responses are great, but what’s the point if they come at the cost of depth, consistency, or accuracy? Without solid benchmarks showing that Sonar maintains high-quality output at that speed, the figure sounds like marketing fluff.
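For a rough sense of what that number means in practice, here is a back-of-envelope sketch. It assumes the common rule of thumb of roughly 0.75 English words per token and a typical 300-word answer; neither figure comes from Perplexity.

```python
# Back-of-envelope: how long would one answer take at 1,200 tokens/sec?
# Assumes ~0.75 English words per token (a rough rule of thumb, not a Perplexity figure).
TOKENS_PER_SECOND = 1200
WORDS_PER_TOKEN = 0.75

answer_words = 300                              # a typical few-paragraph answer
answer_tokens = answer_words / WORDS_PER_TOKEN  # ~400 tokens
seconds = answer_tokens / TOKENS_PER_SECOND     # ~0.33 seconds

print(f"~{answer_tokens:.0f} tokens -> ~{seconds:.2f} s to generate")
```

In other words, a typical answer would stream in well under a second, which is exactly why the number makes for good marketing; the open question is whether quality holds up at that pace.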
Why transparency is important
Perplexity AI’s claims about Sonar may sound impressive, but they are not fully backed up. If “user satisfaction” is the key metric, we need to know how it is being measured and whether it holds across different kinds of tasks, not just fact-based queries.
In an industry where accuracy and trust matter, bold statements without solid evidence can backfire. Sonar may shine in certain areas, but it is still unclear whether it truly “outperforms” the biggest names in the LLM space.
Final thoughts
Competition in AI is good. It drives progress. However, companies need to be upfront about what their models can actually do. If Perplexity wants to compete with the likes of OpenAI and Anthropic, clearer benchmarks and more honest claims will go a long way. Until then, it’s on all of us to dig a little deeper and look past the hype.