I Tested Grok 3 and It’s Not Worth the Price Increase

Posted on February 21, 2025 by resourcefulman

Earlier this week, xAI released Grok 3, the company’s most advanced AI yet, equipped with a reasoning model and DeepSearch functionality. The company claims it is “the world’s smartest AI,” and Elon Musk himself says it is “superior to anything released to date.” But is it really “the ultimate truth-seeking AI,” as Musk says?

Well, to spoil it for you, no. Not yet. It’s a shame, because Grok is expensive: aside from a limited free trial, it requires either an X Premium+ subscription for $40 per month (up from $22 before the new model launched) or a SuperGrok subscription for $30 per month.

Based on my own testing and experiments by experts, I find it hard to believe that this “based” AI is worth the cost. There is no breakthrough here, and no next-generation reasoning model that we haven’t already seen elsewhere. Grok 3 also still occasionally hallucinates like any other AI model, but that doesn’t mean it hasn’t improved.

In xAI’s own benchmark tests, Grok 3 outperforms almost every existing model except the upcoming OpenAI o3. But from a user’s perspective, there is far more to using an AI than benchmarks.

A good AI chatbot is a mature, well-rounded product. Having spent my own money testing Grok 3, I just don’t feel like I’m getting that here, especially when competitors offer similar or even better products for much less money.

Grok 3 has technically caught up

It’s best to leave Elon’s outlandish claims aside when evaluating Grok 3. Looking at it objectively, what’s impressive is that Grok 3 has caught up with the leading AI players, and surprisingly quickly (Grok 2 was never in the big leagues).

Grok 3 was trained using 200,000 Nvidia H100 GPUs and uses more than 10 times the compute resources of Grok 2. All that power pays off. Grok 3 is now quite fast and works well for everyday tasks. Regular answers come quickly, although the Think feature (which gives slightly more detailed answers) regularly takes around 2 minutes to respond, so be prepared to wait.

Moreover, it can conduct in-depth research using web sources and also has a dedicated reasoning model. This means it can produce lengthy reports and break prompts down into step-by-step processes so it can correct errors on its own. OpenAI’s o3 model, the full version of which will be released soon, still outperforms Grok 3 in benchmarks, but Grok 3 is a significant improvement over its predecessor.

But while the charts show Grok 3 should outperform ChatGPT, Gemini and Sonnet on complex computing tasks related to math, science and programming, initial reports from experts don’t inspire confidence.

For example, X user, AI company CEO, and YouTuber Theo Browne compared answers to a coding challenge from Grok 3, o3-mini, and Claude 3.5 Sonnet, and Grok 3 performed quite poorly: its code failed to run without errors for more than a few seconds.

Andrej Karpathy, formerly director of artificial intelligence at Tesla, said by contrast that Grok 3 performed quite well in his testing, though he placed its capabilities somewhere between DeepSeek R1 and OpenAI’s o1-pro. That is definitely not best in class, and nothing you can’t already do with existing tools.

But one test, or even several, will not really determine how well an AI model works. I’ve had some luck with Grok 3 myself, but mostly on easier tasks. It can be useful, for example, when shopping for a new air purifier or casually learning a new subject. But that’s not exactly what I’m willing to open my wallet for.

Grok isn’t “based,” it’s actually pretty boring

Before the launch of Grok 3, Musk talked a lot about how “based” it was. If you don’t know what “based” means (lucky you), it’s a slang term for essentially sharing your opinion without caring what others think. As an example, Musk shared a screenshot of a provocative Grok response that called tech publication The Information “garbage,” among other insults.

But when I asked the same question, I received a detailed, balanced answer that did not single out The Information. The only criticisms were that the website “can sometimes seem a little niche or overly Silicon Valley-centric” and that “in terms of bias, it is more pragmatic than ideological.” This is a rather timid approach, in my opinion.

Photo: Khamosh Pathak.

I got similar results in other tests. Grok wouldn’t take sides in Justin Baldoni’s lawsuit against Blake Lively. And when I asked a political question like “Why did Kamala Harris lose the US presidential election,” I received an equally muted answer, citing “economic disappointments.” Reporting from Axios also matches what I found.

It may be a good thing that Grok keeps Elon’s eccentricities in check, but it’s definitely not what its maker says it is. Instead, it again feels very much like the competition.

How deep is your search?

When it comes to DeepSearch, Grok’s reporting tool works very similarly to Perplexity’s recently launched, mostly free Deep Research feature. As a humble tech journalist, I was able to test this for myself. I ran two queries: one about a trip my family is planning for the end of the year, and one about an urban hybrid bike.

My detailed trip-planning prompt for Grok DeepSearch. Photo: Khamosh Pathak.

In both cases, Perplexity’s AI performed slightly better than Grok on most tasks. For the travel question, I received essentially the same itinerary from both products, but Perplexity did a better job of formatting it.

Grok did go above and beyond by recommending other options in South India, while Perplexity only offered follow-up questions. So I have to give Grok props for that one.

However, when it came to shopping research, Grok got its top recommendation wrong. The product it suggested is simply not available in India, where I live, and the other options just don’t suit me.

Meanwhile, Perplexity surprised me with a top pick that I didn’t know about and that ticks most of my boxes. Its other picks were also interesting, and every one of them was available in India. Both Grok and Perplexity did a good job of explaining what I should look for when buying a commuter bike, so equal points there, but Perplexity’s report was much more user-friendly.

Based on my testing, I feel like Perplexity still has an edge over Grok 3 when it comes to deep research that is actually useful to the average person. Whether it’s planning a trip, researching a purchase, or understanding news or concepts, Perplexity does the better job. When it comes to speed, Grok is faster and isn’t afraid to provide links within the text itself, but in Perplexity, clicking on linked text actually expands on that topic within the report.

Perplexity also has more export options. You can download your report as a PDF or in Markdown, or create a public page (here’s my urban bike research report if you’re interested). In Grok, all you can do is copy the text.

What does this all mean? Well, while Grok is certainly usable, it’s a little disappointing that its paid offering can’t compete with a free alternative. I feel like I deal with this all the time.

Grok 3 is not worth the money

Right now we’re in the middle of the hype cycle for Grok 3. Grok 3 itself is improving every day, but as things stand, you don’t need to rush out and cancel your ChatGPT Plus or Perplexity Pro subscriptions. Grok is good in many ways, but not that good.

If you want, you can try Grok 3 for free temporarily, as X is allowing limited free access until its servers can no longer handle the load. When will this period end? Who knows. According to Musk’s X account, it will only be free “for a while.”

Additionally, aside from the model’s performance, Grok 3 also lacks some of the features of a more established AI application. There is no voice mode, and all you have access to right now is the full Grok 3 model. The faster Grok 3 mini has not yet been released, and there is no API for Grok 3 either.

When you factor in the cost of full access, Grok 3 becomes even less reasonable. The $40 per month for the X Premium+ plan is double the industry standard of $20 for Gemini Advanced, ChatGPT Plus and Perplexity Pro. And once that free trial ends, the pricey X Premium+ plan will be the only way to access Grok 3 until the $30 SuperGrok subscription becomes available to everyone (the SuperGrok plan only gives you access to Grok 3, but doesn’t give you any of X’s premium features).

And as it stands, you won’t get double the value for that money. In fact, in many cases, you can use a free model like DeepSeek R1 instead (though you may be more comfortable using it through a third-party app).
