
GPT-5 Thinking in ChatGPT (a.k.a. Research Goblin) is good at search

220 comments · September 6, 2025

Related: Google's new AI mode is good, actually - https://news.ycombinator.com/item?id=45158586 - Sept 2025 (31 comments)

softwaredoug

I agree with Simon’s article, but I usually take “research” to mean comparing different kinds of evidence (not just the search part). Like evidence for the effectiveness of Obamacare. Or how some legal case may play out in the courts. Or how much The Critic influenced Family Guy. Or even what the best way is to use X feature of Y library.

I’ve found ChatGPT and other LLMs can struggle to evaluate evidence - to understand the biases behind sources - i.e. taking data from a sketchy think tank as gospel. I’ve also found in my work that the more reasoning, the more hallucination. Especially when gathering many statistics.

That plus the usual sycophancy can cause the model to really want to find evidence to support your position. Even if you don’t think you’re asking a leading question, it can really want to answer your question in the affirmative.

I always ask ChatGPT to directly cite and evaluate sources. And try to get it in the mindset of comparing and contrasting arguments for and against. And I find I must argue against its points to see how it reacts.

More here https://softwaredoug.com/blog/2025/08/19/researching-with-ag...

thom

Yeah, trying to make well-researched buying decisions, for example, is really hard because you'll just get quite a lot of opinions dominated by marketing material, which aren't well counterbalanced by the sort of angry Reddit posts or YouTube comments I'd often treat as red flags.

NothingAboutAny

I tried to use Perplexity to find ideal settings for my monitor, and it responded with a concise list of distinct settings and why. When I investigated the source, it was just people guessing and arguing with each other in the Samsung forums - no official or even backed-up information.

I'd love if it had a confidence rating based on the sources it found or something, but I imagine that would be really difficult to get right.

Moosdijk

I asked Gemini to do a deep research on the role of healthcare insurance companies in the decline of general practitioners in the Netherlands. It based its premise mostly on blogs and whitepapers on company websites whose job it is to sell automation software.

AI really needs better source validation. Not just to combat the hallucination of sources (which Gemini seems to do 80% of the time), but also to combat low-quality sources that happen to correlate well with the question in the prompt.

It's similar to Google having to fight SEO spam blogs; they now need to do the same in the output of their models.

simonw

Better source validation is one of the main reasons I'm excited about GPT-5 Thinking for this. It would be interesting to try your Gemini prompts against that and see how the results compare.

wodenokoto

But the really tricky thing is that sometimes it _is_ these kinds of forums where you find the best stuff.

When LLMs really started to show themselves, there was a big debate about what truth is, with even HN joining in on heated debates on the number of sexes or genders a dog may have, and whether it was okay for ChatGPT to respond with a binary answer.

On one hand, I did find those discussions insufferable, but the deeper question - what is truth, and how do we automate the extraction of truth from corpora - is super important and has somehow completely disappeared from the LLM discourse.

simonw

It would be interesting to see if that same question against GPT-5 Thinking produces notably better results.

gonzobonzo

> I’ve found ChatGPT and other LLMs can struggle to evaluate evidence - to understand the biases behind sources - i.e. taking data from a sketchy think tank as gospel.

This is what I keep finding: it mostly repeats surface-level "common knowledge." It usually takes a few back-and-forths to get to whether or not something is actually true - asking for the numbers, asking for the sources, asking for the excerpt from the sources where they actually provide that information, verifying to make sure it's not hallucinating, etc. A lot of the time, it turns out its initial response was completely wrong.

I imagine most people just take the initial (often wrong) response at face value, though, especially since it tends to repeat what most people already believe.

athrowaway3z

> It usually takes a few back-and-forths to get to whether or not something is actually true

This cuts both ways. I have yet to find an opinion or fact I could not make ChatGPT agree with as if objectively true. Knowing how to trigger (im)partial thought is a skill in and of itself, and something we need to be teaching in school ASAP. (Which some already are, in one way or another.)

gonzobonzo

I'm not sure teaching it in school is actually going to help. Most people will tell you that of course you need to look at primary sources to verify claims - and then turn around and believe the first thing they hear from an LLM, a Redditor, a Wiki article, etc. Even worse, many people get openly hostile to the idea that people should verify claims - "what, you don't believe me?"/"everyone here has been telling you this is true, do you have any evidence it isn't?"/"oh, so you think you know better?"

There was a discussion about Wikipedia here recently where a lot of people who are active on the site argued against taking the claims there with a grain of salt and verifying their accuracy for yourself.

We can teach these things until the cows come home, but it's not going to make a difference if people say it's a good idea and then immediately do the opposite.

eru

> Knowing how to trigger (im)partial thought is a skill in and of itself, and something we need to be teaching in school ASAP.

You are very optimistic.

Look at all the other skills we are trying to teach in school. 'Critical thinking' has been at the top of nearly every curriculum you can point a finger at for quite a while now. To minimal effect.

Or just look at how much math we are trying to teach the kids, and what they actually retain.

killerstorm

FWIW GPT-5 (and o3, etc.) is one of the most critical-minded LLMs out there.

If you ask for information which is e.g. academic or technical, it will cite information and compare different results, etc., without any extra prompt or reminder.

Grok 4 (at the initial release) was just reporting information in the articles it found without any analysis.

Claude Opus 4 also seems bad: I asked it to give a list of JS libraries of a certain kind in deep research mode, and it returned a document focused on market share and usage statistics. Looks like it stumbled upon some articles of that kind and got carried away by them. Quite bizarre.

So GPT-5 is really good in comparison. Maybe not perfect in all situations, but perhaps better than an average human.

eru

> So GPT-5 is really good in comparison. Maybe not perfect in all situations, but perhaps better than an average human.

Alas, the average human is pretty bad at these things.

btmiller

How are we feeling about the usage of the word research to indicate feature sets in LLMs? Is it truly representative of research? How does it compare to the colloquial “do your research” refrain used often during US election years?

softwaredoug

Well I will just need to start saying “critical thinking”? Or some other term?

I have a liberal arts background, so I use the term research to mean gathering evidence, evaluating its trustworthiness and biases, and avoiding thinking errors related to evaluating evidence (https://thedecisionlab.com/biases).

LLMs can fall prey to these problems as well. Usually it’s not just “reasoning” that gives you trouble; it’s the reasoning about evidence. I see this with Claude Code a lot. It can sometimes create some weird code, hallucinating functionality that doesn’t exist, all because it found a random forum post.

I realize though that the term is pretty overloaded :)

vancroft

> I always ask ChatGPT to directly cite and evaluate sources. And try to get it in the mindset of comparing and contrasting arguments for and against. And I find I must argue against its points to see how it reacts.

Same here. But it often produces broken or bogus links.

wer232essf

You make a great point: “research” isn’t just searching but weighing different kinds of evidence and understanding the biases behind them. I agree LLMs often fall short here, especially with statistics or nuanced reasoning, where they can hallucinate or lean too hard into confirmation. I’ve also seen the sycophancy effect you mention - the model tends to agree with whatever frame it’s given. Asking for direct citations and then challenging the model’s arguments, like you do, seems like a smart way to push it toward more balanced and critical responses.

lambda

I guess the part where I'm still skeptical is: Google is also still pretty good at search (especially if I avoid the AI summary with udm=14).
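(For anyone unfamiliar: udm=14 is the URL parameter for Google's plain "Web" results view, which skips the AI summary. Appending it to a search URL looks like this - the query is just an illustration:)

    https://www.google.com/search?q=wikipedia+encyclopedia+britannica&udm=14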

I'll take one of your examples: Britannica to seed Wikipedia. I searched for "wikipedia encyclopedia britannica". In less than 1 second, I got search results back.

I spend maybe 30 seconds scanning the page: past the Wikipedia article on Encyclopedia Britannica, past the Britannica article about Wikipedia, past a Reddit thread comparing them, past the Simple English Wikipedia article on Britannica, and past the Britannica article on Wiki. OK, there it is: the link to "Wikipedia:WikiProject Encyclopaedia Britannica", which answers your question.

Then, to answer your follow-up, I spend a couple more seconds searching Wikipedia for Wikipedia, and find in the first paragraph that it was founded in 2001.

So, let's say a grand total of 60 seconds of me searching, skimming, and reading the results. The actual searching was maybe 2 or 3 seconds of time total, once on Google, and once on Wikipedia.

Compared to nearly 3 minutes for ChatGPT to grind through all of that, plus the time for you to read it, and hopefully verify by checking its references because it can still hallucinate.

And what did you pay for the privilege of doing that? How much extra energy did you burn for this less efficient response? I wish that when linking to chat transcripts like you do, ChatGPT would show you the token cost of that particular chat.

So yeah, it's possible to do search with ChatGPT. But it seems like it's slower and less efficient than searching and skimming yourself, at least for this query.

That's generally been my impression of LLMs: it's impressive that they can do X. But when you add up all the overhead of asking them to do X, having them reason about it, checking their results, following up, and dealing with the consequences of any mistakes, the alternative of just relying on plain old search and your own skimming seems much more efficient.

animal531

I'm going to somewhat disagree based on my recent attempts.

Firstly, if we don't remove the Google AI summary then, as you rightly say, it makes the experience 10x worse. They try to still give an answer quickly, but the AI takes up a ton of space and is mostly terrible.

Googling for a GitHub repository just now, Google linked me to 3 resources, none of them the actual page. One was a clone with the same name, another was a garbage link, but luckily the third was a Reddit post by the same person which linked to the correct page.

GPT does take a lot longer, but the main advantage for me comes down to the scope of what you're looking for. In the above example I didn't mind Google, because the 3 links opened fast and I could scan and click through to find what I was looking for, i.e. I wanted the information right now.

But then let's say I'm interested in something a bit deeper, for example how they did the unit movement in StarCraft 2. This is a well-known question, so the links/info you get from either Google or GPT are all great. If I were searching this topic via Google, I'd then have to copy or bookmark the main topics to continue my research on them. Doing it via GPT, it returns the same main items, but I can very easily tell it to explain all those topics in turn, have it take notes, find source code, etc.

Of course, as in your example, if you're a doctor googling symptoms, or perhaps the real-world location of ABC, then the hallucination specter is a dangerous thing which you want to avoid at all costs. But for myself, I find that I can as easily filter LLM mistakes as I can noise/errors from manual searches.

My guess for the future of the Internet is that in N years there will be no such thing as manually searching for anything; everything will be assistant-driven via LLM.

plopilop

Agreed. I tried the first 3 examples:

* "Rubber bouncy at Heathrow removal" on Google had 3 links, including the one about SFO from which chatGPT took a tangent. While ChatGPT provided evidence for the latest removal date being of 2024, none was provided for the lower bound. I saw no date online either. Was this a hallucination?

* A reverse image lookup of the building gave me the blog entry, but also an Alamy picture of the Blade (admittedly this result could have been biased by the fact that the author had already identified the building as the Blade).

* The Starbucks cake pop Google search led me to https://starbuckmenu.uk/starbucks-cake-pop-prices/. I will add that the author bitching to ChatGPT about ChatGPT's hidden prompts in the transcript is hilarious.

I get why people prefer ChatGPT. It will do all the boring work of curating the internet for you, to provide you with a single answer. It will also hallucinate every now and then, but that seems to be a price people are willing to pay and ignore, just like the added cost compared to a single Google search. Now, I am not sure how this will evolve.

Back in the day, people would tell you to be wary of the Internet and that Wikipedia thing, and that you could get all the info you need from a much more reliable source at the library anyway, for a fraction of the cost. I guess that if LLMs continue to evolve, we will face the same paradigm shift.

IanCal

As a counterpoint, I asked that simple question to GPT-5 in auto mode and it started replying in two seconds, wrote fast enough for me to scan the answer, and gave me two solid links to read after.

With thinking it took longer (just shy of two minutes), but it compared a variety of different sources and came back with numbers and each statement in the summary sourced.

I’ve used GPT a bunch for finding things like bin information on the council site that I just couldn’t easily find myself. I’ve also sent it off to dig through PRs, specs and more for Matrix, where it found the features and experimental flags required to solve a problem I had. Reading that many proposals and checking what’s been accepted is a massive pain, and it solved this while I went to make a coffee.

simonw

I suggest trying that experiment again but picking the hardest of my examples to answer with Google, not the easiest.

dwayne_dibley

I wonder how all this will really change the web. In your manual mode, you, a human, are viewing and visiting webpages. But if one never needs to, and always interacts with the web through an agent, what does the web need to look like, and will people even bother making websites? Interesting times ahead.

gitmagic

I’ve been thinking about this as well. Instead of making websites, maybe people will make something else, like some future version of MCP tools/servers? E.g. a restaurant could have an “MCP tool” for checking opening hours, reserving a table, etc.
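As a rough sketch of what such a tool could look like (this uses the official MCP Python SDK's FastMCP helper; the restaurant name, tools, and data are invented for illustration):

    # Hypothetical restaurant MCP server - tool names and data are invented.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("restaurant")

    @mcp.tool()
    def opening_hours(day: str) -> str:
        """Return the restaurant's opening hours for a given weekday."""
        special = {"sunday": "closed"}
        return special.get(day.lower(), "12:00-22:00")

    @mcp.tool()
    def reserve_table(name: str, time: str, party_size: int) -> str:
        """Reserve a table and return a confirmation message."""
        # A real server would call into a booking system here.
        return f"Reserved a table for {party_size} under '{name}' at {time}."

    if __name__ == "__main__":
        mcp.run()  # serves the tools over stdio by default

An agent could then call opening_hours or reserve_table directly instead of scraping a website.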

rossant

Same. Websites won't disappear, but they may become niche or a thing of the past. Why create a new UI for your new service when you can plug into a "universal" personal agent AI?

diabllicseagull

I hope none of this happens and the web stays readable and indexable.

utyop22

V nice post. Captures my sentiment too

wilg

First, you don't have to spend the 60 seconds, and you can parallelize it with something else, so you get the answer effectively instantly. Second, you're essentially establishing that if an LLM can get it done in less than 60 seconds it's better than your manual approach, which is a huge win, as this will only get faster!

sigmoid10

For real. This is what it must have been like living in the early 20th century and hearing people say they prefer a horse to get groceries because it is so much more effort to crank-start a car. I look forward to the age when we gleefully reminisce about the time we had to deal with SEO spam manually.

lomase

I look forward to the day AI hype is as dead as blockchain hype.

Jordan-117

It really is great. When I was still on Reddit, I made regular use of the "Tip of My Tongue" sub to track down obscure stuff I half-remembered from years ago. It mostly worked, but there were a few stubborn cases that went unsolved, even after pouring every ounce of my Google Fu into the endeavor. I recently took the text of these unsolved posts and submitted them to Deep Research -- and within an hour, it had cracked four of them, and put me on track to find a fifth myself. Even if the reasoning part isn't entirely up to par, there's still something really powerful about being able to rapidly digest dozens of search results and pull out relevant information based on a loose description. And now I can have that kind of search power on demand in just a few minutes, without having to deal with Reddit's spambots and post filters and hordes of users who don't read the question or follow the sub's basic rules.

vahid4m

When it comes to information retrieval, you can get anything between links to existing documents and generated content based on that processed information. I agree that the second one is really powerful, amazing, and seemingly useful. But the fact that it can also be wrong in more cases, without my knowing, keeps being driven home when I use it for things I'm not good at and they just don't work as they should.

I just wish the business models could justify a confidence level being attached to the response.

larsiusprime

I find ChatGPT to be great at research too - but there are pathological failure modes where it is biased toward shallow answers that are subtly wrong, even when definitive primary sources are readily available online:

https://www.fortressofdoors.com/researchers-beware-of-chatgp...

Helmut10001

More recently, I find ChatGPT has become increasingly unreliable. It makes up almost every second answer, forgets context, or is just downright wrong. Maybe it's that these days I'm more and more used to dumping huge texts into the prompt for context, as AI Studio allows me to. Maybe ChatGPT isn't as good with such information. Gemini/AI Studio will stay on track even with 300k tokens consumed; it just needs a little nudge here and there.

ants_everywhere

This isn't really how you described it. You have an opinion that conflicts with the research literature. You published a blog about that opinion, and you want ChatGPT to accept your view.

Your view is grinding a political axe, and I don't think you're in a position to objectively assess whether ChatGPT failed in this case.

eru

Hmm, I suspect that if ChatGPT paid more attention to the German sources, it would perhaps find that supposedly right answer?

I wonder if asking ChatGPT in German would make a difference.

larsiusprime

What are you talking about? There are verifiable primary sources that ChatGPT was not citing. There are direct primary historical sources that lay out the full budget of the historical German colony in extreme detail and directly contradict assertions made in the Silagi paper. That's not a matter of opinion; that's a matter of verifiable fact.

Also, what “axe” am I grinding? The findings are specifically inconvenient for my political beliefs, not confirming my priors! My priors would be flattered if Silagi were correct about everything, but the primary sources definitively prove he’s exaggerating.

> You published a blog about that opinion, and you want ChatGPT to accept your view.

False, and I address this multiple times in the piece. I don’t want ChatGPT to mindlessly agree with me, I want it to discover the primary source documents.

ants_everywhere

From your blog you appear to be a Georgist, or inspired by Georgist socialism. And given that you appear to have a business and blog related to these subjects, you give the impression that you're a sort of activist for Georgism - i.e. not just researching it but trying to advance it.

So just zooming out, that's not the right sort of setup for being an impartial researcher. And in your blog post your disagreements come off to me as wanting a sort of purity with respect to Georgism that I wouldn't expect to be reflected in the literature.

I like Kant, but it would be a bit like me saying ChatGPT was fundamentally wrong for considering John Rawls a Kantian, because I can point to this or that paper where he diverges from Kant. I could even write a blog post describing this and pointing to primary sources. But Rawls is considered a Kantian, and for good reason, and it would (in my opinion) be misleading for me to say that ChatGPT hit a big failure mode because it didn't take my view on my pet subject as seriously as I wanted.

typpilol

Yeah, this isn't really a ChatGPT problem so much as a source credibility problem, no?

larsiusprime

It’s mostly that it was not citing verifiable - and available online - primary source documents, the way an actual researcher investigating this question would. This is relevant when it is billed as "Research Grade" or "PhD level" intelligence. I expect a PhD-level researcher to find the German-language primary sources.

jbm

Yes, this is very much my experience too.

Switching to GPT-5 Thinking helps a little, but it often misses things that it wouldn't have when I was using o3 or o1.

As an example, I asked it if there were any incidents involving Botchan in an onsen. This is a text that is readily available and must have been trained on; in the book, Botchan goes swimming in the onsen, and is then humiliated when, the next time he comes back, there is a sign saying "No swimming in the onsen".

GPT-5 gives me this, which is subtly wrong:

> In the novel, when Botchan goes to Dōgo Onsen, he notes the posted rules of the bath. One of them forbids things like:
> “No swimming in the bath.” (泳ぐべからず)
> “No roughhousing / rowdy behavior.” (無闇に騒ぐべからず)
> Botchan finds these signs funny because he’s exactly the sort of hot-headed, restless character who might be tempted to splash around or make noise. He jokes in his narration that it seems as though the rules were written specifically to keep people like him out.

Incidentally, Dōgo Onsen still has the "No swimming" sign - or it did when I went 10 years ago.

black_knight

I feel like the value of my Plus subscription went down when they released GPT-5; it feels like a downgrade from o3. But of course, OpenAI being not open, there is no way for me to know now.

simianwords

I found your article interesting and it is relevant to the discussion. To be honest, while I think GPT could have performed better here, I think there is something to be said about this:

There is value in pruning the search tree, because the deeper nodes are usually not reputable. I know you have cause to believe that Wilhelm Matzat is reputable, but I don't think that can be assumed generally. If you were to force GPT to blindly accept counterpoints from people, the debate would never end. And there has to be a pruning point at which GPT accepts this tradeoff: a less reputable or less well-known source may have a correct point, at the cost of GPT being incorrect more often from taking an incorrect analysis from a little-known source.

You could go infinitely deep into any analysis, and you will always have seemingly correct points on both sides. I think it is valid for GPT to prune the search at a point where it converges to what society at large believes. I'm okay with this tradeoff.

psadri

I do miss the earlier "heavy" models that had encyclopedic knowledge vs. the new "lighter" models that rely on web search. Relying on web search surfaces a shallow layer of knowledge (thanks to SEO and all the other challenges of ranking web results) vs. having ingested/memorized basically the entirety of human written knowledge, beyond what's typically reachable within the first 10 results of a web search (e.g. digitized offline libraries).

hamdingers

I feel the opposite. Before I can use information from a model's "internal" knowledge I have to engage in independent research to verify that it's not a hallucination.

Having an LLM generate search strings and then summarize the results does that research up front and automatically, I need only click the sources to verify. Kagi Assistant does this really well.

beefnugs

So does anyone have any good examples of it effectively avoiding the blogspam and SEO? Or being fooled by it? How often either way?

15123123aa

I find one thing it doesn't do very well is avoid marketing articles pushed by a brand itself. E.g. if I search "is X better than Y", it very likely lands on articles by the makers of brand X or Y and not a third-party reviewer. When I manually search on Google, I can spot marketing articles just by the URL.

simonw

Here's a good article about Google AI mode usually managing to spot and avoid social media misinformation but occasionally falling for it: https://open.substack.com/pub/mikecaulfield/p/is-the-llm-res...

mastercheif

I kept search off for a long time due to it tanking the quality of the responses from ChatGPT.

I recently added the following to my custom instructions to get the best of both worlds:

# Modes

When the user enters the following strings you should follow the following mode instructions:

1. "xz": Use the web tool as needed when developing your answer.

2. "xx": Exclusively use your own knowledge instead of searching the internet.

By default use mode "xz". The user can switch between modes during a chat session. Stay with the current mode until the user explicitly switches modes.

simianwords

There is a tradeoff here: the non-search models are internally heavy, while the search models are light but depend on real data.

I keep switching between both but I think I'm starting to prefer the lighter one that is based on the sources instead.

killerstorm

These models are still available: GPT-4.5, Gemini 2.5 Pro (at least the initial version - not sure if they optimized it away).

From what I can tell, they are pretty damn big.

Grok 4 is quite large too.

stephen_cagle

I think this is partially something I have felt myself as well. It would be interesting if these lighter web-search models would highlight the distinction between information that has been seen elsewhere vs. information that is novel for each page. Like, a view that lets me look at the things that have been asserted and see how many of the different pages assert those facts (vs. unmentioned vs. contradicted).

ants_everywhere

Most real knowledge is stored outside the head, so intelligent agents can't rely solely on what they've remembered. That's why libraries are so fundamental to universities.

gerdesj

"encyclopedic knowledge"

Have you just hallucinated that?

mritchie712

I was curious how much revenue a podcast I listen to makes. The podcast was started by two local comedians from Phoenix, AZ. They had no following when they started and were both in their late 30s. The odds were stacked against them, but they rank pretty high on the Apple charts now.

I looked into it years ago and couldn't find a satisfying answer, but GPT-5 went off, did an "unreasonable" amount of research, cross-referenced sources, and provided an incredibly detailed answer and a believable range.

creesch

> an incredibly detailed answer and a believable range.

Recently, these tools have started returning even more verbose answers. The absolute bullshit research paper that Google Gemini gives you is what turned me away from using it there. Now ChatGPT also seems to go for more verbose filler rather than actual information. It is not as bad as Gemini, but I did notice.

It makes me wonder if people find the results more credible with verbose reports like that, even if the verbosity actually obfuscates the information you asked it to track down to begin with.

I do like how you worded it as a believable range, rather than an accurate one. One of the things that makes me hesitant to use deep research for anything but low-impact, non-critical stuff is exactly that. Some answers are easier to verify than others, but given the way the sources are presented, it isn't always easy to verify the answers.

Another aspect is that my own skills at researching things are pretty good, if I may say so myself. I don't want them to atrophy, which easily happens with lazy use of LLMs where they do all the hard work.

A final consideration came from my experimenting with MCPs and my own attempts at creating a deep research tool I could use with any model. No matter the approach I tried, it is extremely heavy on resources and will burn through tokens like no other.

Economically, it just doesn't make sense for me to run against APIs. Which in my mind means it is heavily subsidized by OpenAI as a sort of loss leader - something I don't want to depend on, only to find myself facing a price hike in the future.

d4rkp4ttern

Has Deep Research been removed? I have a Pro subscription and just today noticed Deep Research is no longer shown as an option. In any case I’ve found using GPT-5 Thinking, and especially GPT-5 Pro with web search more useful than DR used to be.

indigodaddy

Pretty wild! I wonder how much high school teachers and college professors are struggling with the inevitable usage though?

"Do deep internet research and thinking to present as much evidence in favor of the idea that JRR Tolkein's Lord of the Rings trilogy was inspired by Mervyn Peake's Gormenghast series."

https://chatgpt.com/share/68bcd796-bf8c-800c-ad7a-51387b1e53...

sixtyj

Did you check the facts? Did you click through all the links and see what the sources are?

A while ago I bragged at a conference about how ChatGPT had "solved" something... Yeah, we know, it's from Wikipedia and it's wrong :)

currymj

The thing about students who cheat is that most of them are (at least in the context of schoolwork) very lazy and don't care if their work is high quality. I would guess waiting multiple minutes for Thinking mode to give thorough results is very unappealing. 4o or 4o-mini was already good enough for their purposes.

esafak

I was amused that it used the neologism 'steel-man' -- redundantly, too.

wtbdbrrr

Idea: workshops for teachers that teach them some kind of Socratic method that stimulates kids to support what they got from G with their own thinking, however basic and simple it may be.

Formulating the state of your current knowledge graph, that was just amplified by ChatGPT's research might be a way to offset the loss of XP ... XP that comes with grinding at whatever level kids currently find themselves ...

meshugaas

These answers take a shockingly long time to resolve considering you can put the questions into Brave search and get basically the same answers in seconds.

apparent

I like Brave but have found their search to be awful. The AI stuff seems decent enough, but the results populated below are just never what I'm looking for.

ignoramous

The thing is, with Chat+Search you don't have to click various links, sift through content farms, or be subject to ads and/or accidental malware download.

dns_snek

In practice this means that you get the same content-farm answer dressed up as a trustworthy one, without even getting the opportunity to exercise better judgement. God help you if you rely on them for questions about branded products; they happily rephrase the company's marketing materials as facts.

Pepe1vo

A counterexample to this: I asked it about NovaMin® 5 minutes ago and it essentially told me not to bother and to buy whatever toothpaste has >1450 ppm fluoride.

ekianjo

With the walls of low-quality sites optimized for SEO these days? Call me unconvinced.

milanhbs

I agree, I've found it very useful for search tasks that involve putting a few pieces together. For example, I was looking for the floor plan of an apartment building I was considering moving to in a country I'm not that familiar with.

Google: Found floor plans on architects' sites - but only ones of rejected proposals.

ChatGPT: Gave me the planning proposal number of the final design, with instructions to the government website where I can plug that in to get floor plans and other docs.

In this case, ChatGPT was so much better than Google at giving me a definitive source - instead of the other way around.

eru

About 'Britannica to seed Wikipedia': the German Wikipedia used Meyers Konversations-Lexikon https://en.wikipedia.org/wiki/Meyers_Konversations-Lexikon

j_bum

Is this the “Web Search”, “Deep Research”, or “Agent Mode” feature of ChatGPT?

Navigating their feature set is… fun.

simonw

It's not Deep Research or Agent Mode.

I select "GPT-5 Thinking" from the model picker and make sure its regular search tool is enabled.

j_bum

Good to know, I’ll try to just use this a bit more then. I always opt for one of the above modes, with varying degrees of success.

Not sure if you tend to edit your posts, but it could be worth clarifying.

Btw — my colleagues and I all love your posts. I’ll quit fanboying now lol.

jonahx

> This is excellent for satisfying curiosity, and occasionally useful for more important endeavors as well.

Small nit, Simon: satisfying curiosity is the important endeavor.

<3

dclowd9901

It feels like the difference between someone painstakingly brushing away eons of dirt and calcification on buried dino bones vs just picking them up off the ground.

In the former, the research feels genuine and in the latter it feels hollow and probably fake.

650REDHAIR

In my experience it’s “search Reddit and combine comments”.

dontdoxxme

There are searches where that is the best way for a human to get the answer too. It can also search the Internet Archive if you ask for historical details, so does it not just do what a good human researcher would do?

movedx01

Don't forget about the "ChatGPT 5 Pro" too :) which is a bit like Deep Research but not quite?

iguana2000

I believe this is just the normal mode. In my experience, you don't have to select the web search option to make it search the web. I wonder why they have web search as an option at this point (to force the LLM to search?).

yunohn

I have a feeling this is just ChatGPT 5 in thinking mode, with web search enabled at the profile level at least. Even without that, any request involving recent data or research and thinking will prompt it to think and search quite a bit, i.e. deep research.