Diving into ChatGPT-5's web.search() Function

  • Writer: Chris Green
  • 3 min read

ChatGPT-5 has been launched, and Josh from Profound was quick off the mark to provide some more insights into how the search service works.


After seeing Charlie share potential system prompts for GPT-5 (I say "potential" because I'm still not convinced we can extract this kind of information with real accuracy), I wanted to sit down and see what was going on in the Conversations event stream: is it any different from ChatGPT 4.1, and does this change anything about how we do our work?


Key questions / observations


This is SUPER early days and requires a lot more study, but my initial findings:

  • Studies to understand citation sources (which search services are used within SonicBerry) are more important than ever if we want to really understand which engines we need to be visible for.

  • Does SonicBerry simply aggregate, or does it re-rank results before handing them to GPT-5? If re-ranking exists, what signals are used? This is a big one.

  • In my early checks, I haven’t seen examples of fan-out, but I’m not saying they’re not present/visible.

  • Classifiers for where a) a search is needed and b) search complexity will really help us understand what tools/functions are used to facilitate the search (how many queries, layered searches etc). Large-scale studies here + tracking for changes over time will be super actionable for AI search

  • Are there any differences in quality or comprehensiveness related to paid/free versions of SonicBerry that are used or is this just for tracking? Is the split provider access (e.g., Google/Bing API quotas) or index freshness? Could free users be hitting lower-rate API tiers, resulting in stale results?

  • If there is more clustering and cleaning of search results (with fallbacks), we may see better citation accuracy and fewer erroneous links included. Or are there better safeguards against "dangerous"/"legally questionable" content being surfaced? Maybe a combination thereof.

  • Why/When are fallbacks required? What causes this, and what are the implications?

  • How much does search rely on metadata vs checking the actual pages themselves? I'm assuming that accurate metadata is still critical, AND I'd still wager that having content crawlable without JS is important. But this needs more investigation.


Details from the web.search()


Here's the flow I'm seeing, as a summary: a classifier decides whether a search is needed → SonicBerry aggregates results from multiple sources → structured results are returned → content is retrieved → citations are mapped to the answer text → moderation checks run.

More detailed notes behind each section below.


Deciding when to search


  • Sonic Classifier (e.g., sonic_classifier_3cls_ev3) decides if search is needed.

  • Example: “What’s the weather in London?” → search_prob = 0.91 → triggers search.

  • Possible GPT-5 change: More classifier details (search_complexity, complex_search_prob) consistently visible in logs.
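To make the decision step above concrete, here's a minimal sketch of what acting on those classifier fields might look like. The event fields (search_prob, complex_search_prob) come from what's visible in the stream; the threshold value and the function itself are my assumptions, not anything OpenAI has documented.

```python
# Hypothetical sketch of the search-decision step. The 0.5 cutoff is an
# assumption — the real threshold isn't exposed in the event stream.
SEARCH_THRESHOLD = 0.5

def should_search(event: dict) -> bool:
    """Return True when the classifier's search probability clears the cutoff."""
    return event.get("search_prob", 0.0) >= SEARCH_THRESHOLD

event = {
    "model": "sonic_classifier_3cls_ev3",
    "search_prob": 0.91,          # e.g. "What's the weather in London?"
    "complex_search_prob": 0.12,  # low → likely a single, simple query
}

print(should_search(event))  # → True
```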


OpenAI Meta-Search Backend (SonicBerry)

SonicBerry is already documented as being the Meta-Search “middleman” between ChatGPT and the web - this isn’t JUST Google OR Bing, or even just a combination of the two.

  • Aggregates results from multiple licensed/public sources. Example: model ID = current_sonicberry_paid (Plus user) vs. current_sonicberry_unpaid_oai (free user). Does paid/unpaid impact the results or result quality? Needs investigation.

  • Not new for GPT-5 — tier depends on subscription, not model version.

  • When probed, ChatGPT-5 suggests it could include Google, Bing and other services - could be pure speculation, though.
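As an illustration of what "meta-search aggregation" could mean in practice, here's a toy merge step: pull result lists from several providers and de-duplicate by URL. The provider names and the merge logic are entirely assumptions — SonicBerry's internals aren't documented.

```python
# Illustrative-only sketch of a meta-search aggregation step.
# Provider names and merge behaviour are assumptions.
def aggregate(provider_results: dict) -> list:
    """Merge result lists from several providers, de-duplicating by URL."""
    seen, merged = set(), []
    for provider, results in provider_results.items():
        for r in results:
            if r["url"] not in seen:
                seen.add(r["url"])
                merged.append({**r, "source": provider})
    return merged

results = aggregate({
    "provider_a": [{"url": "https://example.com/a", "title": "A"}],
    "provider_b": [{"url": "https://example.com/a", "title": "A"},  # duplicate
                   {"url": "https://example.com/b", "title": "B"}],
})
print(len(results))  # → 2
```

If a re-ranking step exists (one of my open questions above), it would presumably sit after this merge.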


Structured Results

  • JSON results with URL, title, snippet, attribution, sometimes pub date.

  • GPT-5 appears to use search_result_groups (e.g., all TechRadar links grouped together).
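A guess at how search_result_groups might be produced: grouping raw results by publisher domain, matching the "all TechRadar links grouped together" behaviour. The grouping key and output shape are assumptions inferred from the event stream, not a documented format.

```python
# Assumed grouping step behind search_result_groups: bucket results by domain.
from collections import defaultdict
from urllib.parse import urlparse

def group_results(results: list) -> list:
    """Group flat search results into per-domain buckets."""
    groups = defaultdict(list)
    for r in results:
        groups[urlparse(r["url"]).netloc].append(r)
    return [{"domain": d, "entries": e} for d, e in groups.items()]

grouped = group_results([
    {"url": "https://www.techradar.com/x", "title": "X"},
    {"url": "https://www.techradar.com/y", "title": "Y"},
    {"url": "https://example.com/z", "title": "Z"},
])
print([g["domain"] for g in grouped])  # → ['www.techradar.com', 'example.com']
```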


Content Retrieval


Citation Framework

Some hints here as to potential improvements to how content is cited:

  • Links claims to sources.

  • GPT-5 uses grouped_webpages, safe_urls, fallback_items, and maps citations to specific text spans.

  • Example: “Wix introduces AI Visibility Overview¹” → ¹TechRadar.
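The span-mapping behaviour above could be sketched like this: each citation ties a character range in the answer text to a source URL, and markers are inserted at the end of each cited span. The data shapes are inferred from the grouped_webpages / text-span behaviour, not from any published schema.

```python
# Hedged sketch of span-level citation mapping. Field names are assumptions.
answer = "Wix introduces AI Visibility Overview."
citations = [
    {"start": 0, "end": len(answer) - 1,  # span covers the claim, before the "."
     "url": "https://www.techradar.com/...", "marker": "¹"},
]

def render(answer: str, citations: list) -> str:
    """Insert each citation marker at the end of its cited text span."""
    out, offset = answer, 0
    for c in sorted(citations, key=lambda c: c["end"]):
        pos = c["end"] + offset
        out = out[:pos] + c["marker"] + out[pos:]
        offset += len(c["marker"])
    return out

print(render(answer, citations))  # → Wix introduces AI Visibility Overview¹.
```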


Moderation

Possibly some more rigour in citations too - is “safe” another word for “accurate” OR “not likely to get OpenAI into legal troubles”?

  • Could it be that all URLs are checked before display? safe_urls suggests this.

  • GPT-5 Moderation events appear to be logged.
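If safe_urls really is a pre-display allowlist, the filtering step might look roughly like the sketch below. This is entirely an assumption — the actual moderation mechanics aren't visible in the event stream, only the field name.

```python
# Assumed pre-display filter: only citations whose URL passed moderation
# (i.e. appears in safe_urls) survive to the rendered answer.
def filter_citations(citations: list, safe_urls: set) -> list:
    """Keep only citations whose URL passed the moderation check."""
    return [c for c in citations if c["url"] in safe_urls]

safe = {"https://example.com/ok"}
kept = filter_citations(
    [{"url": "https://example.com/ok"},
     {"url": "https://example.com/blocked"}],
    safe,
)
print(len(kept))  # → 1
```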


Debug & Metadata

  • debug_sonic_thread_id links all search steps in a conversation.

  • GPT-5 shows more client metadata (search_source = "composer_auto").
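For anyone doing large-scale studies of these logs, debug_sonic_thread_id is the obvious join key: it lets you stitch a conversation's search steps back together from a flat event dump. Here's a minimal sketch; the other event fields are assumed for illustration.

```python
# Sketch: group a flat event log into per-thread search pipelines
# using debug_sonic_thread_id as the join key.
from collections import defaultdict

def by_thread(events: list) -> dict:
    """Bucket events by their debug_sonic_thread_id."""
    threads = defaultdict(list)
    for e in events:
        threads[e["debug_sonic_thread_id"]].append(e)
    return dict(threads)

log = [
    {"debug_sonic_thread_id": "t1", "step": "classify"},
    {"debug_sonic_thread_id": "t1", "step": "search"},
    {"debug_sonic_thread_id": "t2", "step": "classify"},
]
print(len(by_thread(log)["t1"]))  # → 2
```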



Here’s my initial dive, have you seen anything different? Do you agree/disagree with what I’ve summarised here? I’d love to hear your thoughts!
