TWITTER_ARTICLE

Anthropic’s Sonnet 4.6 introduced “dynamic filtering” for web search, where…

Brief

Tom Crawshaw argues that the most important part of Anthropic’s Sonnet 4.6 release is not benchmark performance or the new 1 million-token context window, but a quieter web-search upgrade called dynamic filtering. Instead of forcing Claude to reason over raw search-result HTML filled with navigation elements, ads, and cookie notices, Sonnet 4.6 now generates and runs Python code to clean and filter the retrieved pages before analysis. According to Anthropic’s cited tests, this preprocessing step materially improves web-agent accuracy while reducing cost: Sonnet rose from 33.3% to 46.6% on BrowseComp and from 52.6% to 59.4% on DeepsearchQA, while average token usage dropped 24%; Opus also improved strongly on both benchmarks. Crawshaw frames this as especially important for automation builders using platforms like n8n, because the token savings compound across repeated runs. He also notes that Sonnet 4.6 is now free by default, exposes a 1 million-token context window in beta, and ships production-ready code execution and memory tools.

Why it matters

Anthropic’s Sonnet 4.6 introduced “dynamic filtering” for web search, where Claude writes and executes Python to preprocess search results, removing irrelevant HTML such as headers, sidebars, cookie banners, and ads before reasoning over the content.

Key details

  • Anthropic reported that dynamic filtering improved search-agent performance on BrowseComp from 33.3% to 46.6% for Sonnet and from 45.3% to 61.6% for Opus, while average token usage fell by 24%.
  • On DeepsearchQA, Sonnet improved from 52.6% to 59.4% and Opus from 69.8% to 77.3%, suggesting the filtering step boosts both retrieval quality and answer completeness on multi-site research tasks.
  • The post also highlights broader Sonnet 4.6 changes: it became the default free Claude model, added a 1 million-token context window in beta for API users in usage tier 4 with a beta header, and made code execution and memory tools generally available.
  • Tom Crawshaw argues the practical upgrade is cost efficiency rather than headline benchmarks, claiming users running web-search agents on Sonnet 4.5 are getting worse results while paying materially more than they would on 4.6.
Cleaned source text

title: @tomcrawshaw01: Not the benchmarks. Not the 1M context window. A quiet update called dynamic fil...

author: tomcrawshaw01

content_type: twitter_article

published: 2026-02-19T00:08:06+00:00

source_url: https://x.com/tomcrawshaw01/status/2024274757789897184

word_count: 917

Not the benchmarks. Not the 1M context window. A quiet update called dynamic filtering just made eve

Not the benchmarks. Not the 1M context window. A quiet update called dynamic filtering just made every AI agent workflow cheaper to run.

Anthropic dropped Sonnet 4.6 yesterday.

Within an hour, your timeline was flooded. Screenshots of benchmark charts. Side-by-side comparisons with GPT-5.2. The usual cycle.

And look, the model is genuinely impressive. Preferred over the $150/million-token Opus by 59% of developers. Free for everyone. A million-token context window that can swallow an entire codebase in one shot (currently in beta).

But that's what everyone is already covering.

I want to talk about the thing almost nobody is covering. Because it might matter more than any of those benchmarks if you're building AI automations in 2026.

Anthropic published a second blog post that got almost zero attention

On the same day they announced Sonnet 4.6, Anthropic quietly published a separate post about their web search tools.

No flashy benchmarks. No comparison charts. Just a technical update about how Claude handles search results now.

Most people scrolled right past it.

That was a mistake.

You can read the full dynamic filtering breakdown here: https://claude.com/blog/improved-web-search-with-dynamic-filtering

It's called dynamic filtering, and it changes how AI agents search the web

Here's the problem that every AI agent builder runs into.

You set up an agent. You give it web search. It goes out, pulls in results, and starts reasoning over the raw HTML from multiple websites. Headers, footers, navigation menus, cookie banners, ads. All of it crammed into the context window.

Your agent is now spending tokens reading junk. And worse, all that noise actually degrades the quality of the response. The signal gets buried in garbage.

This is what your AI agents have been doing every single time they search the web. You just didn't see it.

Claude now writes its own code to clean up search results before it reads them

That's what dynamic filtering is.

Before Sonnet 4.6, Claude would pull in raw search results and reason over all of it. Every irrelevant paragraph. Every sidebar. Every cookie notice.

Now Claude writes and executes Python to filter the results first. It strips out the noise, keeps only what's relevant, and then reasons over the clean data.

The model is writing its own preprocessing code on the fly. It decides what's relevant, throws away what isn't, and gives you a cleaner answer from a smaller context window.

No prompt engineering trick. No custom code you had to build. It just happens at the model level now.

The results aren't subtle. 11% more accurate. 24% fewer tokens.

Anthropic tested this across two benchmarks.

On BrowseComp, which tests whether an agent can dig through multiple websites to find a specific piece of information, Sonnet jumped from 33.3% to 46.6%. Opus went from 45.3% to 61.6%.

On DeepsearchQA, which tests whether an agent can systematically find every correct answer to a research query, the gains were just as clear. Sonnet's score went from 52.6% to 59.4%. Opus from 69.8% to 77.3%.

Oh, and token usage dropped by 24% on average. Same tasks. Better results. You're just paying less for them now.

If you're running AI agents that search the web inside n8n or any other platform, that 24% compounds across every single execution.

While you're here, don't sleep on these other Sonnet 4.6 updates

The dynamic filtering story is the one I wanted to make sure you saw. But there are a few other things from this release that operators should know about. You can read the full Sonnet 4.6 announcement here: https://www.anthropic.com/news/claude-sonnet-4-6

Sonnet 4.6 is now the default free model. You don't need a Pro plan to use it. Everyone gets it.

The context window is now 1 million tokens in beta. On the API, you'll need to be in usage tier 4 and pass a specific beta header to access it. But when it's fully rolled out, that's enough to hold entire documentation sets, full contracts, or dozens of research papers in a single request.

Computer use took a major leap. Early users are reporting human-level performance on tasks like navigating complex spreadsheets and filling out multi-step web forms across multiple browser tabs.

Code execution and memory tools are now generally available on the API. No more beta flags. These are production-ready.

Every one of these updates makes AI agents more capable. But dynamic filtering is the one that directly lowers your cost while improving your output. That's why it deserved its own spotlight.

If you're still running Sonnet 4.5, it's time to switch

This isn't one of those incremental updates you can ignore for a few months.

If you have AI agents that search the web, you're paying more and getting worse results on 4.5 than you would on 4.6. Not by a little. By 24%.

If you're building new automations, Sonnet 4.6 should be your default starting point. The cost math changed overnight.

And if you're not building AI agents yet, pay attention to the trajectory here. These tools are getting dramatically better, dramatically faster. The gap between people who are building with them and people who are watching from the sidelines gets wider every month.

I break down updates like this every week

What changed. What it actually means for operators. And how to use it in your workflows before everyone else catches on.

That's what The AI Operator's Playbook is for.

Join the newsletter and get it in your inbox every week.

Join The AI Operator's Playbook → https://learnn8nautomation.com/newsletter

Posted: 2026-02-19T00:08:06.000Z

Engagement: 871 likes, 57 retweets, 20 replies