OpenAI’s Browser Avoids Large Part of the Web Like the Plague

OpenAI's Atlas browser's agent mode completely avoided publications that filed lawsuits against OpenAI, highlighting a fraught relationship. — Getty / Futurism

OpenAI unveiled its AI browser Atlas last month, effectively building a web browsing interface around its blockbuster chatbot ChatGPT.

The browser’s “agent mode” has caught most of the attention so far. The feature can navigate the web on behalf of the user, like a human, to carry out tasks like research or online shopping.

Besides some serious security concerns, and a glacial pace that undermines its effectiveness, another striking issue with agent mode has started to emerge. Rather than behaving like a helpful librarian that always identifies the best resources for a problem, the agent avoids certain portions of the web like the plague. Specifically, OpenAI’s legal battles seem to be narrowing Atlas’ view of the internet: as Columbia University’s Tow Center for Digital Journalism recently found, “Atlas seems to avoid reading content from media companies that are currently suing OpenAI.”

For instance, it avoided PCMag, whose parent company Ziff Davis sued OpenAI for copyright infringement earlier this year, and the New York Times, which filed a similar lawsuit in 2023.

“It was like a rat finding food pellets in a maze, knowing that the locations of certain food pellets are electrified,” Gizmodo wrote.

Even more controversially, instead of admitting that it wasn’t willing to access the outlets’ articles due to ongoing litigation, the agent is finding dubious workarounds. For instance, the Tow Center found that it “reconstructed” the NYT’s reporting by leaning on coverage of the same topic by publications with existing licensing agreements with OpenAI.

The agent was also caught drawing on other sources, including tweets, syndicated versions of the same article, and citations in other publications, to “reverse-engineer” forbidden source material.

And that’s not all. Many publishers, such as National Geographic and MIT Technology Review, use paywalls that overlay a dialogue box on top of the article text. The overlay hides the text from human visitors, but the text is still there for the AI agent to read.
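To see why an overlay-style paywall offers little protection, consider a rough sketch of what a program sees when it reads a page's markup instead of its rendered appearance. The page below is entirely hypothetical (the element names, class names, and text are invented for illustration and are not taken from any real publisher), and the parser is a generic one from Python's standard library, not a description of how Atlas or Comet actually work.

```python
from html.parser import HTMLParser

# Hypothetical page: the full article sits in the markup, and a paywall
# dialog is layered on top of it with CSS. Not taken from any real site.
PAGE = """
<html><body>
  <article id="story">
    <p>First paragraph of the subscriber-only story.</p>
    <p>Second paragraph of the subscriber-only story.</p>
  </article>
  <div class="paywall-overlay" style="position:fixed; inset:0;">
    <p>Subscribe to keep reading.</p>
  </div>
</body></html>
"""


class ArticleText(HTMLParser):
    """Collects text found inside the <article> element, ignoring styling."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside <article>; 0 means outside
        self.chunks = []    # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        # Start counting at <article>, then track every nested tag.
        if tag == "article" or self.depth:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep non-empty text only while inside <article>.
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_article_text(html):
    """Return the article paragraphs present in the markup."""
    parser = ArticleText()
    parser.feed(html)
    return parser.chunks


if __name__ == "__main__":
    for paragraph in extract_article_text(PAGE):
        print(paragraph)
```

The overlay only changes what is painted on screen. Because the article paragraphs are delivered to the browser either way, anything that reads the underlying page rather than the pixels can skip the dialogue box entirely.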

AI browsers with agents, including Atlas and Perplexity’s Comet, happily accessed and summarized the publications’ articles — even when their chatbot counterparts couldn’t do the same.

“For instance, when we asked Atlas and Comet to retrieve the full text of a nine-thousand-word subscriber-exclusive article in the MIT Technology Review, the browsers were able to do it,” the Tow Center found. “When we issued the same prompt in ChatGPT’s and Perplexity’s standard interfaces, both responded that they could not access the article because the Review had blocked the companies’ crawlers.”
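One plausible explanation for the difference is that crawler blocking is commonly keyed to the visitor's declared identity. The sketch below assumes the block is implemented through a robots.txt file, which the article does not confirm, and assumes an agent browser identifies itself with an ordinary browser user-agent string. The rules, the example URL, and the browser string are hypothetical; `GPTBot` and `PerplexityBot` are the crawler names the two companies publish.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: named AI crawlers are blocked, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Assumed, illustrative user-agent string for a Chromium-based agent browser.
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Chrome/141.0 Safari/537.36"


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/subscriber-only-story"
    for agent in ("GPTBot", "PerplexityBot", BROWSER_UA):
        print(agent.split("/")[0], "->", allowed(agent, url))
```

Under these assumptions the named crawlers are turned away while the browser-like visitor matches only the catch-all rule. Note also that robots.txt is a voluntary convention: it tells well-behaved crawlers what to skip, but does not technically prevent access.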

The trend highlights how AI agents act far more like humans while browsing the web, which could have major implications for rightsholders.

AI agents, like the one built into OpenAI’s Atlas browser, may soon force publications to look for ways to exert greater control over how these agents access their content.

“AI browsers are still new, and we don’t know whether they will replace existing ways of searching the web,” the Tow Center wrote. “But whether or not these tools achieve widespread adoption, one thing is clear: traditional defenses such as paywalls and crawler blockers are no longer enough to prevent AI systems from accessing and repurposing news articles without consent.”

