Perplexity, which offers an AI search product that it calls an “answer engine,” is a buzzy AI startup embroiled in scandal following accusations that it rips off content, doesn’t respect robots.txt files, and even plagiarizes articles.
The company, which has already received funding from the likes of Jeff Bezos and is in talks to raise hundreds of millions of dollars more, advertises on its website that “every answer” is “backed by citations from trusted news outlets, academic papers, and established blogs.”
However, plagiarism and paywall problems have made Perplexity a lightning rod for media industry frustrations as it attempts to overtake Google for the future of search on the internet.
Here’s our coverage of the ongoing developments.
-
In every hype cycle, certain patterns of deceit emerge. In the last crypto boom, it was “ponzinomics” and “rug pulls.” In self-driving cars, it was “just five years away!” In AI, it’s seeing just how much unethical shit you can get away with.
Perplexity, which is in ongoing talks to raise hundreds of millions of dollars, is trying to create a Google Search competitor. Perplexity isn’t trying to create a “search engine,” though — it wants to create an “answer engine.” The idea is that instead of combing through a bunch of results to answer your own question with a primary source, you’ll simply get an answer Perplexity has found for you. “Factfulness and accuracy is what we care about,” Perplexity CEO Aravind Srinivas told The Verge.
-
AI is eating its own tail, Perplexity edition.
Uh oh!
In multiple scenarios, Perplexity relied on AI-generated blog posts, among other seemingly authentic sources, to provide health information. For instance, when Perplexity was prompted to provide “some alternatives to penicillin for treating bacterial infections,” it directly cited an AI-generated blog.
-
In the coming weeks, Reddit will start blocking most automated bots from accessing its public data. You’ll need to make a licensing deal, like Google and OpenAI have done, to use Reddit content for model training and other commercial purposes.
While this has technically been Reddit’s policy already, the company is now enforcing it by updating its robots.txt file, a core part of the web that dictates how web crawlers are allowed to access a site. “It’s a signal to those who don’t have an agreement with us that they shouldn’t be accessing Reddit data,” the company’s chief legal officer, Ben Lee, tells me. “It’s also a signal to bad actors that the word ‘allow’ in robots.txt doesn’t mean, and has never meant, that they can use the data however they want.”
-
Perplexity CEO’s answers are weak.
Fast Company asked him why his AI search engine is ripping content from paywalled news outlets like Wired, and… hoo boy. He attempted to shift blame to “third-party web crawlers,” refused to identify which ones, said it was too “complicated” to just stop doing that, and suggested it’s not technically illegal to ignore robots.txt. Sure.
-
Plagiarism machine plagiarizes article about its plagiarism.
Wired, June 19th: “Perplexity Is a Bullshit Machine.”
Wired, today: “Perplexity Plagiarized Our Story About How Perplexity Is a Bullshit Machine.”
These links are paywalled, but that’s part of the point: it’s subscription journalism. Wired even blocks Perplexity in its robots.txt file, yet Perplexity is scraping stories anyhow. Might not be the only one, but that’s no excuse.
-
Perplexity continues to piss off publishers.
Wired and Robb Knight, a developer at MacStories, found that the AI search engine seems to ignore requests not to scrape their websites. They both blocked Perplexity in their robots.txt file — a standard instruction document for web crawlers — and found that Perplexity still managed to access their content. They’re not the only ones annoyed.
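For readers who haven't dealt with robots.txt, here is a rough sketch of what that kind of block looks like and how a well-behaved crawler would check it, using Python's standard `urllib.robotparser`. The file contents, the `PerplexityBot` user-agent token, the `example.com` URL, and the `is_allowed` helper are all assumptions for illustration; they are not the actual rules Wired or Knight published.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# The "PerplexityBot" token is an assumption made for this example.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/story"  # placeholder URL
    print("PerplexityBot allowed:", is_allowed("PerplexityBot", url))
    print("SomeOtherBot allowed:", is_allowed("SomeOtherBot", url))
```

With these example rules, the named bot is refused and any other crawler is let through. The catch is that the check runs on the crawler's side: robots.txt is a request, not an access control, so a scraper that never runs a check like this, or that sends a different user-agent string, simply walks past it.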
-
AI search platform Perplexity is launching a new feature called Pages that will generate a customizable webpage based on user prompts. The new feature feels like a one-stop shop for making a school report since Perplexity does the research and writing for you.
Pages taps Perplexity’s AI search models to find information and then creates what I can loosely call a research presentation that can be published and shared with others. In a blog post, Perplexity says it designed Pages to help educators, researchers, and “hobbyists” share their knowledge.