The time has come! Yes, this week the Macalope will answer all questions about AI! All of them! Prepare to be dazzled and informed!
(Disclaimer: not a guarantee. Not all questions will be answered. Dazzling will only occur in a small number of instances. Information is fleeting. Void where prohibited.)
If you don’t think AI is a broken technology, just remember that right now an AI is harvesting this very column so it can regurgitate it back to someone in an authoritative manner as fact. If that doesn’t convince you it’s not to be trusted, nothing will.
Or, maybe this will: “Perplexity Is a Bullshit Machine.”
The aptly named Perplexity is another AI startup, one that ignores the explicit settings of websites that ask not to be scraped, then takes that content and uses it to provide made-up answers to questions. The Macalope is not a lawyer, but some of those answers seem possibly libelous.
In one case, the text it generated falsely claimed that Wired had reported that a specific police officer in California had committed a crime.
Possibly libelous to both the officer and Wired! It’s a two-fer!
VC Investor: “What does your system do?”
Startup: “It is a perpetual libel machine.”
VC Investor: “Here is a blank check.”
All over the web, site owners are rushing to add the various AI scraping gizmos to their robots.txt files in a vain effort to keep their content from being used without their consent. Several AI companies are simply ignoring these settings and taking the information anyway. People have said, well, they’re just doing the same thing Archive.org does. Archive.org ignores robots.txt in an effort to index everything. How can you be fine with that and not fine with what LLM companies are doing?
Because they are fundamentally different products.
“I want to provide a free service that will benefit everyone and help promote transparency on the internet.”
Fan-tastic. You go right ahead.
“I want to become a billionaire by selling the modern equivalent of a Magic 8 Ball and I plan to do it by using your work to fuel my accumulation of wealth.”
Uh, no.
Some have also said, “Well, none of this is illegal, so they can do what they want.” The Macalope supposes so, but he’s still going to hate it. Is plagiarism illegal? No. But it’s still wrong. And it can get you fired, which is what should happen to these AIs.
Fired into the sun, preferably.
It’s worth noting that this “open web” scraping is being used for different things. On the least egregious end, LLMs need a corpus of writing to learn how language works. This seems okay because we all use language. (This does get fuzzier, though, when you ask an AI to write in a particular writer’s style.) Then there are the models, such as those used in Apple’s Image Playground, that need to learn how to draw. The Macalope’s opinion here is pretty subjective, but this feels a bit more like copying for some reason. Possibly because we are not all artists. Then, of course, there are the AI systems that, in response to a question, say “Oh, yeah, I read about that somewhere. Here’s an answer that may or may not be correct and I may or may not give attribution to the site I read it on.” That one’s definitely a problem.
If this whole gross spectrum of behavior seems familiar to you, it’s probably because AI shares a certain DNA with crypto, NFTs, and the blockchain in that they are all trendy, usually touted by people you wouldn’t want to be stuck in an elevator with, and, in a weird coincidence, all happen to drive up both Nvidia’s stock price and worldwide temperature averages. The Macalope doesn’t consider himself someone prone to conspiracy theories, but he would not be surprised to find out years from now that Nvidia has been running a powerful psychological ops campaign that dreams up technologies that require its boards to run and then convinces venture capital firms to invest in them.
Just sayin’.
Ideally, Apple wouldn’t be associating itself with these AI companies at all, but the Market has demanded it and at least it’s taking an arm’s-length approach when working with them. It is a sad fact that all Apple has to do to be one of the more ethical AI companies is simply honor sites’ robots.txt settings. But it’s not enough for the company to say you can opt out now after it’s already availed itself of people’s hard work. Apple should not be following the “industry standard” practices here. It should be better than the industry.
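And honoring robots.txt is not exactly a heavy lift. For anyone wondering what all this opting out actually involves, it is a few lines of plain text sitting at the root of a website. A minimal sketch, using the publicly documented user-agent tokens for Apple’s, OpenAI’s, and Perplexity’s crawlers (the exact tokens are each company’s to define and can change, so treat this as an illustration rather than a definitive list):

# Please don’t use this site’s content to train Apple’s AI models
# (regular Applebot search indexing is controlled separately)
User-agent: Applebot-Extended
Disallow: /

# Same request to OpenAI’s and Perplexity’s crawlers
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

That’s the entire ask: read a short text file and take “no” for an answer.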