#botshit
Text
The Coprophagic AI crisis
I'm on tour with my new, nationally bestselling novel The Bezzle! Catch me in TORONTO on Mar 22, then with LAURA POITRAS in NYC on Mar 24, then Anaheim, and more!
A key requirement for being a science fiction writer without losing your mind is the ability to distinguish between science fiction (futuristic thought experiments) and predictions. SF writers who lack this trait come to fancy themselves fortune-tellers who SEE! THE! FUTURE!
The thing is, sf writers cheat. We palm cards in order to set up pulp adventure stories that let us indulge our thought experiments. These palmed cards – say, faster-than-light drives or time-machines – are narrative devices, not scientifically grounded proposals.
Historically, the fact that some people – both writers and readers – couldn't tell the difference wasn't all that important, because people who fell prey to the sf-as-prophecy delusion didn't have the power to re-orient our society around their mistaken beliefs. But with the rise and rise of sf-obsessed tech billionaires who keep trying to invent the torment nexus, sf writers are starting to be more vocal about distinguishing between our made-up funny stories and predictions (AKA "cyberpunk is a warning, not a suggestion"):
https://www.antipope.org/charlie/blog-static/2023/11/dont-create-the-torment-nexus.html
In that spirit, I'd like to point to how one of sf's most frequently palmed cards has become a commonplace of the AI crowd. That sleight of hand is: "add enough compute and the computer will wake up." This is a shopworn cliche of sf, the idea that once a computer matches the human brain for "complexity" or "power" (or some other simple-seeming but profoundly nebulous metric), the computer will become conscious. Think of "Mike" in Heinlein's *The Moon Is a Harsh Mistress*:
https://en.wikipedia.org/wiki/The_Moon_Is_a_Harsh_Mistress#Plot
For people inflating the current AI hype bubble, this idea that making the AI "more powerful" will correct its defects is key. Whenever an AI "hallucinates" in a way that seems to disqualify it from the high-value applications that justify the torrent of investment in the field, boosters say, "Sure, the AI isn't good enough…yet. But once we shovel an order of magnitude more training data into the hopper, we'll solve that, because (as everyone knows) making the computer 'more powerful' solves the AI problem":
https://locusmag.com/2023/12/commentary-cory-doctorow-what-kind-of-bubble-is-ai/
As the lawyers say, this "assumes facts not in evidence." But let's stipulate that it's true for a moment. If all we need to make the AI better is more training data, is that something we can count on? Consider the problem of "botshit," André Spicer and co's very useful coinage describing "inaccurate or fabricated content" shat out at scale by AIs:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4678265
"Botshit" was coined last December, but the internet is already drowning in it. Desperate people, confronted with an economy modeled on a high-speed game of musical chairs in which the opportunities for a decent livelihood grow ever scarcer, are being scammed into generating mountains of botshit in the hopes of securing the elusive "passive income":
https://pluralistic.net/2024/01/15/passive-income-brainworms/#four-hour-work-week
Botshit can be produced at a scale and velocity that beggars the imagination. Consider that Amazon has had to cap the number of self-published "books" an author can submit at a mere three per day:
https://www.theguardian.com/books/2023/sep/20/amazon-restricts-authors-from-self-publishing-more-than-three-books-a-day-after-ai-concerns
As the web becomes an anaerobic lagoon for botshit, the quantum of human-generated "content" in any internet core sample is dwindling to homeopathic levels. Even sources considered to be nominally high-quality, from CNET articles to legal briefs, are contaminated with botshit:
https://theconversation.com/ai-is-creating-fake-legal-cases-and-making-its-way-into-real-courtrooms-with-disastrous-results-225080
Ironically, AI companies are setting themselves up for this problem. Google and Microsoft's full-court press for "AI powered search" imagines a future for the web in which search-engines stop returning links to web-pages, and instead summarize their content. The question is, why the fuck would anyone write the web if the only "person" who can find what they write is an AI's crawler, which ingests the writing for its own training, but has no interest in steering readers to see what you've written? If AI search ever becomes a thing, the open web will become an AI CAFO and search crawlers will increasingly end up imbibing the contents of its manure lagoon.
This problem has been a long time coming. Just over a year ago, Jathan Sadowski coined the term "Habsburg AI" to describe a model trained on the output of another model:
https://twitter.com/jathansadowski/status/1625245803211272194
There's a certain intuitive case for this being a bad idea, akin to feeding cows a slurry made of the diseased brains of other cows:
https://www.cdc.gov/prions/bse/index.html
But "The Curse of Recursion: Training on Generated Data Makes Models Forget," a recent paper, goes beyond the ick factor of AI that is fed on botshit and delves into the mathematical consequences of AI coprophagia:
https://arxiv.org/abs/2305.17493
Co-author Ross Anderson summarizes the finding neatly: "using model-generated content in training causes irreversible defects":
https://www.lightbluetouchpaper.org/2023/06/06/will-gpt-models-choke-on-their-own-exhaust/
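The intuition behind that finding can be seen in a toy sketch (my own illustration, not the paper's experiment, and assuming nothing about any particular model): fit a model to data, sample a new "training set" from the fitted model, refit, and repeat. Here the "model" is nothing fancier than word frequencies, but the mechanism is the one the paper formalizes – rare items that happen to draw zero samples vanish from the next generation and can never come back, so the distribution's tail erodes irreversibly.

```python
# Toy illustration of model collapse (not the paper's code).
# Each generation fits word frequencies to samples drawn from the
# previous generation's fitted distribution. A word that draws zero
# samples gets probability zero forever, so the tail erodes.
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 1000
sample_size = 5000      # synthetic "training set" size per generation
generations = 30

# Generation 0: a Zipf-like, human-ish distribution over the vocabulary.
probs = 1.0 / np.arange(1, vocab_size + 1)
probs /= probs.sum()

for gen in range(1, generations + 1):
    counts = rng.multinomial(sample_size, probs)  # sample "model output"
    probs = counts / counts.sum()                 # refit on that output
    surviving = int((probs > 0).sum())
    if gen == 1 or gen % 5 == 0:
        print(f"generation {gen:2d}: {surviving}/{vocab_size} words survive")
```

Run it and the surviving-word count only ever goes down – a crude, numerical version of "irreversible defects."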
Which is all to say: even if you accept the mystical proposition that more training data "solves" the problems that currently make AI totally unsuitable for the high-value applications invoked to justify the trillions in valuation analysts are touting, that training data is going to be ever-more elusive.
What's more, while the proposition that "more training data will linearly improve the quality of AI predictions" is a mere article of faith, "training an AI on the output of another AI makes it exponentially worse" is a matter of fact.
Name your price for 18 of my DRM-free ebooks and support the Electronic Frontier Foundation with the Humble Cory Doctorow Bundle.
If you'd like an essay-formatted version of this post to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2024/03/14/14/inhuman-centipede#enshittibottification
Image: Plamenart (modified) https://commons.wikimedia.org/wiki/File:Double_Mobius_Strip.JPG
CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0/deed.en
Text
When you get a new follower who’s not a bot but an actual mutual
sabakos · 3 months
Text
In any scam, any con, any hustle, the big winners are the people who supply the scammers – not the scammers themselves. The kids selling dope on the corner are making less than minimum wage, while the respectable crime-bosses who own the labs clean up. Desperate "retail investors" who buy shitcoins from Superbowl ads get skinned, while the MBA bros who issue the coins make millions (in real dollars, not crypto). It's ever been thus. The California gold rush was a con, and nearly everyone who went west went broke. Famously, the only reliable way to cash out on the gold rush was to sell "picks and shovels" to the credulous, doomed and desperate.
Con artists start by conning themselves, with the idea that "you can't con an honest man." But the factor that predicts whether someone is connable isn't their honesty – it's their desperation. The kid selling drugs on the corner, the mom desperately DMing her high-school friends to sell them leggings, the cousin who insists that you get in on their shitcoin – they're all doing it because the system is rigged against them, and getting worse every day.
The rise and rise of botshit cannot be separated from this phenomenon. The botshit in our search-results, our social media feeds, and our in-boxes isn't making money for the enshittifiers who send it – rather, they are being hustled by someone who's selling them the "picks and shovels" for the AI gold rush.
Cory Doctorow - Sympathy for the Spammer
librarianrafia · 26 days
Text
"These "hallucinations" are a stubbornly persistent feature of large language models, because these models only give the illusion of understanding; in reality, they are just sophisticated forms of autocomplete, drawing on huge databases to make shrewd (but reliably fallible) guesses about which word comes next:
Guessing the next word without understanding the meaning of the resulting sentence makes unsupervised LLMs unsuitable for high-stakes tasks. The whole AI bubble is based on convincing investors that one or more of the following is true:
I. There are low-stakes, high-value tasks that will recoup the massive costs of AI training and operation;
II. There are high-stakes, high-value tasks that can be made cheaper by adding an AI to a human operator;
III. Adding more training data to an AI will make it stop hallucinating, so that it can take over high-stakes, high-value tasks without a "human in the loop."
These are dubious propositions. There's a universe of low-stakes, low-value tasks – political disinformation, spam, fraud, academic cheating, nonconsensual porn, dialog for video-game NPCs – but none of them seem likely to generate enough revenue for AI companies to justify the billions spent on models, nor the trillions in valuation attributed to AI companies:
https://locusmag.com/2023/12/commentary-cory-doctorow-what-kind-of-bubble-is-ai/
The proposition that increasing training data will decrease hallucinations is hotly contested among AI practitioners. I confess that I don't know enough about AI to evaluate opposing sides' claims, but even if you stipulate that adding lots of human-generated training data will make the software a better guesser, there's a serious problem. All those low-value, low-stakes applications are flooding the internet with botshit. After all, the one thing AI is unarguably very good at is producing bullshit at scale. As the web becomes an anaerobic lagoon for botshit, the quantum of human-generated "content" in any internet core sample is dwindling to homeopathic levels.
This means that adding another order of magnitude more training data to AI won't just add massive computational expense – the data will be many orders of magnitude more expensive to acquire, even without factoring in the additional liability arising from new legal theories about scraping.
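To make the "sophisticated autocomplete" framing quoted above concrete, here is a minimal sketch (purely illustrative – not how any production LLM actually works): a bigram model that "writes" by repeatedly emitting a word that tended to follow the current one in its training text. It has no representation of meaning at all, only counts of which word follows which.

```python
# A bare-bones "autocomplete": count which word follows which in a tiny
# corpus, then generate text by sampling a plausible next word each step.
# No meaning, no understanding -- just adjacency statistics.
import random
from collections import Counter, defaultdict

corpus = (
    "the model guesses the next word the model has no idea what the "
    "words mean the model just counts which word follows which word"
).split()

# "Training": tally bigram counts.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

# "Inference": start somewhere and keep emitting likely next words.
random.seed(1)
word = "the"
output = [word]
for _ in range(12):
    candidates = following.get(word)
    if not candidates:
        break
    words, weights = zip(*candidates.items())
    word = random.choices(words, weights=weights, k=1)[0]
    output.append(word)

print(" ".join(output))
```

Scaling this trick up – vastly more data, vastly more context – makes the guesses shrewder, but the program never acquires any stake in whether the resulting sentence is true.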
stanfave3-72217 · 4 months
Text
Beware the ‘botshit’: why generative AI is such a real and imminent threat to the way we live | André Spicer | The Guardian
gerdfeed · 26 days
Quote
But the pitch from "AI art" companies is "fire your graphic artists and replace them with botshit." They're pitching a world where the robots get to do all the creative stuff (badly) and humans have to work at a robotic pace, with robotic vigilance, in order to catch the mistakes that the robots make at superhuman speed. Reverse centaurism is brutal.
Pluralistic: Humans are not perfectly vigilant (01 Apr 2024) – Pluralistic: Daily links from Cory Doctorow
Text
Vice surrenders
I'm on tour with my new novel The Bezzle! Catch me TONIGHT in LA with Adam Conover at Vroman's, then on MONDAY in Seattle with Neal Stephenson, then Portland, Phoenix and more!
Vice died the way it lived: being suckered in by smarter predators, even as it trained its own predatory instincts on those more credulous than its own supremely gullible leadership. RIP, we hardly knew ye.
For those of you who don't know, Vice was a Canadian media success story. It was founded by a motley clique of hipsters, one of whom – founder of the Proud Boys – has since grown to be one of the world's great fascism influencers. Another perfected the art of getting young people to work "for exposure" even as he built a massive, highly lucrative media empire on their free labor:
https://www.canadaland.com/podcast/vice-oral-history/
Eventually, Vice transitioned to a string of progressively worsening corporate owners, each more dishonest, predatory – and gullible – than the last. The company was one of the most enthusiastic marks for Facebook's infamous "pivot to video" – in which Mark Zuckerberg destroyed half the media industry by tricking them into thinking that the public was clamoring for video content, based on fraudulent viewing numbers:
https://en.wikipedia.org/wiki/Pivot_to_video
Vice went all-in on video, spending hundreds of millions to finance Zuckerberg's doomed attempt to conquer YouTube. But unlike the other rubes who got zucked, Vice found greater fools to scam, convincing giant, slow-moving media companies that the best way to get in on the Next Big Thing was to shower them with vast sums of string-free money:
https://en.wikipedia.org/wiki/Viceland_(Canadian_TV_channel)
And yet, at every turn, through a succession of increasingly incompetent owners who bought the stumbling, declining Vice at fire-sale prices and then proceeded to hack away at the wages and tools its journalists depended on while paying executives salaries so high that they beggared the imagination, Vice's reporters continued to turn out stellar material.
This went on literally until the last moment. The memorial posted by 404 Media rounds up a selection of major stories Vice's beleaguered, precarious writers produced even as Vice's vulture capitalist leadership were pulling the rug out from under them:
https://www.404media.co/behind-the-blog-vices-legacy-and-the-idea-that-the-internet-is-forever/
True to form, those private equity scumbags locked all those workers out of the company's CMS without notice – and then forgot to lock down the podcasting back-end. That allowed a group of Vice veterans – Matthew Gault, Emily Lipstein, Anna Merlan, Tim Marchman and Mack Lamoureux – to gather for a totally unauthorized, tell-all session that they pushed out on an official Vice channel:
https://www.youtube.com/watch?v=TKT4OtDEJRA
It's a hell of a listen. Not only do these Vice veterans have lots of fascinating history to recount, but they also describe the conditions under which those blockbuster stories of Vice's final days were produced. As the "visionary leaders" of the company paid themselves millions, they halted payments to key suppliers, from LexisNexis to the interview transcription service the writers depended on. Writers paid out of pocket to search PACER court records.
Not only did Vice's reporters do incredible work under terrible and worsening circumstances, but the Vice writers who got out ahead of the total collapse are also doing incredible work. 404 Media is a writer-owned investigative news publisher founded by four Vice escapees – Samantha Cole, Jason Koebler, Emanuel Maiberg and Joseph Cox, which is both producing incredible work and sustaining the writers who founded it:
https://www.404media.co/
All of which leads to an inescapable conclusion: whatever problems Vice had, they didn't include "writers don't do productive work" and also didn't include "that work isn't economically viable." Whatever problems Vice had, they weren't problems with Vice's workers – they were problems with Vice's bosses.
Which makes Vice's final, ignominious punishment at the hands of those bosses even more brutal, stupid and inexcusable. According to the leaked memos emanating from the company's investors and their millionaire C-suite toadies, the business's new strategy is abandoning their website in order to publish on social media.
This is…I mean, this…
This is…
Wow.
I mean, wow.
The thing is, the social media business model is a giant rug-pull. They're not even bothering to hide their playbook anymore. For social media, the game is to encourage media companies to become reliant on third parties to reach their audiences. Once that reliance is established, the companies turn down – or even halt – the ability of those media companies to reach their audience altogether. Then, they charge the media companies to reach their audiences:
https://www.eff.org/deeplinks/2023/06/save-news-we-need-end-end-web
Now, this wasn't always quite so obvious. Back when Vice was falling for Facebook's "pivot to video," it wasn't completely obvious that the long con was to take your audience hostage and ransom them back to you. But deliberately organizing your business to be reliant on social media barons today? It's like trusting your money to Sam Bankman-Fried…in 2024.
If there was ever a moment when the catastrophic, imminent risk of trusting Big Tech intermediaries to sit between you and your customers or audience was obvious, it is now. This is not the moment to be "social first." This is the moment for POSSE (Post Own Site, Share Everywhere), a strategy that treats social media as a way of bringing readers to channels that you control:
https://pluralistic.net/2022/02/19/now-we-are-two/#two-much-posse
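As a concrete sketch of that workflow (illustrative only – the feed URL and length cap below are placeholders, not anything prescribed in the post above): read your own site's feed and turn each post into a short teaser that links back to the canonical page, so the platforms only ever get pointers, never the work itself.

```python
# Illustrative POSSE helper (a sketch, not a real tool): pull posts from
# your own site's RSS/Atom feed and format teasers that point readers
# back to the canonical URLs you control. Actually pushing the teasers
# to each platform is left out, since every platform's API differs.
import feedparser  # pip install feedparser

FEED_URL = "https://example.com/feed/"  # placeholder: your own site's feed
TEASER_LIMIT = 280                      # placeholder: short-post length cap

def make_teasers(feed_url: str, limit: int) -> list[str]:
    feed = feedparser.parse(feed_url)
    teasers = []
    for entry in feed.entries:
        title = entry.get("title", "untitled")
        link = entry.get("link", "")
        text = f"{title} {link}"
        if len(text) > limit:
            # Trim the title, never the link back to your own site.
            keep = max(0, limit - len(link) - 2)
            text = f"{title[:keep]}… {link}"
        teasers.append(text)
    return teasers

if __name__ == "__main__":
    for teaser in make_teasers(FEED_URL, TEASER_LIMIT):
        print(teaser)
```

The point of the design is the direction of the links: the canonical copy lives on infrastructure you control, and everything you push to a platform is just an advertisement for it.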
Predicting that a social media platform will rug the media companies that depend on it today doesn't take a Sun Tzu – as cunning strategies go, the hamfisted tactics of FB, Twitter and TikTok make gambits like "Lucy and the football" look like von Clausewitz.
The most bonkers part of this strategy is that it's coming from private equity bosses, who laud themselves as the great strategists of the 21st century, whose claim on so much of our global capital and resources is derived from their brilliant insight, which allows them to buy "distressed assets" like Vice, "restructure" them to find "efficiencies" and sell them on.
The reality is that PE goons – like other financiers – are basically herding animals. Everyone's hit on the tactic of buying up beloved media companies – from the 150-year-old Popular Science to modern publications like CNET – and then filling them with spammy garbage in the hopes that Google will fail to notice and continue to award them pride-of-place on search results pages:
https://pluralistic.net/2024/02/21/im-feeling-unlucky/#not-up-to-the-task
The fact that these billionaire brain-geniuses can't figure out how to "turn around" a site whose workers a) produce brilliant, popular, successful work; and b) depart to found successful firms that commercialize that work tells you everything about their ability to spot "a good business opportunity."
PE – like other mafiosi – only have one business-plan, the "bust out," where you invade a business that produces useful things, force them to pay your chosen suppliers sky-high fees for things they don't need, extract massive fees for your "management" and then walk away from the collapse:
https://pluralistic.net/2023/06/02/plunderers/#farben
If you'd like an essay-formatted version of this post to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2024/02/24/anti-posse/#when-you-absolutely-positively-dont-give-a-solitary-single-fuck
vaningyen · 3 months
Text
With the rise of generative AI and technologies such as ChatGPT, we could see the rise of a phenomenon my colleagues and I label “botshit”.
In a recent paper, Tim Hannigan, Ian McCarthy and I sought to understand what exactly botshit is and how it works.
Science!!!
ai-news · 4 months
Link
Unless checks are put in place, citizens and voters may soon face AI-generated content that bears no relation to reality. André Spicer is professor of organisational behaviour at the Bayes Business School at City, University of London. During 2023, the… #AI #ML #Automation
babyawacs · 4 years
Text
@law @law @harvard_law @ap @reuters @bbc_whys @france24 @haaretzcom @snowden
court demanded fortune access with initiative from within the bubble and so that it explains them their own realtime as it's an averted immunisation
bot flip
no case occurs, it's on a sign bankruptcy day fool trick that's charged on as repeated for years
that's botshit
that's arbitrarily procedural shitballing they'd try today fraud…