The perpetrator of the first great data heist was Facebook. For years, we happily wrote posts, uploaded pictures, tagged friends and family, used marketplaces and scrolled through content, encouraged — perhaps even brainwashed — to develop a deep addiction to the platform.
It was all fun and games in the early years, until we learned the platform’s true goal. It wasn’t to “connect the world”; it was to bleed the world dry of data points. Facebook was actually a kind of data succubus, and its relentless harvesting allowed the company to build the most sophisticated character profiles ever assembled, which advertisers could target with unprecedented accuracy. To borrow the now all too familiar adage: Facebook wasn’t selling us a product; we were the product.
For the most part, Facebook got off lightly. It was hit with a few fines that amounted to pocket change. It denied knowing this or doing that, and Zuckerberg would defend the platform, claiming data was a small price to pay in the mission to connect the world (ironically, something the platform no longer does). When egregious scandals like the Cambridge Analytica data harvest made news, the public finally showed a little bit of outrage… until it didn’t. A few people sounded the alarm, but most carried on posting, engaging and poking each other (yup, still gross).
We should have learned at that moment that Big Tech is not to be trusted and that we were far too naïve with our personal data practices. We’d already let one company become far too powerful off the back of it, and we should be wary of letting it happen again.
That was the lesson. But heed it, we did not.
Instead, the majority don’t care. Millions of people still use Facebook, and the company continues to rake in an absolute fortune (over $120 billion this year alone) in advertising revenue.
And now, we’ve experienced the second great data heist, although this time, the safe was emptied before we even realized it.
I’m talking about AI companies.
When the first AI chatbots dropped into the mainstream, the reaction was equal parts fascination and trepidation. Then the penny dropped: what data did they train these things on? And at that moment, we realized we’d been robbed again.
They’d gobbled up almost everything.
In fact, they’ve consumed so much of our output — our videos, our writing, our art, our photographs, our code — that they’ve got almost nothing left to use.
We’re seeing the impact of this: progress between AI models has been stagnating. There have been stories that the scaling laws are done. The leaps between models are smaller and less impressive. The question of whether this is the ceiling for generative AI is becoming more pressing for the industry. Just the other day, Sam Altman was downplaying AGI (despite also being the guy who said we should be deeply concerned about it), perhaps as a way to temper expectations of what AI can achieve.
It all comes down to the data, and the models are running out of high-quality, human-created data to train on.
What I don’t get is, are we supposed to sympathize with that plight?
Personally, I think it’s what the whole industry deserves. Nobody has been compensated for this data. Nobody even got to opt in until they had already stolen most of it. A few overlords took everything we’ve ever made with literal blood, sweat and tears, and threw it into a database, all so some AI bot can regurgitate it back to us, minus all the beauty and craft, and with all the hallucinations of someone on acid whose brain just melted. And now they want to serve us this AI sludge in subscription services that could cost up to $200 a month?
In effect, they are offering us the chance to pay them to have our own work used against us.
That should cause outrage.
Yet, again, the majority don’t care. OpenAI claims ChatGPT now has over 300 million weekly users. It seems we’re happy to let the AI industry off the hook.
I saw someone the other day bemoaning “how long the rift between pro- and anti-AI will go on for.” I think we’re only at the start. Next year will see stronger pushback as generative AI continues to fail to justify the resources it uses, fails to justify the frankly insane money that’s been pumped into it, and fails to find genuine use cases beyond ‘do something a little faster.’ The biggest reason, though, is that GenAI will keep trying to weave its way into the creative industries, offering to automate the roles of so many, all off the back of the work those very people created.
It’s ironic. It’s sad. And if you’re not pissed off, you should be.
OpenAI's business model is, essentially, based upon theft. Imagine going into a business meeting and explaining to potential investors that your entire strategy is to steal people's stuff and then sell it back to them.
Deez timez!
…I’m not so convinced people will get upset by this… more and more of the people I meet every day are “into it” and tell me how cool the thing they do with it is… this is unfortunately too big to fail… I think we are closer to AI creating a doomsday than anyone in tech ever quitting building and growing its capabilities… but maybe when we all go poor, broke and hungry, we’ll fight back?…