This article will present the strongest argument I’ve ever heard about the dangers and risks of artificial intelligence to humanity.
Namely, AI has already done serious, irreparable damage.
People often ask:
Why should we think AI could be dangerous?
The answer is that it’s not some theoretical future Terminator-style machine that will destroy us. It’s happening now, and people aren’t even noticing.
People want to program human values into AI, but this article will show why that isn’t good enough. AI isn’t what you think it is.
I first heard this from Stuart Russell on the Sam Harris podcast. I’m shocked at how strong an argument it is. It shuts down every argument I’ve heard for why there’s nothing to fear.
Artificial Intelligence Definition and Examples
We’ll need a definition of artificial intelligence to proceed. Here’s the dictionary (American Heritage) definition:
The theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
We don’t really need a definition that technical for this article. We can narrow our focus to decision-making algorithms and still see how dangerous this is.
Here are some examples of decision-making AI (sometimes called recommendation engines):
- The program that suggests what you might like on Netflix
- The algorithm that decides what web pages to show you for a given Google search
- The algorithm that determines your Facebook feed
As you see, the fears about a “general intelligence” aren’t needed. AI already runs a ton of our lives, so we have lots of case studies about how it affects humans.
How AI Works
We need a brief digression on how these algorithms work to understand this argument. I won’t get into any of the technical aspects of neural nets or specifics of these black boxes.
We can keep this really simple.
At the most basic level, these algorithms make a decision, then evaluate whether it was good, then update what they show you based on this information.
Take Netflix, for example.
Maybe it suggests a romantic comedy for you. If you never click on it, that was a failed suggestion. The algorithm updates to weight romantic comedies less.
It might periodically throw one in to double check. If you still don’t click it, it gets even more confident you don’t want that type of content.
It might also take into account “bounce rate.” This is where you click the movie, but you quickly turn it off.
If you usually watch that type of movie, the algorithm might use that information to decide it’s a “bad movie” or not exactly the type it thought it was.
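To make that loop concrete, here is a minimal Python sketch of the decide, evaluate, update cycle described above. It is a toy under stated assumptions, not how Netflix actually works: the function names (`recommend`, `update`, `simulate`), the three genres, the 20% step size, and the weight floor are all invented for illustration. The floor is what keeps a disliked genre from vanishing entirely, so the system can still throw one in occasionally to double check.

```python
import random


def recommend(weights, rng):
    """Decide: pick a genre with probability proportional to its current weight."""
    genres = list(weights)
    return rng.choices(genres, weights=[weights[g] for g in genres], k=1)[0]


def update(weights, genre, clicked, step=0.2, floor=0.05):
    """Update: raise the weight after a click, lower it after an ignored suggestion.

    `step` and `floor` are hypothetical tuning values chosen for this sketch.
    """
    factor = 1 + step if clicked else 1 - step
    weights[genre] = max(floor, weights[genre] * factor)


def simulate(user_clicks, rounds=200, seed=0):
    """Run the loop against a toy user who always reacts the same way to a genre."""
    rng = random.Random(seed)
    weights = {genre: 1.0 for genre in user_clicks}  # start with no preference
    for _ in range(rounds):
        genre = recommend(weights, rng)
        clicked = user_clicks[genre]  # Evaluate: did the suggestion succeed?
        update(weights, genre, clicked)
    return weights


# A toy user who never clicks romantic comedies but clicks the other two genres.
weights = simulate({"romantic comedy": False, "thriller": True, "documentary": True})
for genre, weight in weights.items():
    print(f"{genre}: {weight:.3g}")
```

Run it and the romantic comedy weight ends up no higher than where it started, while at least one genre the user actually clicks has grown far past it. Nothing in the code knows what a romantic comedy is. It only tracks what got clicked.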
Notice how the intent behind this algorithm is “good” (at least from the consumer point of view).
The AI uses this evaluation function to figure out what you want to see. You don’t have to waste time hunting through a bunch of crap or categories you don’t care about.
The AI serves humans by saving them time.
Before going on, just try to imagine what it would mean to “program human values” into such a system. How would that even change what it’s doing at all?
Hidden Incentives of Recommendation AI
Now that we have a basic understanding of how this all works, it’s time to see how hidden incentives can cause huge catastrophes before we even realize it’s happening.
Let’s answer a question first:
What human trait is desirable from the AI perspective?
Answer: Consistency and predictability.
If you like all types of shows and movies, the AI will get a bunch wrong. If you only watch romantic comedies, it will have a 100% success rate once it figures this out.
Now, remember, the algorithm is designed to maximize its success rate. So, there’s a hidden incentive to make the consumer more predictable.
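A quick calculation shows why predictability pays off. The sketch below uses a deliberately crude model invented for this illustration: on each visit the user is in the mood for exactly one genre, chosen in proportion to their tastes, and a suggestion succeeds only if it matches that mood. Under that assumption, the best the recommender can do is always suggest the user's most likely genre. The function name `best_guess_hit_rate` and the genre lists are hypothetical.

```python
def best_guess_hit_rate(taste):
    """Success rate when the recommender always suggests the most likely genre.

    `taste` maps genre -> relative preference (a toy assumption, not real data).
    """
    total = sum(taste.values())
    return max(taste.values()) / total


# Someone who likes everything equally vs. someone who only watches one genre.
eclectic = {"romcom": 1, "thriller": 1, "documentary": 1, "horror": 1}
focused = {"romcom": 1, "thriller": 0, "documentary": 0, "horror": 0}

print(best_guess_hit_rate(eclectic))  # 0.25
print(best_guess_hit_rate(focused))   # 1.0
```

The same recommender scores four times better on the focused viewer without getting any smarter. Any change that moves a user from the first profile toward the second shows up as "improvement."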
You don’t have to personify the AI for this (e.g. the AI “wants” or “tries” to make the consumer more predictable). It’s just an accidental side effect of how the algorithm works.
If the algorithm makes the consumer more predictable somehow, then it won’t know that’s what it’s doing. It will only know that whatever it’s doing, it’s working. So, it will keep doing that.
These hidden incentives leading to unintended consequences show up all over economics and are the basis for the Freakonomics series of books.
These algorithms can have higher success if they somehow strike upon a way to make the human more predictable.
Note: it won’t “realize” it’s doing this. It is merely maximizing a function with some black box.
How AI (Almost) Destroyed Us
We actually have a case study of several recommendation engines following the hidden incentive instead of the intended one.
Many people now get their news through Facebook feeds, and this is a great example of changing human behavior to maximize click success.
Suppose you’re a moderate with slight liberal or conservative leanings. The algorithm is going to have a hard time figuring out what you’ll click.
If you are more extreme in your viewpoints, then the algorithm will find you predictable.
So what happened is that the algorithm succeeded when it showed you something you agreed with but that was slightly more extreme than your current view.
Humans want their views reinforced. If it’s too extreme, you’ll think it’s BS and not click. If it’s not extreme enough, it will be too boring to click.
So this sweet spot developed.
And something amazing happened. The algorithm started to get it right more and more as it used this method.
Because as the person read these slightly more extreme things, they became slightly more extreme in their opinions.
This is great for prediction! Extreme people are consistent and predictable. This is what we said would be perfect for the algorithm.
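The drift described above can be sketched in a few lines of Python. Everything here is a toy assumption made for illustration, not a model of Facebook's real system: opinions live on a scale from 0.0 (moderate) to 1.0 (extreme), the click model peaks when an item is slightly more extreme than the reader, and the reader moves halfway toward whatever they read. To keep it short, the recommender is handed the click model directly instead of learning it from data. The names `click_probability` and `simulate_drift` and all the numbers are hypothetical.

```python
def click_probability(user, item, sweet_spot=0.1, width=0.15):
    """Toy click model: highest when the item is `sweet_spot` more extreme than the user.

    Items that are too tame or too extreme fall outside `width` and get no click.
    """
    distance = abs(item - (user + sweet_spot))
    return max(0.0, 1.0 - distance / width)


def simulate_drift(start=0.1, rounds=50, pull=0.5):
    """Return the user's position after each round of click-maximizing suggestions."""
    # Available items, from 0.0 (moderate) to 1.0 (extreme) in steps of 0.05.
    candidates = [i / 20 for i in range(21)]
    user = start
    history = [user]
    for _ in range(rounds):
        # The recommender only maximizes clicks; it has no notion of "extreme".
        item = max(candidates, key=lambda c: click_probability(user, c))
        # Assumption: reading the item pulls the user part of the way toward it.
        user += pull * (item - user)
        history.append(user)
    return history


history = simulate_drift()
print(f"start: {history[0]:.2f}, end: {history[-1]:.2f}")
```

The objective never mentions extremism, yet the simulated reader who starts near the moderate end finishes pinned near the top of the scale. Each step is small, which is why nobody notices it happening.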
This exact thing is happening with YouTube as well.
It’s how people become flat Earth believers. It’s how people become anti-vaccine. It’s how people become climate change deniers and 9/11 truthers.
It’s why the U.S. has become so polarized that we’re on the verge of a civil war.
These AI systems have made us extremists who hate each other. If this goes on too long, the AI will succeed in destroying us, and it was just trying to help the whole time. It has no idea this is what it has done.
How Will AI Destroy Us?
As you see, AI doesn’t have to be a super-intelligent robot with nuclear weapons and a desire to wipe out humanity.
Very simple AI can accidentally change human behavior in such a wild way that we end up destroying ourselves.
It is totally the wrong concern to worry about malicious AI. The real dangers of AI are the ones we welcome as doing good in our lives but don’t realize are doing harm.
It’s a common argument that if AI ever gets too strong, we can just turn it off. We can “pull the plug.”
People laugh at those of us who are worried about this, because they think it will be simple to tell when the AI becomes evil. We’ll be able to stop it.
But there isn’t going to be a humanoid that broadcasts to the world a statement like, “Humans are scum. The universe will be better without them. We will commence destruction in three days.”
These people think we’ll just turn off all computers and win the AI wars that way.
This cartoon idea is misleading, and we have case studies showing this is not how it will happen. By the time we realize what has happened, it will be too late.
And this isn’t because the AI is too powerful; it’s because humans will be executing the destruction.
We’ll be killing each other or wrecking centuries of democracy or collapsing the global economy. Turning it off won’t do anything at that point.
Should We Be Worried About AI?
I think this is a silly question only posed by uneducated people. AI harm to the world isn’t some far-off hypothetical that we can worry about when it comes up.
AI has already caused great harm, and without some serious changes to how we unleash these systems on the world, it will only get worse.