A NotKillEveryoneIsm Argument for Accelerating Deep Learning Research
(in which I fail to pass the ITT, I guess)
TLDR: This is really just a longer version of this comment.
A metaphor
You are Rocket McRocket-Face, the CEO of Rockets Inc, the world’s largest and most reputable rocket company. Rockets Inc isn’t the only rocket company in the world, but it is by far the biggest, richest, and most powerful rocket company on Earth. No other rocket company holds a candle to Rockets Inc. Nor are they likely to in the next 3-5 years.
As the CEO of Rockets Inc, you dream of one day reaching the moon. Your reasons for dreaming are twofold. The first reason is that reaching the moon is a dream that men have had since ancient times. It is an achievement truly worthy to behold. The second reason is a bit more practical. It is widely agreed in the world of rocket science that the first person to reach the moon will hold a commanding advantage. From the heights of the moon the first person to reach it will be virtually unconquerable, able to hurl moon rocks down to punish any of their enemies.
One day, two scientists come to you with research proposals for new types of rockets.
The first scientist is Mathy McEngineer. Mathy is one of your best engineers. He is well known for the reliability of his rockets. His design is comforting, simple-to-understand, and a natural extension of currently known rocket technologies. The design is so simple it can be explained to anyone with a degree in rocket-engineering in a few minutes. It involves taking the current well-known and trustworthy rocket designs and adding a few more parts: more engines, more fuel. Nothing out of the ordinary.
Mathy’s plan may not be brilliant, but it’s trustworthy and safe. And there’s a good chance that it will reach the moon (although it’s unlikely to reach even the nearest star).
After your meeting with Mathy, you are feeling good about your chances of winning the race to the moon and decide to leave work early that day. On your way out of work, you are suddenly stopped by Subtle McGenius. Immediately the good feeling you had vanishes as you feel a ball of anxiety in the pit of your stomach. You hate meeting with Subtle McGenius. He doesn’t even work for Rockets Inc (you fired him years ago), but somehow he always manages to show back up at the worst possible time. Times like right now.
“You’re going to go with Mathy’s plan. I just know you are. Aren’t you?” raves Subtle.
“Get out of my hallway!” you murmur. “Guards!”
“Mathy’s plan may seem good, but it will only take you to the moon. You will never reach the stars!”
Thankfully, the guards arrive in moments (you pay them well). As they drag Subtle away kicking and screaming, he shouts out one last retort.
“You can’t stop me!” Subtle shouts. “One day I’ll build my rocket. And then… then you’ll learn!”
There was no point in listening to Subtle, however. You’ve heard his plans for a new “evolutionary” rocket design a million times. Like his name, his plans are subtle, crafty, and impossible to understand. Even when Subtle’s plans work, nobody knows why they work. You’ve invested billions of dollars trying to understand Subtle’s plans and failed. Even if Subtle’s plans work, there’s no way to make them safe enough to risk a human life on.
As you finish your ninth hole of golf, you feel good about yourself. Soon, with Mathy’s rocket, you will be golfing on the moon. Even if it will never take you to the stars.
The metaphor explained
Rockets in this metaphor represent AI research.
CEO Rocket McRocket-Face represents the audience. But the real-world person most like him is probably Sam Altman, CEO of OpenAI.
The moon represents human-level artificial intelligence. It is widely agreed that the first person or company to build an AI with human intelligence will gain a commanding lead over the rest of the world.
Mathy and his rocket represent Deep Learning. Adding more engines and more fuel represents scaling (adding more data and compute).
Subtle, and his rocket, represent evolution. While the designs produced by evolution are brilliant, even the simplest products of evolution are too complex for humans to understand.
Why this is a (NotKillEveryoneIsm) argument for accelerating Deep Learning Research
The Deep Learning paradigm is about as good as it gets from an AI Safety perspective. Deep Learning models are logically very simple. They are easy to interpret. They are highly amenable to human control. And they are inherently myopic, which means they do not threaten to replace humanity. Finally, because they rely on huge data centers full of GPUs, Deep Learning models are easy to shut down.
By contrast, we know that evolution is capable of building much more intelligent and dangerous systems. Not only did evolution already produce human beings, but there is no inherent limitation on the types of algorithms that evolution can produce. If it is possible to develop dangerous super-intelligent AI, eventually evolution will find a way.
While evolution has an advantage in the long term, Deep Learning currently holds the lead. I would argue that, if our only goal is to prevent the extinction of the human race, we should attempt to extend this lead as much as possible.
The risk (of not accelerating Deep Learning) is that as compute-power grows, it will eventually be possible for someone (Subtle in our story) to run a simulation of evolution on their computer that invents a new, more dangerous AI architecture. The algorithms produced by evolution are unlikely to be as easy to understand as current Deep-Learning models. Nor are they likely to be friendly towards human beings.
The benefit (of accelerating Deep Learning research) is that by increasing the intelligence available for humans to command, we have a better chance of solving the problem of friendly AI before someone develops a more powerful (and deadly) alternative paradigm.
Some obvious objections to this argument
But what if Deep Learning AI never actually reaches human-level?
That would mean Deep Learning models are inherently safe. This would only make the case for accelerating Deep Learning even stronger.
None of these are existential threats to the survival of the human race, so they are not objections to this argument.
What if accelerating Deep Learning also accelerates evolutionary algorithms?
Algorithmic improvements in Deep Learning are orthogonal to hardware advances relevant to evolutionary algorithms. If you plan to limit future increases in hardware performance that’s fine. But it’s irrelevant to the question of whether we should train the best Transformer we possibly can.
Even if we perfectly solve the LLM control problem, people can still use them to do bad things
If your opinion is that increasing the total productive capacity of the human race is bad, history is not on your side.
What about AI Agents?
There are five reasons to be optimistic about the LLM control problem: interpretability, passive safety, myopia, ease of feedback, and shutdown-ability. LLM Agents (if they work) weaken two of these: myopia and passive safety. The other three are still sufficient and, more importantly, much better than the safety guarantees we can expect for evolutionary algorithms (none).
Moreover, AI Agents only optionally weaken myopia and passive safety. It is possible to build an AI agent that is passively safe (by requiring that it ask for permission) and myopic (by requiring it to consider only effects that are bounded in space and time).
We should freeze all Deep Learning Research anyway out of an abundance of caution
Haste can be cautious. Moving slowly can be dangerous.
This doesn’t actually solve the alignment problem
You’re right. But it puts us in a much better position to be able to solve that problem.
Conclusion
I believe that if you are only concerned about existential risk from AI, you should take actions that accelerate Deep Learning as much as possible (aside from accelerating new types of general-purpose compute hardware). This means training the largest models you can, optimizing algorithms for Deep Learning, and deploying Deep Learning as widely as possible throughout the economy.
I do acknowledge that there are other side effects of widely deploying Deep Learning AI models. These include the use of AI for targeted information warfare, the loss of jobs, and other forms of social disruption caused by AI. However, most of these happen when we deploy AI models, not when we develop them. Furthermore, it is likely that the benefits of Deep Learning AI models vastly outweigh the drawbacks.
As a policy proposal, I would recommend “full steam ahead” on training and researching Deep Learning Models combined with “careful but rapid” deployment of these models in the economy.