If anyone builds it, everyone dies (Part 5: Solutions)

Today, there is a thriving field of AI safety researchers … The subtext of If Anyone Builds It — occasionally rising to text — is that Yudkowsky thinks that all of these people are idiots whose various research projects stand no chance of preventing our destruction … MIRI’s institutional stance is that the rest of the field is delusional because they don’t want to acknowledge that we’re obviously doomed … To more mainstream AI safety organizations, this position entails rejecting valuable work in favor of policies that can’t be implemented and would tank the global economy if they were.

Clara Collier, “More was possible: A review of If anyone builds it, everyone dies”

1. Introduction

This is Part 5 of my series If anyone builds it. This series discusses and responds to some key contentions of Eliezer Yudkowsky and Nate Soares’ book, If anyone builds it, everyone dies.

Part 1 introduced the book alongside some key argumentative cruxes.

Part 2 and Part 3 looked at Yudkowsky and Soares’ arguments for misalignment.

Part 4 looked at Yudkowsky and Soares’ arguments that humanity would lose a conflict against misaligned artificial superintelligence.

Suppose that is right. Suppose superintelligent agents are likely to be misaligned, and that we would lose a conflict with them. What policy solutions should we enact today? That is the subject of Part 3 of Yudkowsky and Soares’ book, as well as the subject of today’s post.

2. Engineering and policy solutions

The first two chapters of Part 3 argue that aligning artificial intelligence is difficult and make the case against placing too much stock in current technical alignment efforts.

I will not address this argument, because I think that the prospects for technical alignment strategies should be assessed on technical grounds. Yudkowsky and Soares’ argument does not engage closely enough with technical details to advance the discussion, and the best response to it would come from a technical researcher within the field.

The next two chapters of Part 3 take the difficulty of aligning superintelligent systems as established and ask what we should do about it. Let us locate Yudkowsky and Soares’ solution in the space of policy proposals before looking at some of its concrete details.

3. Locating the solution in policy space

There are a wide range of strategies for promoting AI safety.

Technical researchers in the burgeoning field of AI safety can develop technical strategies for making AI systems safer and better-aligned with human preferences.

AI companies can conduct in-house safety tests and impose safety standards on their own products.

Governments and international organizations can implement legislation and treaties to enforce standards of safety and transparency.

Policy organizations and think tanks can spread awareness of threats and capabilities to ensure that technical solutions and legal regulations are keeping pace with capabilities.

Many of these efforts lag badly behind the pace of AI development. In the near future, we will be lucky if AI safety is handled any more responsibly than other pressing global challenges, such as climate change.

A growing number of AI safety advocates want to focus on these and similar achievable paths toward safer artificial intelligence. Yudkowsky and Soares think that such strategies are largely ineffective and potentially harmful:

The solutions we’ve just proposed are a far cry from the policies that other concerned folks propose … They downplay, they hedge, they point out ways that dumber AIs will have an effect on society and suggest that dumber AIs should be regulated accordingly, while slipping in some clauses that lay the groundwork for regulating the sort of AI that could kill us all … Perhaps their work raises a little awareness about these issues so that, in the future, bolder legislation can be passed. Perhaps the reporting requirements they recommend will eventually be passed, and enacted, and cause some bureaucrat later on to observe some danger signs in AI development and alert world leaders. But we, ourselves, have started to lose hope in that whole strategy working in time.

Yudkowsky and Soares have a simple alternative policy prescription: shut it down. Yudkowsky and Soares are light on details here, though they do propose an international treaty to capture some of their recommendations. Nevertheless, it may be worth looking at the broad outline of their policy proposal.

4. Elements of Yudkowsky and Soares’ solution

Yudkowsky and Soares propose that the development of artificial superintelligence should be banned. They call for at least two concrete actions to implement such a ban:

First, computing resources should be consolidated.

All the computing power that could train or run more powerful new AIs, gets consolidated in places where it can be monitored by observers from multiple treaty-signatory powers, to ensure those GPUs aren’t used to train or run more powerful new AIs.

Second, a sizable portion of artificial intelligence research should be prohibited:

It should not be legal—humanity probably cannot survive, if it goes on being legal—for people to continue publishing research into more efficient and powerful AI techniques.

These policies would, understandably, prove unpopular. Corporations would have to forgo sizable monetary gains. Countries would have to forgo large geopolitical gains. And municipalities and individuals would have to forgo sizable improvements in their quality of life. This creates the need for strong enforcement of a superintelligence ban.

What form should that enforcement take? While Yudkowsky and Soares do not elaborate in the main text on their second proposal, to prohibit large chunks of research, they do elaborate on how tightly computing resources should be controlled. If an actor begins to consolidate computing resources, they should be warned:

If intelligence services spot a huge unexplained draw of electrical power that could correspond to a hidden datacenter containing chips that have not been accounted for, and that country refuses to allow a party of international observers to investigate, they get a somberly written letter from multiple nuclear powers warning about next steps.

What are those next steps? Just about anything appears to be on the table.

In this scenario, the other powers must communicate that the datacenter scares them. They must ask that the datacenter not be built. They must make it clear that if the datacenter is built, they will need to destroy it, by cyberattacks or sabotage or conventional airstrikes. They must make it clear that this is not a threat to force compliance; rather, they are acting out of terror for their own lives and the lives of their children. The Allies must make it clear that even if this power threatens to respond with nuclear weapons, they will have to use cyberattacks and sabotage and conventional strikes to destroy the datacenter anyway, because datacenters can kill more people than nuclear weapons.

For Yudkowsky and Soares, all manner of responses could be warranted if necessary to prevent the construction of a sizable datacenter.

5. Feasibility

In a world increasingly incapable of meeting immediate and clear security challenges, such as the militarization of artificial intelligence or the need for comprehensive regulatory frameworks and well-resourced regulatory agencies, it is doubtful that policymakers and citizens would be willing to commit to a total ban on the development of artificial superintelligence. It is far more doubtful that we would be willing to commit to such ironclad enforcement mechanisms, or that such a commitment would be honored.

Many AI safety advocates are growing increasingly frustrated with the MIRI agenda. They see this agenda as centered around a range of vague and infeasible policy proposals. They worry that these proposals come at the expense of political capital and public attention that might be better spent on more tractable proposals. And they worry that MIRI has positioned itself, rhetorically and politically, in opposition to mainstream AI safety advocates.

Perhaps these concerns were expressed best by Clara Collier:

Today, there is a thriving field of AI safety researchers … The subtext of If Anyone Builds It — occasionally rising to text — is that Yudkowsky thinks that all of these people are idiots whose various research projects stand no chance of preventing our destruction … MIRI’s institutional stance is that the rest of the field is delusional because they don’t want to acknowledge that we’re obviously doomed … To more mainstream AI safety organizations, this position entails rejecting valuable work in favor of policies that can’t be implemented and would tank the global economy if they were.

A similar position is voiced by Will MacAskill:

The positive proposal is extremely unlikely to happen, could be actively harmful if implemented poorly (e.g. stopping the frontrunners gives more time for laggards to catch up, leading to more players in the race if AI development ends up resuming before alignment is solved), and distracts from the suite of concrete technical and governance agendas that we could be implementing.

For Yudkowsky and Soares, there is not much room to budge here: they think that any policy short of a total ban is quite likely to kill us all.

For those who share their views, Yudkowsky and Soares’ policy prescriptions may be exactly what is needed. Even then, I have my doubts: most policy progress is far more gradual, full of compromises and evolving ambitions.

But more importantly, a very broad range of people agree on the need for increased action on AI safety. A recent Nature editorial pleads, “Let 2026 be the year the world comes together for AI safety.” What they have in mind is a much more modest and immediate range of goals and policy proposals:

A wide spectrum of national and regional laws and regulations are in place or under development. Some countries, for example, are looking to ban ‘deepfake’ videos. This should be a universal goal. Companies should also provide details of the data used to train models, and need to ensure that copyright is respected in the training process. The overriding ambition must be to achieve regulations similar to those governing other general-purpose technologies. AI developers … need to transparently explain how their products work, demonstrate that their models have been produced through legal means, and show that the technology is safe and that there is accountability for risks and harm. Transparency is also needed from researchers, more of whom need to publish their models in the peer-reviewed literature.

Those with similar inclinations are not going to find much in Yudkowsky and Soares’ proposals that they can work with. They are likely to view Yudkowsky and Soares’ proposals as unrealistic hindrances to needed progress towards safer artificial intelligence.

6. Strategies for action

To be fair, Yudkowsky and Soares do provide some more actionable recommendations in their final chapter.

Those in government are advised to signal their country’s willingness to resist AI escalation.

Elected officials are urged to bring these issues to colleagues’ attention.

Journalists are asked to warn the public about the risk and severity of existential catastrophe.

Ordinary citizens are asked to vote, protest, and contact elected officials.

These proposals are much closer to those that would be recommended by mainstream AI safety advocates, though there will remain some disagreement over the focus on existential risks.

I don’t think that most readers come to Yudkowsky and Soares’ book looking for policy guidance, so few will be terribly surprised to find guidance that they may consider infeasible or extreme. However, insofar as there is actionable policy guidance in the book, it is probably closer to the above than to the book’s main proposal or to their draft international treaty.

7. Taking stock

This post concludes my series on Yudkowsky and Soares’ book, If anyone builds it, everyone dies. The book has certainly made an impact, with reviews in mainstream venues and splashy billboards designed through a LessWrong contest.

This series suggests that If anyone builds it does not make a persuasive case for its key contentions.

We saw in Part 1 of this series that most of the argument for misalignment is concentrated in a single chapter: Chapter 4. Part 2 and Part 3 of this series found that Yudkowsky and Soares’ positive arguments for misalignment fall short of the mark.

If humanity did face off against a misaligned superintelligence, how would we fare? We saw in Part 4 of this series that most of Yudkowsky and Soares’ arguments here trade on a range of unargued scenarios involving agents far more capable than the minimally superintelligent agents the book previously discussed. These scenarios do not do much to advance the state of play, nor is it clear that Yudkowsky and Soares intend them to.

If humanity wants to confront the threat of artificial intelligence, how should we respond? In today’s post, we saw that Yudkowsky and Soares provide a familiar, one-note solution: shut it down at all costs. This solution is distinguished at once by its infeasibility and by its singular indifference to the details of the increasingly broad and promising range of proposals for developing safe artificial intelligence.

Yudkowsky is the acknowledged godfather of the field of AI safety, but this book remains stuck in the generation in which Yudkowsky rose to fame. The quality of its argumentation, its engagement with technical details, and the feasibility of its policy proposals are simply no longer up to scratch.

If this is the best that Yudkowsky has to offer, I hope that serious consideration will be given to the role that Yudkowsky and MIRI will be afforded in conversations about AI safety going forward. This kind of contribution is holding the field back.
