Harms (Part 1: Distraction)

The idea that AI could lead to human extinction has been discussed on the fringes of the technology community for years … Like a magician’s sleight of hand, it draws attention away from the real issue: the societal harms that AI systems and tools are causing now, or risk causing in future. Governments and regulators in particular should not be distracted by this narrative and must act decisively to curb potential harms. And although their work should be informed by the tech industry, it should not be beholden to the tech agenda.

Nature editorial, “Stop talking about tomorrow’s AI doomsday when AI poses risks today”

1. Series introduction

Interventions aimed at reducing existential risk are usually evaluated by multiplying the probability of preventing risk by the value of risk prevention. If the resulting quantity is higher than the value of competing short-termist interventions, it is concluded that existential risk mitigation should be favored.

This approach neglects an important aspect of the value of existential risk mitigation. Interventions taken to reduce existential risk have not only potential benefits, but also potential harms. This series, “Harms,” discusses some of the most important harms risked by leading existential risk mitigation efforts.
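In schematic terms (a simplification, with notation that is mine rather than anything standard), the usual test favors existential risk mitigation when

\[ p_{\text{prevent}} \cdot V_{\text{prevent}} \;>\; \mathrm{EV}_{\text{best short-termist alternative}}, \]

whereas the harms-adjusted test recommended by this series is

\[ p_{\text{prevent}} \cdot V_{\text{prevent}} \;-\; \mathbb{E}[\text{harms of the mitigation effort}] \;>\; \mathrm{EV}_{\text{best short-termist alternative}}. \]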

It is important to discuss the harms of existential risk mitigation for two reasons. First, incorporating harms into the picture allows us to correctly evaluate the case for existential risk mitigation. Second, investigating harms and their causes may help us to identify ways to lessen the harms risked by current existential risk mitigation efforts.

2. Post introduction

One of the most frequent objections to AI risk discourse is that it distracts from the very real harms caused by artificial systems today. This objection becomes polarizing and unhelpful when it is used as a way to avoid engaging with AI risk arguments, or when it is folded into a default zero-sum style of thinking which assumes without argument that attention paid to existential risks must detract from attention to the other risks posed by AI.

This is a shame. There are important things to be said in favor of the distraction objection, and these can be rigorously and calmly stated and discussed in a way that will move debates forward. Today’s post aims to investigate what might lie behind the distraction objection.

3. AI harms: Beyond existential risk

Artificial intelligence is responsible for a great number of harms in society today. These harms are important, and deserve to be addressed. Here are a few of the many harms, beyond existential risk, imposed by contemporary systems.

First, many systems are biased against some of the most vulnerable communities, including people of color, women, and transgender individuals. This exacerbates existing racial inequities in the criminal justice system, and leads to a suite of technologies that work significantly better for light-skinned men than for women of color.

Second, existing systems often lack meaningful transparency, interpretability or explainability. For example, workers are evaluated, promoted, hired and fired by algorithms with little or no explanation, threatening important rights such as the right to informed self-advocacy.

Third, intellectual property is increasingly fed wholesale into generative AI systems with little in the way of compensation for artists, journalists, and other content creators.

Fourth, artificial intelligence is concentrating power among a small number of technology firms and nations, already among the world’s wealthiest and most powerful.

Fifth, an escalating global arms race to develop AI systems has led to heightened international tensions, protective trade policies, and growing military investment in AI-powered weapons systems.

Sixth, privacy is increasingly impossible, with harms falling most strongly on vulnerable groups. As I write these words (but hopefully not when I post them), deepfake pornography of Taylor Swift is circulating widely over the internet, and women across the world are worried that they will never regain possession of the digital representations of their own bodies.

Seventh, climate change is surpassing even many of our worst fears, and the world continues to miss targets for climate abatement and mitigation. Compute-hungry AI systems are significant contributors to climate change, the harms of which fall hardest on many of the world’s most vulnerable groups.

Finally, misinformation is beginning to run rampant. There are credible fears that this year’s elections in many countries will be significantly sabotaged by AI-generated misinformation, though it is important not to exaggerate these fears.

These are important harms. Regardless of our opinions about existential risk, we should think that it would be valuable to reduce these harms and costly to distract from them. However, existing AI risk discourse threatens to distract from near-term harms of artificial intelligence in several ways.

4. Corporate behavior

Corporations such as OpenAI, Anthropic, X, Microsoft and Google are directly responsible for many of the near-term harms of AI systems. Each of these companies consumes a staggering amount of electricity to train generative AI models. The deepfake pornography of Taylor Swift mentioned above was generated by AI systems and is circulating on social media, in particular on X.

All of the above-named firms have seen strong growth in firm value, as well as in their power to shape international discourse around the harms generated by AI. We are currently experiencing a remarkable situation in which the very same firms producing harmful technologies are able to use risk discourse to position themselves as leading and benevolent actors in the global conversation about the risks posed by artificial intelligence.

By positioning themselves as combatants in a fight against existential risk, corporations are able to paint themselves as safety-focused, thereby reducing pressure to mitigate these harms.

Elon Musk, widely thought to be one of the worst actors in propagating immediate harms, paints his new AI company, xAI, as safety-focused by appointing the director of the EA-aligned Center for AI Safety, Dan Hendrycks, as its safety advisor. Rather than address rampant misinformation, harassment, racism and other problems on X, Musk tells us that he is helping humanity to survive by colonizing Mars, and by addressing existential risk to prevent potential threats to a future Martian colony.

OpenAI commits 20% of its compute to superintelligence alignment, while at the same time facing a range of lawsuits for harms including intellectual property infringement. And for all of OpenAI’s talk about aligning superintelligence, even many effective altruists doubt the company’s focus on safety: OpenAI has continually rushed to release powerful generative AI models, and its discourse around existential risk is so notoriously unconnected to action that effective altruists recently took to picketing OpenAI. The recent board shakeup at OpenAI was precipitated in part by safety-focused criticism of the company by EA-aligned board members, and the resounding defeat of those board members suggested to many that OpenAI may not be as safety-focused as it claims.

At the very least, discourse and spending around existential risk allow powerful AI companies to distract from their own roles in causing near-term harms and to paint themselves as significantly more safety-focused than they often are. At the worst, existential risk discourse can be used to gain a reputation for virtue against a background of severe misbehavior.

We have spoken on this blog about how Sam Bankman-Fried used his commitment to longtermism to gain a reputation for trustworthiness and to secure public influence that led to increased investment and delayed important government investigations into fraud at FTX. While I would not go so far as to allege that Bankman-Fried’s behavior is typical of safety-focused corporate executives, this is far from the first time that EA-aligned technology leaders have found themselves in criminal trouble. Another EA megadonor, Ben Delo, pled guilty to violations of the Bank Secrecy Act for turning a blind eye to money laundering on his cryptocurrency platform.

More generally, a recent investigation of Giving Pledge signatories suggests that a whopping 41% of signatories have been accused of substantial misconduct, 10% convicted, and 4% jailed. Readers would be well within their rights to ask whether the altruistic commitments of many of these parties are entirely sincere, and whether professed altruism may have helped some of them to secure a financially useful and undeserved reputation for virtue.

5. Public attention

It is widely held that we have shifted from an information economy to an attention economy. With many problems competing for public attention, only the loudest voices are heard, and it is possible to get away with a great deal so long as the public is not paying attention.

Political leaders are making precious little progress on many of the near-term harms of artificial intelligence. They can, however, distract from these failures by announcing initiatives directed at the most attention-grabbing harms. There is no harm more attention-grabbing than human extinction, so it should come as little surprise that politicians seize on existential risk discourse in part as a way to distract from their failures to address more immediate harms.

For example, recent effective altruist lobbying in London is widely touted as a success of the policy arm of effective altruism. No less a figure than Prime Minister Rishi Sunak convened an international AI Safety Summit, leading to the creation of a UK AI Safety Institute explicitly focused on mitigating extreme threats including existential risks.

With an initial £100 million investment in addressing extreme AI risks, the UK government proudly announces that it is “providing more funding for AI safety than any other country in the world.” But what the government does not stress is just how little is being done to address more mundane harms, such as bias, lack of transparency, misinformation and climate impacts.

It is worth asking whether vocally supportive politicians such as Rishi Sunak and business leaders such as Sam Altman might be taking effective altruists for a ride, using the public interest in existential risk together with the effective altruist funding and public relations machinery to boost their own reputation for safety on the cheap. Sunak, at least, could certainly use the reputation boost.

A second political concern about existential risk discourse is that it may not only distract from, but even directly exacerbate key short-term harms. For example, I noted earlier that a global AI arms race is underway, and this arms race is accompanied by increasing tensions between key players scrambling for supremacy and control of key resources. It has been widely reported that longtermist lobbying in Washington led to strengthened export restrictions on chips sent to China, worsening an already tense relationship between the world’s largest superpowers. Worsening US-China relations are not something to take lightly, and I hope that they will be correctly figured on the balance sheet of AI lobbying.

6. Academic research

I have spoken before on this blog about the extent to which money talks within academia. Academics need money, and few funders can compete with Silicon Valley. The result, I argued, is an inflation of the authority and seriousness afforded to positions favored by effective altruist donors; biases in the topics that researchers discuss; and biases in the opinions expressed in published research.

Philanthropic money can be used to capture or create research centers, journals, conferences, scholarly societies, funding boards, and other powerful research institutions. Thankfully, studies of existential risk from artificial intelligence have had a difficult time making inroads into academic discussions, but they have lately had some notable successes.

For example, the EA-funded Center for AI Safety attracted a remarkably good class of philosophy fellows, many of whom had not previously worked on or expressed sympathy for discussions of existential risk. This collaboration led, among other things, to a special issue of a leading philosophical journal, Philosophical Studies, whose contents are likely to be substantially more skewed towards discussions of extreme risks than is the norm in philosophical research.

There are some other signs that longtermists may be making inroads into academic discussions. For example, the National Science Foundation partnered with Open Philanthropy and Good Ventures to offer $20 million in funding for “safe learning-enabled systems”. I hold out hope that the National Science Foundation, which is widely considered to be a sensible and mainstream funding agency, does not intend to use this money to fund fringe research into existential risks. But some scholars are concerned that this money may drive the field closer towards the opinions and topics favored by longtermists, and I think that this must be what Open Philanthropy intends to get for its money.

Of all the distractions considered in this post, I have to say that I am least concerned about developments within academic research. Academics have been, as a rule, quite skeptical of existential risk arguments, and are unlikely to change their minds unless longtermists can provide a good deal more evidence in support of those arguments. However, longtermists have been remarkably successful at working their way into other corners of academia, and we should not ignore the possibility that they will be able to substantially shift academic discourse around AI risk.

7. A track record of ignoring present harms

An important lens into the longtermist push to downplay present harms comes from the observation that this is not the first, but rather the second time that the longtermist community has advocated longtermist policies at the expense of near-term harms.

Many longtermists, including Eliezer Yudkowsky, previously advocated building superintelligent artificial systems and sought to build such systems themselves. What did they say about the problems facing the world right now? Yudkowsky, at least, suggested ignoring those problems because future artificial systems would solve them:

I have had it. I have had it with crack houses, dictatorships, torture chambers, disease, old age, spinal paralysis, and world hunger. I have had it with a planetary death rate of 150,000 sentient beings per day. I have had it with this planet. I have had it with mortality. None of this is necessary. The time has come to stop turning away from the mugging on the corner, the beggar on the street. … Our fellow humans are screaming in pain, our planet will probably be scorched to a cinder or converted into goo, we don’t know what the hell is going on, and the Singularity will solve these problems. I declare reaching the Singularity as fast as possible to be the Interim Meaning of Life, the temporary definition of Good, and the foundation until further notice of my ethical system.

Here Yudkowsky begins with a correct recognition that the world faces many problems today. But instead of advocating the natural solutions, such as global health and development work and climate mitigation, Yudkowsky suggests ignoring these problems and devoting our efforts to developing superintelligent AI systems, which would then take care of most or all of them. On this basis, Yudkowsky founded an institute dedicated to bringing about the singularity as a means of solving the world’s problems.

Today, longtermists have changed their views about artificial intelligence. They no longer see artificial intelligence as the solution to the world’s ills, but rather as the chief threat to human civilization. Yet longtermists retain their predecessors’ enthusiasm for diverting money away from evidence-based solutions to short-term problems and toward the study of future AI systems.

This discussion suggests that neglecting and distracting from present problems is not a new phenomenon within the longtermist community, and it may not be tied to any particular worldview.

8. What might be done

Distraction is an eminently soluble problem. Everyone knows how to draw attention to an issue, so there is plenty that longtermists can do to mitigate the effects of distraction if they are willing.

First and foremost, longtermists can fund and conduct research and advocacy around risks beyond existential risk. This is the most direct way of ensuring that a broad range of research is conducted and publicized.

Second, longtermists can consider and weigh ways in which their own efforts may contribute to near-term risks. For example, they should be mindful that even favored companies such as Anthropic and (previously) OpenAI do a great deal to exacerbate near-term harms, and that supporting such companies may contribute to those harms. Longtermists should also carefully consider the effects of lobbying on the AI policy agenda in Washington, London and elsewhere.

Finally, longtermists should aim never to dismiss or speak casually about the non-existential risks posed by artificial intelligence. Treating these risks with the seriousness they deserve can help to ensure that they are in fact addressed.

Of course, to say that longtermists know how to address distraction is not to guarantee that these efforts will be made. I rather suspect that most longtermists regard strategies for mitigating distraction as cost-ineffective. But if that is right, then longtermists should say so directly, and should figure the full cost of distraction on the balance sheet of existential risk mitigation efforts.

Comments


  1. Vasco Grilo

    Thanks for the post, David. I wonder whether at the margin it is better to direct resources towards mitigating harms which are smaller but more likely (e.g. misinformation), or larger but less likely (e.g. extinction), accounting for the indirect benefits of both efforts. In effective altruism, it is overwhelmingly assumed that society is underinvesting in mitigating larger harms, and I used to mostly agree with this, but I am no longer confident. To illustrate, I commented that (https://forum.effectivealtruism.org/posts/tRbCjvm4cuuzm95Mv/nuclear-risk-and-philanthropic-strategy-founders-pledge?commentId=jHxMDHeeexwKNZhhy):
    – If the goal is saving lives, spending should a priori be proportional to the product between deaths and their probability density function (https://en.wikipedia.org/wiki/Probability_density_function). If this follows a Pareto distribution (https://en.wikipedia.org/wiki/Pareto_distribution), such a product will be proportional to “deaths”^-alpha, where alpha is the tail index.
    – “deaths”^-alpha decreases as deaths increase, so there should be less spending on more severe catastrophes. Consequently, I do not think one can argue for greater spending on more severe catastrophes just based on it currently being much smaller than that on milder ones.
    – For example, for conflict deaths, alpha is “1.35 to 1.74, with a mean of 1.60” (https://forum.effectivealtruism.org/posts/PyZCqLrDTJrQofEf7/how-bad-could-a-war-get#Power_laws_and_conflict_data), which means spending should a priori be proportional to “deaths”^-1.6. This suggests spending to decrease deaths in wars 1 k times as deadly should be 0.00158 % (= (10^3)^(-1.6)) as large.
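    A minimal sketch of this arithmetic (assuming, as in the bullets above, a Pareto tail with the mean index of 1.6):

    ```python
    # Sketch of the bullet-point arithmetic: with spending proportional to
    # deaths * pdf(deaths) under a Pareto tail, spending scales as
    # deaths**(-alpha), so the ratio for wars 1,000 times as deadly is
    # (10**3)**(-alpha).
    alpha = 1.6  # mean tail index estimated for conflict deaths

    def relative_spending(deaths_ratio: float, alpha: float) -> float:
        """Spending ratio implied by spending proportional to deaths**(-alpha)."""
        return deaths_ratio ** (-alpha)

    print(f"{relative_spending(1_000, alpha):.5%}")  # 0.00158%, matching the figure above
    ```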

    1. David Thorstad

      Thanks Vasco!

      Yes, you are quite right that the relative amounts spent on severe versus less severe catastrophes cannot tell us much about what should be done with a marginal dollar. Much more is needed. Thank you for raising this point, and I’ve been happy to see your work on this subject.

      As you know, I think that effective altruists often overvalue existential risk mitigation efforts. I hope that this series will be one way of making the case that those efforts are overvalued, by recalling some of the harms that existential risk mitigation efforts can have and urging that those harms be taken into account.

  2. Jason

    It seems to me that the main problem is that other elements of society aren’t taking near-term AI harms seriously. Do you think you would be saying the same things if some group strongly interested in a specific near-term harm (say, inequality) were plowing large sums into research and advocacy into that specific harm and ignoring the others? It’s unclear to me why we should expect longtermists to fund work within the ambit of equality-focused foundations, media organizations, and the like simply because they are concerned about one specific potential harm.

    I think one crux here is whether EA-style AI safety is bunk or not. If it’s bunk, the proper application of EA principles is to just get out of the area entirely. If there are no x-risks that can be cost-effectively mitigated, then the funding should flow back to global health & development, animal advocacy, etc. (As you may recall, I would like to see more of a return to that kind of work in general.) If EA-style AI safety isn’t bunk, then I’d argue that it generally makes sense to leave most of these issues to others. So I’m having a hard time seeing the set of circumstances in which shifting to focus more on near-term AI harms would make sense.

    Another crux is figuring out why AI companies are getting a pass on the near-term harms. In my view, “few people are talking about them” and “attention is being pulled away by AI safety people” are largely separate hypotheses, and I am significantly more inclined toward the former. In a thought experiment where AI safety folks just disappear, I think the AI companies would still be getting a pass on near-term harms. So I would diagnose the problem as largely flowing from the absence of people speaking about near-term harms. 

    To the extent that is true, I don’t think it is appropriate to charge increases in near-term harms to current longtermist priorities. I think you’re quite right to point out past missteps by EA and EA-adjacent actors. But I think there are relatively few people who believe Elon Musk is a force for positive change, or who wouldn’t shut down the major AI labs if given the chance. I don’t get the impression that people are, e.g., being encouraged to do work that increases AI capabilities nowadays. 

    Some of the near-term concerns are instrumentally useful to AI safety advocates and are likely to see funding on that basis. For instance, breaking up the big AI companies would slow down potential progress to AGI, so sounding the alarm on consolidation of corporate power would have value toward that intermediate goal.   

    We will have to see how the UK AI Safety Institute performs. The minister’s statement expresses concerns that AI might “concentrate unaccountable power into the hands of a few, or be maliciously used to undermine societal trust, erode public safety, or threaten international security.” It doesn’t sound like there is neglect of near-term harms here.

    As for global relations, slowing down AI progress on the part of illiberal, undemocratic regimes like China sounds like a reasonable plan to mitigate many of the concerns you’ve expressed. At least in Western democracies, there is a democratic process that might care about things like privacy, inequality, and concentration of power. If we’re going to charge AI safety with the cost of increased global tension over export controls, we need to credit it with the near-term risk-reduction benefits too, even if they were not the target results.

    I think the US legal system will likely prove adequate to protect content creators, including through class actions, and that there are enough politically powerful content creators out there to secure appropriate protections should the courts not work out. Although I feel for the little guy here, I think the big fish will protect their interests as a way of furthering their own.

    I’m relatively less concerned about environmental effects at this time — I believe the numbers provided are for all datacenters, not just AI. As you’ve noted in another post, the availability of specific chips constrains the growth of AI. Also, AI work can be done anywhere, so it can be concentrated in locations with lower cooling needs and greater access to renewable energy.

    1. David Thorstad

      Thanks Jason! It’s good to hear from you.

      I think we may agree on a lot, even perhaps on most things, while disagreeing on some things. Let’s see what you think.

      You’re definitely correct that many elements of society aren’t taking near-term AI harms seriously enough. Near-term AI harms are one of many ethical challenges that society fails to take seriously enough. And you’re definitely right that this fact strengthens the distraction problem. If society is already working quite actively (perhaps even too much) on a problem, it may not be so dangerous to distract from that problem, and it may even be a salutary change of subject. But when more attention needs to be paid to near-term harms, distraction can be quite damaging.

      The claim of this post is not that effective altruists are obligated to fund neartermist AI research. The aim of this series is rather to get clear on the harms of existential risk mitigation. To the extent that AI risk research tends to distract from near-term harms of AI, this distraction will be a cost that needs to be chalked up against the value of AI risk research. The only way to avoid chalking up that cost would be to do the things discussed in the last section of this post, but again that’s not a recommendation for EAs to do these things, just a reminder of the cost if they don’t.

      You’re certainly right that a major crux for me in many areas is the rigor of EA-style AI safety research and the existence of the underlying threats. I’d like to think that this crux can be distinguished from the further question of what harms AI safety research might have, since those harms would still exist even if I were wrong about the rigor of that research and the existence of the underlying threats.

      I’m as dismayed as you are that AI companies often get a pass on near-term harms, and you’re quite right that there are other explanations beyond distraction for those companies getting a pass. That’s not to say that distraction doesn’t contribute. For example, OpenAI certainly does get significant public recognition for its work on superintelligence alignment, and for some time its reputation for ethics was so strong that the original draft of “The case for strong longtermism” suggested altruists might consider donating to OpenAI.

      I think you are certainly right that EAs would generally like to shut down the major AI labs. And while I think EAs often don’t focus enough on the immediate harms that Musk causes, I wouldn’t go so far as to claim that EAs think Musk is a significant force for positive change.

      I am hopeful about the path to reconciliation that you mention: namely, near-termists and long-termists have an interest in many of the same types of research and policymaking. As you mention, challenging big AI companies is now on both camps’ priority lists. Work such as interpretability research and some of what goes under the name of ‘alignment’ research is also of interest to both camps. I hope this will give us more common ground in the future. I don’t like the status quo of just shouting at one another. (For what it’s worth, I was just at a conference with Seth Lazar, who now thinks roughly the same. There is hope for reconciliation.)

      I have to admit that I have some hope for longtermist policies and government institutes to be co-opted by more moderate groups. I think EAs have been cognizant of this possibility at least since the UN Secretary General’s Office agreed to write considerations of existential risk into a number of high-level documents, but then went on to stretch the term ‘existential risk’ so that it covered most of the things they wanted to talk about anyway. If that’s right, there will be less risk of distraction and maybe even a positive contribution to near-termist priorities. I suspect effective altruists would not be best pleased to see their money, time and organizations used in this way, and that they have learned some lessons about politics in the past few years that may serve them well. But I wouldn’t be terribly disappointed if the co-option you describe occurred, and it does dovetail nicely with my suggestion that some actors, such as Rishi Sunak, may be taking effective altruists for a ride.

      I’m a bit concerned by the idea of sponsoring US policies that antagonize China. Even if we would rather not see China gain access to powerful technologies, we might also not want to fan the flames of what is shaping up to be the most important international rivalry of the next few decades. I’m also less confident than I used to be in the wisdom of the United States. Our democracy is not as strong as it used to be; a number of fringe theories have been mainstreamed in policy discussions; political parties are increasingly unable to compromise to get things done; and long-term policy continuity is increasingly difficult. And it is worth remembering that we are the only country to have deployed nuclear weapons in combat. I don’t know that I am indifferent between China and the US receiving powerful AI technologies, but all of this makes me less concerned about the difference. I don’t really trust the Pentagon with powerful AI systems either.

      I hope you are right about future legal action taking care of harms to content creators. I also hope that policymakers will step in to add new laws.

      I’m honestly very concerned about environmental effects. Even if, by some miracle, we were to begin hitting carbon-emission targets, we would be headed for a tough few centuries. The way things are going, we are going to do a lot of harm to the planet before we are done. You’re definitely right that AI is not, and probably never will be, responsible for the majority of emissions. But it is no longer an inconsequential contributor either.

      1. Jason

        Yeah, I think we have a lot of points of agreement. I think one significant point of departure may be that (conditioned on other segments of society getting more concerned about AI) the synergistic effects will likely outweigh the distraction effects. But I don’t know how to determine or reliably estimate the specific weight of the two effects.

        I think a good part of my reaction was to one specific sentence: “First and foremost, longtermists can fund and conduct research and advocacy around risks beyond existential risk.” In addition to concerns about expecting longtermists to do research and advocacy about other AI downsides merely because they are involved in AI catastrophic risk, I suspect that longtermist advocacy about (e.g.) the equality/equity risks of AI would create the distinct aroma of freshly-cut astroturf. I don’t really want longtermists occupying that space; I want people from the affected communities and people who really are focused on equality/equity leading that charge. As I think I mentioned in relation to the EA animal advocacy movement at some point, throwing a relatively large amount of resources at a nascent cause area (compared to the amount of non-EA funding/activity already there) can create potential distorting effects on the cause area. Finally, I am skeptical that a group of mostly rich, white, nerdy elites is going to be effective at mobilizing the various segments of society that will need to activate to mitigate the near-term threats.

        I suspect I assess distraction risk as somewhat lower than you do. There isn’t too much overlap between potential funders and activists. Public attention is limited and fickle, but it’s not obvious to me why there is a specific amount of attentional bandwidth for “AI-related risks” such that various risks compete directly against each other. For myself, I’d guess that I have something analogous to separate attentional buckets for “catastrophic risk,” “justice/equality/equity concerns,” “democratic health and control,” and so on. I may not be typical, although I think many members of the general public try to balance/diversify their attention in this kind of way. That being said, I would agree that everyone in this space needs to evaluate distraction risk and not be dismissive of other potential concerns without very good reason.

        I also think I am more optimistic about synergistic effects. In addition to the positive effect of slowing the AI companies down on all issues, going after Corporation 1 on Issue X should lower its public reputation. That should make it easier for me to succeed in going after it on Issue Y. For instance, Issue X advocacy likely gives the corporation fewer friends in the legislature interested in defending it on Issue Y. I think this may be particularly important in the US context (and that’s where the big AI companies are at the moment). Most of the issues you describe code as Team Blue, but ordinarily Team Red controls at least one of the House, Senate, or White House. AI spiralling out of control and killing everyone isn’t (at present) heavily partisan, and Team Red should be receptive to, e.g., concerns about AI-turbocharged military/paramilitary capabilities.

        I don’t think the US government is as worthy of trust as it used to be either. But in the export-control context, I think the question is whether we are safer in a world where US capabilities are significantly ahead of China, Russia, et al., or where illiberal nations have a more equal chance to develop those capabilities concurrently with the US et al. I think both I and the longtermists would be happy with an enforceable way to prevent military use of AI across the globe. From my understanding, that would require technological controls in the chips plus some sort of international monitoring regime to be plausible. If that’s the target endgame, it’s still probably not a good idea to ship uncontrolled chips to illiberal regimes in the interim.

  3. JWS

    I recognise that you’re highly sceptical of both the magnitude of AI x-risk this century, David, and of the value of AI Safety work (FWIW I’m definitely one of the more sceptical AI-adjacent EAs, but I take it more seriously than you seem to). But I think that Jason’s right that this is the major crux underlying everything, and unfortunately, I don’t think you can really separate out the estimation of the value of the different programmes from the strength of the ‘distraction’ criticism.

    This is because the ‘distraction’ harm seems to be a direct application of the principle of opportunity cost. For any intervention done, or research funded, there will be a ‘harm’ of forgoing all other opportunities. The size of that harm necessarily depends on the two options being considered; in this case we can simplify them as funding work on ‘near-term’ AI harms and on ‘existential’ AI harms (though that’s not a good distinction, because much of the EA community seems to view existential risks as near-term harms). You could hold a neutral position here, saying that you’re just pointing out that opportunity costs exist, but that ends up being a vacuous one imho. Just as there are harms from pursuing existential research instead of near-term research, prioritising the latter *also* includes its opportunity costs.

    I think in practice the ‘distraction’ criticism comes with the following claims attached:

    1 > Near-term AI research has substantial value, e.g. funding research into the 8 harms you mention in section 3 would be good.
    2 > Long-term AI research has no substantial value.
    3 > Any money/effort given to 2) will necessarily detract from 1).

    Now, points 2/3 are ones you explicitly criticise in the opening, and you rightly point out that this discourse can be polarising and unhelpful. But this is exactly what the Nature editorial that you open the piece with does – it literally contains no assessment of nor engagement with the proponents of work into existential risk. You seem to slip that way yourself too in Section 6, where you really seem to insinuate 2), such as “Thankfully, studies of existential risk from artificial intelligence have had a difficult time making inroads into academic discussions”, which seems really out of place if this is meant to be a more neutral accounting of potential harms of longtermist work.

    Furthermore, while I’m glad that Seth Lazar is open to some form of reconciliation, over 2023 my perspective was that the vitriol was almost exclusively one-way, from those focused on near-term research towards those focused on AI Safety work. I don’t think one can credibly claim that it’s a ‘both sides’ thing. I think many individuals in the AI Safety space have pointed out that many policy recommendations and research directions (e.g. stronger regulation of big tech, more attention to interpretability/explainability work) are the same between the two camps, and they have not been met with much friendliness from the other direction. At some point, one’s patience begins to wear thin, and even I, who would generally consider myself on the pro-reconciliation wing of the AI Safety side (as much as I can be said to be involved; it’s not my day job), have very much started to lose it.

    Now, you could argue that this vitriol is worth it if it moves funding and attention from existential work to near-term harms, but again I’d like those who believe that to say so openly, and to actually engage with arguments about whether their points are true rather than assuming they are.

    1. David Thorstad

      Thanks JWS!

      I hope we can separate the questions of (a) the benefits of AI x-risk mitigation work, and (b) the harms of AI x-risk mitigation work. This would mean that readers could disagree with me about (a) while agreeing about (b), or vice-versa.

      I think that the harms discussed in this post include, but are not exhausted by, opportunity costs. If we were in a zero-sum game with a fixed pot of funding, then funding allocated to existential AI harms would incur a sizeable opportunity cost by depriving near-term harms of that funding on a 1:1 basis. There would still be further benefits and harms of existential risk research to be discussed, but you are right that a greater proportion of the harms would be opportunity costs. However, we are not exactly in a zero-sum game. A good deal of money (for example, effective altruist money) was never going to be allocated to near-term harms. Research on one subject can expand (or shrink) the pie of overall funding available for AI ethics/safety research, depending on how that research is viewed. And again, much of the distraction argument has nothing to do with who receives funding. Saying that corporations and politicians gain a reputation for virtue by funding x-risk research is not, in the first instance, making an opportunity cost argument.

      I think you are right that the distraction criticism comes with the claims (1)-(3) that you mention. I think it is largely unfortunate for this reason, because much of the ‘AI risk is a distraction’ crowd really wants to say (1) and (2), but instead asserts and argues for (3). That’s one reason why effective altruists have not reacted well to the argument. It’s why I took so long to address the argument and why, as you mention, I took care to distinguish myself from those who don’t run the argument very well. I include the Nature editorial in the camp of those who did not argue well.

      I don’t know that I want to suppress my own opinion of AI risk in Section 6, or elsewhere. I have been vocal about this opinion and given reasons for it elsewhere, and I will be happy to give further reasons as the AI safety crowd begins to produce a broader body of high-quality academic publications in leading journals. I do think it is important to treat points (1)-(3) as separate points, and I do aim to do that here and elsewhere.

      You’re certainly right that the previous vitriol over AI safety research was fairly one-sided. That was bad. I don’t like vitriol. I don’t think it’s kind or productive, and in this case I think it was used to mask a lack of argument. I hope we are done with that. I suspect there may be a long way to go, but progress is progress.

      To be slightly fair, there was a great deal of talk and behavior coming out of EA communities in these years that many would regard as deeply offensive. EAs do a much better job than their opponents of avoiding vitriol directed at particular arguments or positions, but EAs did also speak and behave in ways that provoked legitimate outrage. I have written about some of these, and I will continue to write about them. I think that these thoughts and behaviors can, and should be separated from scientific and philosophical discussions of AI risk, and I do think that some AI ethics folks should have made a broader separation here. That was bad, and they should (and hopefully will) do better.

      1. JWS

        So I think there may well be harms from AI x-risk work, but what I want to draw a distinction between is the overall harms and the specific harms we could call ‘distraction’. Some harms might be direct (actually building a harmful AI system), some might be indirect (providing leverage for large corporations to use their corporate power against the public interest), and some might be opportunity costs (the net benefit foregone by not pursuing another course of action). I think ‘distractions’ should be mostly restricted to that last category, and section 3 could maybe have been better as a ‘Part 0’ post setting out the different potential harms from AI x-risk work.

        You say that in a fixed pot of funding scenario, funding AI x-risk work would incur an opportunity cost from not funding work on near-term harms; but, symmetrically, funding work on near-term harms would incur the opportunity cost of not funding AI x-risk work. The only way to break this symmetry is to actually look at what the evidence for each piece of work is (and I’m not always sure there’s such a clean distinction in practice; I think interpretability work can benefit both agendas, for example). This brings us back to the object-level question of what the x-risk from AI is, and to Jason’s point that this is really the whole crux of the matter; talk of harms from ‘distraction’ is downstream of the position we take here.

        Now, while I think you’ve done a lot of due diligence on this, and I wouldn’t want you to suppress your opinion/scepticism, many on the x-risk-sceptical side have not done this and have simply pattern-matched it to a ‘sci-fi’ concern. While some EAs may have acted in ways others have found offensive, again I want to hold the line that the bad behaviour has not been even, and that if reconciliation between the estranged AI x-risk and non-risk camps is to happen, the primary moves toward it need to come from the latter and not the former. That’s probably a topic for another post/time, but I’m pretty sure I can find a lot of receipts of various researchers acting in a very poor way toward those in the x-risk camp.
        I look forward to your response and the rest of the series, and as always I appreciate your work 🙂

  4. EG

    Does attention on existential risk actually distract from current harms from AI? I looked around for empirical evidence on this and I lean towards no, at least not so far. Key organisations working on mitigating current harms seem better funded than ever, key advocates for current harms seem to be getting more attention than ever, in the US there are many bills being introduced in Congress focusing on current harms, etc. (Of course there’s a risk of motivated reasoning here as I work on AI policy with a focus on more speculative risks.) I wrote up my findings here, with plots and numbers and so on: https://www.erichgrunewald.com/posts/attention-on-existential-risk-from-ai-likely-hasnt-distracted-from-current-harms-from-ai/

    Here is an excerpt of the summary:

    > The claim that x-risk distracts from current harms is contingent. It may be true, or not. To find out whether it is true, it is necessary to look at the evidence. But, disappointingly, and despite people confidently asserting the claim’s truth and falsity, no one seems to have looked at the evidence. In this post, I do look at evidence. Overall, the data I look at provides some reason to think that attention on x-risk has not, so far, reduced attention or resources devoted to current harms. In particular, I consider five strands of evidence, none conclusive on its own:
    >
    > – Policies enacted since x-risk gained traction
    > – Search interest
    > – Twitter/X followers of AI ethics advocates
    > – Funding of organisations working to mitigate current harms
    > – Parallels from environmentalism
    >
    > I now do not think this is an important discussion. It would be better if everyone involved discussed the probability and scale of the risks, of current harms or x-risk, rather than whether one distracts from another. That is because agreement on the risks would dissolve the disagreement over whether x-risk is a distraction, and disagreement over distractions may be intractable while there are such strong disagreements over the risks.

    1. David Thorstad

      Thanks Erich!

      It’s certainly desirable to support views about all matters with evidence, and the extent of distraction is no exception. I must say that I am surprised to hear you (if I am reading you right) going for the relatively strong position that existential risk concerns do not distract in any way from neartermist concerns. I would think a more plausible position would suggest that perhaps there is some distraction going on, but it is not as large as others have suggested.

      The post you linked discusses four strands of evidence that could settle the question of whether existential risk concerns distract from near-termist concerns (the fifth strand makes a mostly orthogonal point). Three of the four strands (policies enacted, Twitter data, funding data) are not discussed in a way that bears significantly on the distraction argument. In these areas, you argue that the relevant metric (say, Twitter engagement) has risen simultaneously for those concerned with existential risk and those concerned with near-termist harms. That isn’t what’s at issue: the distraction argument says that the relevant metrics would be higher for neartermist concerns if existential risk concerns were not distracting from them, not that the extent of distraction is so great as to negate any growth in near-termist metrics. Given the recent spike of attention to most aspects of artificial intelligence, it would be deeply surprising for the extent of distraction to be so great as to eliminate recent gains in Twitter engagement, funding, and other metrics, and distraction may still be concerning even if its extent falls short of this.

      The fourth strand of evidence (search interest) is discussed in exactly the right way, through a causal model that attempts to address the extent to which interest in existential risk has causally reduced interest in near-term harms. This is a good start. Could you say more about the model? So far, we’ve seen the model outputs and a brief high-level description. I’d be happy to discuss the model in more detail if you could provide its full specification. This is the kind of evidence I would like to see more of.

      Some smaller comments:

      (1) I agree with your call to provide concrete mechanisms by which distraction takes place. I wonder if you would be willing to say more about the mechanisms proposed in this post. This post discussed three ways in which distraction may occur: corporations may use work on longtermist harms to gain a reputation for virtue; public attention may be depleted or diverted by longtermist concerns; and academic research may be directed by the availability of funding. In your post, you address one of these mechanisms (attention depletion), though you say little more than that we will need further evidence to settle the matter. Could you say more about the other two mechanisms, and also perhaps say a bit more about why you find the attention depletion argument unsatisfactory? Is this driven by your causal model of search interest, or something else? Likewise, it might be worth discussing a mechanistic story of how the predicted distraction fails to occur. Are the mechanisms cited by your opponent rendered inoperable? Overcome by a more powerful force? Did they never exist in the first place? Something else?

      (2) I’m not sure why you claim that agreement on risks would dissolve the disagreement over whether existential risk is a distraction. It’s perfectly possible for all forms of distraction mentioned above to occur in a climate of agreement over risks. Corporations might still gain a reputation for virtue by addressing longtermist causes and use that reputation to hide other harms, including but not limited to failure to address short-termist ills. Public attention might still be driven in one direction or another, even given a shared body of belief about risk levels: this is what it means to say that attention, not merely information, drives action today. And academic attention might be driven by competition for funds, rather than by the beliefs of the researchers (which, it should be noted, might also be shifted post-hoc through a motivated desire to appease funders or express views consistent with the author’s own funded publications).

      1. EG

        Thanks for the response, David, and for reading my post!

        > I must say that I am surprised to hear you (if I am reading you right) going for the relatively strong position that existential risk concerns do not distract in any way from neartermist concerns. I would think a more plausible position would suggest that perhaps there is some distraction going on, but it is not as large as others have suggested.

        To be more precise, I’d add some caveats and say my (tentative) position is “attention on AI x-risk so far seems not to have meaningfully counterfactually distracted from current harms from AI”. I don’t think it’s that strong a position, by the way. The two views are somewhat in opposition, or at least generally seen that way, and I think this is just as likely to be mutually reinforcing as mutually distracting. For example, I would not say it’s a strong position to guess that attention on abortion rights doesn’t distract from attention on anti-abortion views — on the contrary, they probably reinforce each other to some degree. One way this may happen could be by raising the salience of the broader topic (abortion/AI), another that it provides opportunities for critique and counterargument. But also, there are just tons of other issues to draw attention from — it’s not clear to me that people and media have a fixed “AI attention budget” in that way. Why should we think they do?

        You are right, of course, that there’s a major confounding variable here: attention on AI in general. I try to control for that in various ways, and as you know mention it repeatedly in the post. One is the causal model you mention (see below; although I don’t put too much weight on it, it should be seen more as a first attempt). But also: search interest in x-risk rapidly decreased around June 2023, while interest in AI generally was flat or increased, but neither interest in current harms nor interest in AI ethics advocates or orgs (at least those I looked at) increased at this time. If attention on x-risk was a distraction, they should’ve gotten more attention at that time, even if attention on AI in general boosts both (since it was mostly flat then).

        And as I say about the funding data, “if you thought that x-risk is drawing resources away from current harms, and imagined a range of negative outcomes, then this should at least make you think the worst and most immediate of those outcomes are implausible, even if the moderately bad, and less immediate, outcomes still could be.” Like, even if there are potential confounders, I think each of these different data points should update us at least a little bit in the direction of no distraction.

        On the causal model: the causal DAG is explained in the post (“interest in AI -> interest in x-risk -> interest in current harm <- interest in AI”). Statistically, I model it (following Judea Pearl) as a simple linear model, with an outcome (a current harm, advocate, or org), a predictor (x-risk), and a set of unobservables to condition on (IIRC, interest in ChatGPT and interest in Bing). I could share the code if you want, but there’s not much information there that’s not in this paragraph; there’s nothing fancy, really.

        > I agree with your call to provide concrete mechanisms by which distraction takes place. I wonder if you would be willing to say more about the mechanisms proposed in this post. This post discussed three ways in which distraction may occur: corporations may use work on longtermist harms to gain a reputation for virtue; public attention may be depleted or diverted by longtermist concerns; and academic research may be directed by the availability of funding.

        Yes, I should’ve said before that I think that’s a great service and a credit to you!

        Responding quickly to each in turn:
        1. *Corporations using x-risk to avoid doing anything about current harms.* I think that could be a thing, not only for current harms but also for x-risk (“safety washing”). But the tactic of “reduce pressure on these current harms from our product by pointing to these more speculative, but way worse (if true) harms of our product” just doesn’t seem to be a tactic that’s used by companies? It’s certainly not what oil companies opted for. It’s not what social media companies opted for. I can’t think of any past example of this happening. What’s different about AI? (And even when it comes to AI, it’s not a tactic that companies like Meta, or techno-optimists like Marc Andreessen, are opting for.) Second, if the goal is to reduce pressure to mitigate current harms, why are AI labs spending so much money on stuff that’s not very PR-y? Things like the labs’ considerable AI safety teams that just quietly publish very weird and in-the-weeds safety research, and (as you mention) OpenAI committing 20% of its compute to its superalignment effort, etc. Finally, AI labs also seem to be doing a fair bit to mitigate some current harms? For example, they seem to try hard to avoid their products appearing racist or sexist and the like. (I’m not saying that’s because they’re pure and virtuous; I’m just saying they at least sometimes, and in my view fairly often, reduce pressure to mitigate current harms by actually doing what advocates for current harms want.)
        2. *Public/policymaker attention is a scarce resource.* As I said above, I think it’s true that public attention is not unlimited, it’s just not obvious to me that x-risk and current harms from AI should trade off against one another (i.e., that there’s a limited “AI risk attention budget”). You don’t really provide any reasons (as far as I can tell) to think this is so, or at least not more so for AI x-risk and current harms than for any other two policy issues, say, immigration reform and voting rights. There are lots of other topics they can draw attention from.
        3. *Academic research may be directed away due to funding.* I know less about the incentive structure in academia so I have less to say about this one. But it also seems to be the one you’re less worried about, so maybe that’s fine.

        > I’m not sure why you claim that agreement on risks would dissolve the disagreement over whether existential risk is a distraction. It’s perfectly possible for all forms of distraction mentioned above to occur in a climate of agreement over risks.

        I don’t think it would be impossible to see arguments over distractions if there were full agreement on the risks. I just think it’s far less likely/frequent. I base this on other domains where there is much more agreement (e.g., environmentalism), where different risks are somewhat in tension with one another (e.g., biodiversity conservation versus renewable energy), and where you still see very little argument around distractions. Of course you see greenwashing and so on. But when BP engages in greenwashing, I don’t think it’s to distract from climate change or other environmental issues so much as to convince people that they are fulfilling their obligations with regard to those issues.
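        (P.S. On the model shape mentioned above: a minimal sketch of the kind of adjustment described, with hypothetical variable names and simulated stand-in data rather than the real search-interest series, would be something like the following.)

        ```python
        # Hypothetical sketch (not the actual analysis): regress interest in
        # current harms on interest in x-risk while conditioning on general AI
        # interest as a confounder, per the DAG described above.
        import numpy as np

        rng = np.random.default_rng(0)
        n = 104  # e.g. weekly search-interest observations (simulated here)

        interest_ai = rng.normal(size=n)                         # confounder
        interest_xrisk = 0.8 * interest_ai + rng.normal(size=n)  # predictor
        interest_harms = 0.7 * interest_ai + rng.normal(size=n)  # outcome (no true effect)

        # OLS with an intercept, conditioning on the confounder.
        X = np.column_stack([np.ones(n), interest_xrisk, interest_ai])
        coef, *_ = np.linalg.lstsq(X, interest_harms, rcond=None)
        print(f"estimated effect of x-risk interest: {coef[1]:+.3f}")
        # A clearly negative estimate would suggest distraction; near zero would not.
        ```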

