Tag: AI risk
-

Instrumental convergence and power-seeking (Part 2: Benson-Tilsen and Soares)
A leading power-seeking theorem due to Benson-Tilsen and Soares does not ground the needed form of instrumental convergence.
-

Instrumental convergence and power-seeking (Part 1: Introduction)
Power-seeking theorems aim to formally demonstrate that artificial agents are likely to seek power in problematic ways. I argue that leading power-seeking theorems do not succeed.
-

The scope of longtermism (Part 5: A case study – Existential risk)
Many longtermists think that existential risk mitigation escapes the scope-limiting factors. To what extent is this true?
-

Papers I learned from (Part 5: Language agents reduce the risk of existential catastrophe)
Simon Goldstein and Cameron Domenico Kirk-Giannini argue that language agents reduce the risk of existential catastrophe from artificial intelligence.
-

Papers I learned from (Part 4: Why AI systems may not evolve selfishness)
Maarten Boudry and Simon Friederich argue that natural selection may not produce selfish artificial systems.

