Category: Academic Papers
-

Instrumental convergence and power-seeking (Part 4: Conclusion)
This post draws lessons from our discussion of instrumental convergence and power-seeking.
-

Instrumental convergence and power-seeking (Part 3: Turner et al.)
The most-discussed modern power-seeking theorem, due to Alex Turner and colleagues, also won’t do the trick.
-

Papers I learned from (Part 6: A timing problem for instrumental convergence)
Should we expect means-end rational agents to preserve their goals? Southan, Ward and Semler are skeptical.
-

Instrumental convergence and power-seeking (Part 2: Benson-Tilsen and Soares)
A leading power-seeking theorem due to Benson-Tilsen and Soares does not ground the needed form of instrumental convergence.
-

Instrumental convergence and power-seeking (Part 1: Introduction)
Power-seeking theorems aim to formally demonstrate that artificial agents are likely to seek power in problematic ways. I argue that leading power-seeking theorems do not succeed.
-

The scope of longtermism (Part 5: A case study – Existential risk)
Many longtermists think that existential risk mitigation escapes the scope-limiting factors. To what extent is this true?
-

Papers I learned from (Part 5: Language agents reduce the risk of existential catastrophe)
Simon Goldstein and Cameron Domenico Kirk-Giannini argue that language agents reduce the risk of existential catastrophe from artificial intelligence.

