Tag: Goal misgeneralization

If anyone builds it, everyone dies (Part 2: Evolutionary mismatch)

This post addresses the first of Yudkowsky and Soares’ two main arguments for misalignment in Chapter 4.

January 23, 2026

Papers I learned from (Part 5: Language agents reduce the risk of existential catastrophe)

Simon Goldstein and Cameron Domenico Kirk-Giannini argue that language agents reduce the risk of existential catastrophe from artificial intelligence.

March 21, 2025