Tag: Goal misgeneralization

If anyone builds it, everyone dies (Part 2: Evolutionary mismatch)
This post addresses the first of Yudkowsky and Soares’ two main arguments for misalignment in Chapter 4.

Papers I learned from (Part 5: Language agents reduce the risk of existential catastrophe)
Simon Goldstein and Cameron Domenico Kirk-Giannini argue that language agents reduce the risk of existential catastrophe from artificial intelligence.