1. Solving the easy problem of corrigibility (while still adhering to the vNM axioms)
2. Interpretability to the level that we can extract a hand-coded algorithm from AlphaFold 2, and similar feats (maybe 100 billion invested in interpretability in total, or something like that?)
3. Make GPT-4 never say *anything* violent
4. A formula for an embedded diamond maximizer
Would at least make me think "ok, I should probably focus on other things"