Hm. I think the type of philosophy/math/cs needed for successful strawberry alignment is close enough to regular theorem-proving that AI systems that aren't seeds for worldcrunchers would still be very helpful.
(It doesn't feel to me like this touches the consequentialist core of cognition: a lot of philosophy is tree traversal and finding inconsistent options, and math also feels like an MCTS-like thing.)
Is the advantage we'd get from good alignment-theorist ML systems 1.5x, 10x, or 100x?
If we had those widely distributed, people would likely use them for capabilities and just widen the gap. (E.g. OpenAI, who talk about this as a strategy, are not to be trusted with it: I don't see them using such systems solely for alignment work for half a year, rather than on both capabilities and alignment. But their plan is sound in that regard.)
But I disagree with the view that you can't have an alignment theorist that is not also a consequentialist.
The, ah, fifth thing I disagree with Eliezer about.