10 years ago, a lot of AI safety discussion turned on distilling humanity's meta-ethics into a machine-readable form. Today our most impressive AIs approximately reflect all the human content we could find for them, encoded in a semantically meaningful way. We can convey intuitive preferences to the machines now. We can't guarantee that they'll actually optimize for those preferences, but the fact that the concepts are available at all seems under-discussed.
@ciphergoth (1) seems much more tractable now than I would have expected prior to the advent of LLMs. Old LW lore often invoked evil genies who would follow the letter but not the spirit of the utility function. That problem seems ~approaching solved. (2) still looms large, of course - but if we've stumbled into almost solving the evil-genie subproblem, that seems worth celebrating.
@jai The problem was never that the AIs wouldn't understand human values; it was always that we didn't have a good way to point at that understanding.