use known inconsistencies of human preferences as value-learning trip-wires: if the value learning algorithm hasn't learned them yet, it's operating at the wrong level of abstraction.
