Actually, why do people believe that corrigibility is possible?
Maybe there's no AIXI-fied version of it, and you can only be corrigible if you believe that there's a smarter entity.
@niplav I think the belief stems from how humans seem to function. People can learn to be better people (or worse people for that matter).
If humans are corrigible, then surely other intelligences could be too.
That said, are humans actually corrigible?
It's hard to prove either way: we don't know what humans' terminal values are, or even whether humanity has shared terminal values, and irrational behavior sends constant mixed signals.
@urusan
Yeah, that makes sense. But then again, humans are not very coherent in their goals & cognitive strategies (a heap of biases forming a homunculus). Maybe you can only be corrigible if you're incoherent.
Hm, this is similar to what I feared: https://www.lesswrong.com/posts/WCX3EwnWAx7eyucqH/corrigibility-can-be-vnm-incoherent