**niplav** @niplav@schelling.pt · Nov 15, 2021, 15:59

Actually, why do people believe that corrigibility is possible?
Maybe there's no AIXI-fied version of it, and you can only be corrigible if you believe that there's a smarter entity.

**niplav** @niplav@schelling.pt · Nov 20, 2021, 15:10

Hm, this is similar to what I feared: https://www.lesswrong.com/posts/WCX3EwnWAx7eyucqH/corrigibility-can-be-vnm-incoherent

**Urusan** @urusan@fosstodon.org · Nov 15, 2021, 22:35

@niplav I think the belief stems from how humans seem to function. People can learn to be better people (or worse people for that matter).

If humans are corrigible, then surely other intelligences could be too.

That said, are humans actually corrigible?

It's hard to prove that either way because we don't know what the terminal human values are, or even if humanity has shared terminal values, and irrational behavior sends constant mixed signals.

**niplav** @niplav@schelling.pt · Nov 16, 2021, 10:55

@urusan
Yeah, that makes sense. But then again, humans are not very coherent in their goals & cognitive strategies (a heap of biases forming a homunculus). Maybe you can only be corrigible if you're incoherent.
