Maybe it's true that intelligence depends on the environment, but consider: the environments where policy iteration performs better than RL with temporal difference learning are kind of dumb.

Mastodon

a Schelling point for those who seek one