the problem is effectively designed to trip up intuitive reasoning, which typically fails to arrive at the presumably "correct" answer of 1boxing; it's meant to demonstrate the utility of formal reasoning/decision theory in situations where intuition misleads
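
(for the record, a minimal sketch of the expected-value math, assuming the standard Newcomb payoffs of $1k in the transparent box & $1M in the opaque one; those numbers and the function name are mine, not from the original post)

```python
# sketch: expected payoff of each strategy against a Predictor
# with accuracy p; the $1k / $1M amounts are the standard
# Newcomb payoffs, assumed here since the post doesn't give them
def expected_value(strategy: str, p: float) -> float:
    small, big = 1_000, 1_000_000
    if strategy == "1box":
        # with prob p the Predictor foresaw 1boxing and filled the big box
        return p * big + (1 - p) * 0
    else:  # "2box"
        # with prob p the Predictor foresaw 2boxing and left it empty
        return p * small + (1 - p) * (big + small)

for p in (0.5, 0.9, 1.0):
    print(p, expected_value("1box", p), expected_value("2box", p))
# at p = 1.0 (the never-wrong Predictor): 1box -> 1,000,000, 2box -> 1,000
```

with these payoffs, 1boxing wins for any accuracy above ~0.5005, ie, as soon as the Predictor is even slightly better than a coin flip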

(stole this variant from a 🔒🐦)

imo the correct path of reasoning here is revealed by the problem statement itself: the Predictor has never been wrong. how could that be? the answer is telling: it could only be the case if the Predictor can simulate you perfectly, ie, run through the exact same reasoning you will

the Predictor would have to be a superintelligence capable of scanning, digitizing, & simulating you inside a sufficiently convincing representation of the real world that the copy believes it's the original, trying to make the decision. if the simulation is done properly, the copy's choice and yours should be identical.
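
(to make that concrete: a toy sketch, assuming your decision procedure is deterministic; `my_decision` and `predictor` are hypothetical names for illustration, not part of the thought experiment)

```python
# toy model of the simulation argument: a Predictor that runs your
# exact decision procedure can never be wrong, because the simulated
# run and the real run are the same computation
def my_decision() -> str:
    # whatever reasoning you use, deterministic given your state
    return "1box"

def predictor() -> str:
    # the Predictor "simulates" you, ie executes the same procedure
    return my_decision()

prediction = predictor()      # box contents get fixed first
actual = my_decision()        # then you choose, "freely"
assert prediction == actual   # holds for ANY deterministic my_decision
payout = 1_000_000 if actual == "1box" else 1_000
```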

this experiment combines aspects of the Veil of Ignorance & Roko's Basilisk; to solve it correctly, you need to find a way to TRULY believe that 1boxing is the correct answer, accepting that deception is impossible, since you can't know whether you're the one being simulated right now.


you have a superintelligence powerful enough to outthink you 10 times out of 10, and you can't be certain your attempt at deception won't be used against you; most people's intuitions aren't equipped to handle this situation, and neither is naive decision theory, tbh

as such, this problem is really testing how willing one is to slowly think through the possibilities, rather than jumping to a solution that's satisfying in the short term but suboptimal in the long term.

and just like the marshmallow experiment, it's really a test of trust in the problem statement
