done reading https://arxiv.org/abs/2204.05212
weird, unexpected result
= setup =
give a human a reading comprehension task about a looong sci-fi story, with A/B options. give them 2 arguments with supporting quotes, one arguing for A, one for B. give them 90 seconds to read both arguments & quotes, and consult the text.
measure how often they pick the right option.
= result =
there's **no difference** in accuracy between showing them just the quotes and showing them quotes+arguments.
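rough sketch of how one could check a "no difference" claim like this: compare accuracy in the two conditions with a pooled two-proportion z-test. the function name and every count below are hypothetical placeholders, NOT numbers from the paper, and this is not necessarily the analysis the authors ran.

```python
import math

def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided pooled z-test for a difference between two accuracies.

    Hypothetical helper for illustration only.
    Returns (z, p_value).
    """
    p_a = correct_a / n_a          # accuracy in condition A (e.g. quotes only)
    p_b = correct_b / n_b          # accuracy in condition B (e.g. quotes+arguments)
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Everyone right or everyone wrong in both conditions: nothing to test.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Placeholder counts, made up for the example (not the paper's data).
z, p = two_proportion_z_test(correct_a=60, n_a=100, correct_b=60, n_b=100)
print(f"z={z:.2f}, p={p:.3f}")
```

caveat: a large p-value only means the test failed to detect a difference. to argue the conditions are actually equivalent you'd also want a confidence interval on the accuracy gap.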
@agentydragon someone needs to review the entire evidence here
In general, having Debate not be robust enough to ~always work (see also Obfuscated Arguments) is a bad sign