and we already knew that adversarial examples are often features, not bugs: https://arxiv.org/abs/1905.02175 (see the sketch below for what "adversarial example" means concretely)
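for readers who haven't seen one constructed: the linked paper's claim is about perturbations like the ones FGSM (Goodfellow et al., 2014) finds, which turn out to track predictive "non-robust features" in the data rather than noise. a minimal FGSM sketch, assuming some pretrained PyTorch classifier `model` and inputs scaled to [0, 1] (the names `model`, `eps` are placeholders, not anything from the linked paper):

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Return x perturbed one signed-gradient step to increase the loss."""
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction of the loss gradient's sign, then clamp
    # back to the valid pixel range.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

the perturbation is imperceptibly small (eps on the order of a few pixel levels) yet reliably flips the model's prediction, which is the puzzle the "features, not bugs" framing tries to dissolve.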
https://nitter.hu/giannis_daras/status/1531693104821985280#m
verdict: doesn't feel surprising to me. adversarial examples apparently have some structure, which is (some) evidence against the natural abstraction hypothesis; we will end up having to extrapolate concepts.
signalboosting this: https://old.reddit.com/r/mlscaling/comments/uznkhw/gpt3_2nd_anniversary/
insightful as usual, with some strong predictions.
I operate by Crocker's rules[1].