Characteristically open-minded + grounded post from @fhuszar on the deep learning shock to learning theory, and the looming possibility of a similar LM shock
"in 2018... I was shown a preview of... key GPT results: the ability to solve problems it wasn't explicitly trained [to do]... My immediate reaction was that this can't possibly work... this approach will never be even nearly competitive with specialised solutions"
"[Previously] I said if your objective function doesn't reflect the task, no amount of engineering or hacks will help you bridge that gap...
I have now abandoned this argument as well... we have barely a clue what inductive biases SGD on a model like GPT-3 has..."