
Characteristically open-minded + grounded post from @fhuszar on the deep learning shock to learning theory and the looming possibility of an LM shock

inference.vc/we-may-be-surpris

" 'theory predicts deep learning shouldn't work, but it does, therefore our theory is insufficient.' This seems almost trivial now, but it represented a massive shift... the theory needed fixing, not deep learning... It may have been alchemy, but some actual gold was produced."

"in 2018... I was shown a preview of... key GPT results: the ability to solve problems it wasn't explicitly trained [to do]... My immediate reaction was that this can't possibly work... this approach will never be even nearly competitive with specialised solutions"

"[Previously] I said if your objective function doesn't reflect the task, no amount of engineering or hacks will help you bridge that gap...

I have now abandoned this argument as well... we have barely a clue what inductive biases SGD on a model like GPT-3 has..."

"the fact we can't describe it doesn't mean unreasonably helpful inductive biases can't be there. evidence is mounting that they are.

As intellectually unsatisfying as this is, the LLM approach works, but most likely not for any of the reasons we know. We may be surprised again"

I also only now got the joke in his blog's TLD:

.vc references Vapnik-Chervonenkis, not venture capital.
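
For anyone who hasn't met Vapnik-Chervonenkis theory, the textbook generalization bound the pun nods at is, roughly: with probability at least 1 - \delta over n training samples,

R(h) \le \hat{R}(h) + \sqrt{ \frac{ d \left( \ln \frac{2n}{d} + 1 \right) + \ln \frac{4}{\delta} }{ n } }

where d is the VC dimension of the hypothesis class. For a deep net, d grows with the number of weights, so with far more parameters than samples the bound is vacuous, yet the network generalizes anyway. That's the "theory predicts deep learning shouldn't work" in one line.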
