Large Language Model Predictions:
@WomanCorn This feels quite true to me. (Where "new paradigm" could also just be "better activation function found").
@WomanCorn
Hm. This feels too pessimistic ("pessimistic") to me.
I guess if I take LLM very narrowly, then yes, we're running out of training data. But we have a lot of video data {{cn}} and can much more easily generate more, *and* I have an inkling that there's some alpha left in generating true training data and doing RLHF with real-world prediction.
I guess I think we can probably reduce the confabulation problem enough so that it doesn't matter *as much*.
@WomanCorn
Especially if we get good mechanistic interpretability, there'd be some nice boundary conditions to use during training ("oh, this model clearly still has circuit xyz, maybe show it datapoint 67559438 a couple more times so that it learns geography better", or even directly editing networks).
@niplav I'm not sure how much of the magic of LLMs is that the input and output are both text.
If we can get something that learns from videos, there may be more value in that.
I expect that the text -> art bots will have similar limitations, but probably decoupled from the text -> text ones.
The babble problem will not be solved. Effectively ever. It cannot be solved without a major change in architecture.