Has anyone tried to train neural networks to predict sudden drops in loss of LLM training?

We can certainly observe many scaling curves from many different tasks
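For instance (purely synthetic numbers, not from any real run), a single scaling curve of loss against compute can be fit as a power law via linear regression in log-log space; a predictor of sudden drops would presumably consume many such curves and their deviations:

```python
import numpy as np

# Hypothetical scaling data: loss L at several compute budgets C, generated
# from a known power law L = a * C**(-b) so we can check the fit recovers it.
compute = np.array([1e18, 1e19, 1e20, 1e21, 1e22])
observed_loss = 4.0 * compute ** -0.05  # synthetic: a = 4.0, b = 0.05

# Power law is linear in log-log space: log L = log a - b * log C,
# so an ordinary degree-1 polynomial fit recovers both parameters.
slope, intercept = np.polyfit(np.log(compute), np.log(observed_loss), 1)
a, b = np.exp(intercept), -slope
print(a, b)  # recovers roughly a = 4.0, b = 0.05
```

The interesting failure mode for such a fit is exactly the one discussed below: an emergent capability shows up as a deviation the smooth power law cannot capture.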

@niplav What kind of loss is that? How does that happen?


@Paradox loss as in predictive loss, a simple measure of predictive accuracy (for LLMs, typically cross-entropy on next-token prediction)

We want loss to be as low as possible, because that corresponds to good performance

Sometimes capabilities emerge with sudden falls in loss, and sometimes loss doesn't change much; we'd like to know in advance when capabilities will emerge or loss will decline sharply
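As a minimal sketch of what "sudden fall in loss" means here (with an entirely synthetic curve and a drop injected by hand; the open question above is predicting such drops before they happen, not detecting them afterwards):

```python
import numpy as np

# Hypothetical loss curve: smooth power-law decay over training steps,
# with an abrupt 0.8-unit drop injected at index 200, plus small noise.
steps = np.arange(100, 600)
loss = 5.0 * steps ** -0.2
loss[200:] -= 0.8  # the injected "emergent" sudden fall in loss
loss += np.random.default_rng(0).normal(0.0, 0.01, size=loss.shape)

# Smooth the curve, then flag steps whose one-step change is an extreme
# outlier (far below the mean change): those mark the sudden drop.
window = 10
smoothed = np.convolve(loss, np.ones(window) / window, mode="valid")
diffs = np.diff(smoothed)
threshold = diffs.mean() - 5 * diffs.std()
drop_idx = np.where(diffs < threshold)[0]
print(drop_idx)  # indices whose smoothing window spans the injected drop
```

A hindsight detector like this is trivial; the hard part is that nothing in the smooth portion of the curve obviously signals the drop in advance.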

@niplav This is a feature of LLMs I'm not educated on. I know the basics of such systems.

Mastodon
