Wait, do neural networks implement a sensible prior?

(like the speed or simplicity prior?)

If yes, which one?

@niplav The closest thing to a simplicity prior is a regularization term in the loss function with a penalty for large weights.

Having a large range of weights leads to a sort of ordinal hierarchy of floats, with some things never being able to interact again. So you get overfitting, i.e. memorizing restricted cases, i.e. higher complexity.
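
A minimal sketch of what that penalty looks like in practice, assuming PyTorch; the toy model, data, and the penalty strength are made up for illustration:

```python
# Sketch: L2 regularization adds a penalty proportional to the squared
# weight magnitudes, nudging training toward "simpler" small-weight solutions.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)           # toy model (hypothetical)
criterion = nn.MSELoss()
l2_lambda = 1e-3                   # regularization strength (placeholder value)

x, y = torch.randn(32, 10), torch.randn(32, 1)
prediction = model(x)

# Plain loss plus a penalty for large weights:
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
loss = criterion(prediction, y) + l2_lambda * l2_penalty
loss.backward()

# Equivalently, most optimizers expose this as `weight_decay`:
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=l2_lambda)
```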

@mira Makes sense, hadn't connected regularization with simplicity.

(tho I don't think I understand regularization well enough yet to see why it'd result in simplicity).

@niplav Don't they just pour a bucket of neural nodes over two buckets of training data and hardcode the bad stuff away?

@rune I think that we mostly can't reach into the resulting buckets of neural nodes and change stuff we don't like.

@niplav I was thinking they just have a word filter in front of the output and hardcode some default responses

@rune Ah, that's what you meant. Seems true 👍
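
For what it's worth, a minimal sketch of the kind of word filter @rune is describing; the blocked terms and canned response are placeholders, not anything from a real deployment:

```python
# Hypothetical output filter: swap in a default response whenever
# the model's output contains a blocked term.
BLOCKED_TERMS = {"badword1", "badword2"}          # placeholder terms
DEFAULT_RESPONSE = "Sorry, I can't help with that."

def filter_output(model_output: str) -> str:
    """Return the model output, or a canned response if it trips the filter."""
    lowered = model_output.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return DEFAULT_RESPONSE
    return model_output
```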
