I think it would be extremely cool if the "categorical cybernetics" bag of methods could say something about the relationship between inner and outer models - in particular, the observed fact that transformers learn gradient descent as one of the steps in their algorithm!

@julesh @bgavran @mc
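
(For concreteness, the construction usually behind this observation: one softmax-free linear-attention head reproduces one step of gradient descent on an in-context least-squares problem. A minimal numerical sketch, with all dimensions and values invented for illustration:)

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 16                        # input dimension, number of in-context examples
X = rng.normal(size=(n, d))         # in-context inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true                      # in-context targets y_i
x_q = rng.normal(size=d)           # query input
eta = 0.1                           # learning rate / attention scale

# One gradient-descent step on the in-context squared loss
# L(W) = 1/2 * sum_i (W x_i - y_i)^2, starting from W = 0,
# gives W_1 = eta * sum_i y_i x_i^T.
W1 = eta * (y[:, None] * X).sum(axis=0)
gd_pred = W1 @ x_q

# The same prediction computed as one softmax-free attention head:
# keys = x_i, values = y_i, query = x_q.
attn_pred = eta * sum(y_i * (x_i @ x_q) for x_i, y_i in zip(X, y))

# Both expressions equal eta * sum_i y_i <x_i, x_q>, so they agree exactly.
assert np.allclose(gd_pred, attn_pred)
```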

@ayegill @julesh @bgavran @mc maybe our paper can help somehow? You guys are probably a million miles beyond it already, but at least we have internal models in there: arxiv.org/abs/2112.13523 (see also arxiv.org/abs/2209.01619) (cc @martinbiehl)

@ayegill @julesh @bgavran @mc @martinbiehl (I mean, it won't help with understanding transformers or anything, but maybe with the fundamental question about internal vs. external models.)

@Nathaniel @ayegill @julesh @bgavran @martinbiehl absolutely Nathaniel, I love that paper. I definitely think it's the right direction. I'm myself working on a framework for goal-oriented systems inspired by it. Anyway, it'd be great to talk (tagging @manuelbaltieri too)!

@mc @ayegill @julesh @bgavran @martinbiehl @manuelbaltieri thanks, it's really made my day to know that you're inspired by it. I'd be happy to talk soon!

@mc @Nathaniel @ayegill @julesh @bgavran @martinbiehl Thanks! There's a double category lurking somewhere (or is it a triple one now?), and (cross-conversation) the PI(D) controller you are building should be the prime example of an "internal model" in control: roughly, linearise the dynamics of the system and approximate the cost function locally with a quadratic polynomial (sketched below).

Can't wait to be done with all this admin stuff and get back to finishing that (still rough) idea, hopefully borrowing some help. ;)
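
(A minimal sketch of that kind of internal model in code, assuming a discrete-time LQR setting: the controller carries a linearised model (A, B) of the plant and a local quadratic cost (Q, R), and derives its feedback gain purely from them. The matrices below are invented for illustration:)

```python
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # linearised dynamics: x' = A x + B u
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                       # local quadratic state cost
R = np.array([[0.01]])              # quadratic control cost

# Iterate the discrete-time Riccati equation to a fixed point.
P = Q.copy()
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# Act on the plant using only the controller's internal (A, B, Q, R).
x = np.array([1.0, 0.0])
for _ in range(50):
    u = -K @ x
    x = A @ x + B @ u
print("final state:", x)            # driven close to the origin
```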

@ayegill @julesh @bgavran that's something to think about for sure. It'd be great to do so with you!
