AI, language models, understanding & programming
attention models are incapable of understanding, all they can do is predict words sequentially; this is fundamentally different to the kind of thinking that humans do.
it's also fundamentally different to how a CPU works when running a program, which is to say, these models are also not very good at programming. they can predict how a programmer might write some code, but they cannot understand what is happening at the semantic level.
in some ways, they're not too different from a novice programmer who has memorized what a program looks like but has no understanding of its underlying semantics. the only difference is that these models have a much larger capacity for memorization, and a much smaller capacity for understanding.
AI, language models, understanding & programming
> attention models are incapable of understanding, all they can do is predict words sequentially
It's a widespread misunderstanding, but this is not correct! It confuses interface with implementation, incentive with strategy.
On the outside, a model predicts one word at a time. On the inside, no one really knows what goes on. For all we know, it could be e.g. deciding between candidate sentences, before returning only the next word.
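Here's a deliberately silly sketch of what I mean, just to separate interface from implementation (the scoring function is pure make-believe, not what any real model does): the outside world only ever asks for "the next word", but nothing stops the inside from weighing whole candidate continuations first.

```python
# Toy illustration of interface vs. implementation, nothing more.
# The external interface is "give me the next word"; the internal strategy
# is free to compare entire candidate continuations before answering.

def score_continuation(prefix: str, continuation: str) -> float:
    # Made-up stand-in for whatever happens inside a real model.
    return -abs(len(prefix) - len(continuation))

def next_word(prefix: str, candidates: list[str]) -> str:
    # Internally: rank whole candidate sentences...
    best = max(candidates, key=lambda c: score_continuation(prefix, c))
    # ...externally: reveal only the first word of the winner.
    return best.split()[0]

print(next_word("The cat", ["sat on the mat", "ran up the very tall tree"]))
```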
AI, language models, understanding & programming
And models are "trained" -- that is, semi-randomly and iteratively selected from among all possible models -- to be good at prediction, but once again, how exactly they end up accomplishing that is opaque to us. All those billions of weights could in principle be encoding any number of different algorithms.
More on this if you're interested:
https://benlevinstein.substack.com/p/how-to-think-about-large-language
https://benlevinstein.substack.com/p/whats-going-on-under-the-hood-of#%C2%A7representing-truth
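(If it helps, here's a toy of the "selected to be good at prediction" framing — random search over bigram tables instead of gradient descent over billions of weights, so every detail in it is made up, but the point stands: we pick models by predictive score and never specify *how* they should achieve it.)

```python
import math, random

# Toy illustration of "training as selection": candidate models are random
# bigram tables, and we keep whichever one predicts the corpus best.
# Models are selected for predictive accuracy, not for any particular
# internal strategy.

corpus = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(corpus))

def random_model():
    # A model is just a table P(next word | current word), chosen at random.
    model = {}
    for w in vocab:
        weights = [random.random() for _ in vocab]
        total = sum(weights)
        model[w] = {v: x / total for v, x in zip(vocab, weights)}
    return model

def loss(model):
    # Negative log-likelihood of the corpus under the model (lower is better).
    return -sum(math.log(model[a][b]) for a, b in zip(corpus, corpus[1:]))

best = random_model()
for _ in range(2000):
    candidate = random_model()
    if loss(candidate) < loss(best):
        best = candidate  # iterative, semi-random selection

print(f"best model's loss: {loss(best):.2f}")
```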
AI, language models, understanding & programming
@glaebhoerl to be clear, the issue with these models isn't the "sequential" part, it's the "words" part.
you can't feed a brain on words alone & expect it to develop human understanding & empathy. a quadrillion words aren't going to make up for the lack of embodied living experience.
AI, language models, understanding & programming
> so, it predicts words sequentially.
it does do that
> you can't feed a brain ... embodied living experience
that's a super interesting question and I'm inclined to agree with you, though my confidence is low and getting lower
programming and math are among the things I'd expect it's *most* possible to understand without that, though
...which you also say in toot #2, but seemingly the opposite in toot #1; could you explain?
AI, language models, understanding & programming
@typeswitch Hmm, it does seem like a stretch to infer semantics purely from source code, without any access to e.g. outputs or specs. Probably there is _some_ of that though, in the form of comments and tests. And idk how much documentation it gets.
From quick googling it also seems like it's based on top of GPT3, whose training data may contain all that other kind of stuff (e.g. articles on PL semantics) as well.
AI, language models, understanding & programming
@glaebhoerl I don't agree that it could in principle understand PL semantics. That rests on the assumption that it understands human language, which I think is a much wilder claim & one I don't believe (see rest of thread).
Whereas things like IDEs and compilers do understand PL semantics to some extent (e.g. what is being defined & where), enough to be useful. I think it's only a matter of time until someone figures out how to combine that kind of understanding with machine learning in a useful way.
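(Concretely, "what is being defined & where" is the kind of thing you can already extract mechanically — here's a rough sketch using Python's standard ast module, just to illustrate the shallow-but-real kind of semantic information I mean; the example source is obviously made up:)

```python
import ast

# A compiler-ish view of "what is being defined & where": walk the syntax
# tree and report definitions with their line numbers. This is the kind of
# (shallow but real) semantic information IDEs and compilers work with.

source = """
def area(r):
    pi = 3.14159
    return pi * r * r

class Circle:
    pass
"""

for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.FunctionDef):
        print(f"function {node.name!r} defined on line {node.lineno}")
    elif isinstance(node, ast.ClassDef):
        print(f"class {node.name!r} defined on line {node.lineno}")
    elif isinstance(node, ast.Assign):
        for target in node.targets:
            if isinstance(target, ast.Name):
                print(f"variable {target.id!r} defined on line {target.lineno}")
```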
AI, language models, understanding & programming
@typeswitch (To be clear, I'm not making any claim about what any actually existing models actually do or don't understand.)
Re-stating, the claim is that an LLM trained on this kind of corpus could, potentially, gain some understanding of math and programming and English sentences involving them.
I have no idea what it could "understand" about things with real-world referents, like cats or Chicago, but math & PL don't need much if any of that.
AI, language models, understanding & programming
Our concepts, symbols, and manipulations in math/PL are basically abstracted from the way things work in the physical world, e.g. what the number 2 means, what it means to have two of something.
If you *start out* at that level of abstraction with just symbols, obviously you can't project back down to the real world, but as long as you don't need to, it should be fine.
AI, language models, understanding & programming
@typeswitch It's perfectly possible to represent the concept of 2 with just symbols (two dots, vs. just one dot, or whatever).
Likewise, English sentences about these don't seem to need any referents that aren't available; understanding them would, at its core, boil down to understanding their logical content.
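(A minimal sketch of the "just symbols" point, in Peano style — Z, S, and add are names I'm making up for illustration; nothing in it refers to anything outside the program, yet "two" still adds up like 2:)

```python
# Peano-style numerals: the concept of 2 built from symbols alone.
# "Z" is zero, "S(n)" is the successor of n; nothing here refers to
# anything in the physical world, yet the arithmetic still works out.

def Z():
    return ("Z",)

def S(n):
    return ("S", n)

def add(a, b):
    # Defined purely by symbol rewriting: Z + b = b, S(a) + b = S(a + b).
    if a[0] == "Z":
        return b
    return S(add(a[1], b))

def count(n):
    # Only for printing: translate the symbolic numeral back to an int.
    return 0 if n[0] == "Z" else 1 + count(n[1])

two = S(S(Z()))              # "two dots" rather than one
print(count(add(two, two)))  # -> 4
```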
AI, language models, understanding & programming
@typeswitch Re-reading your initial toots, it sounds like we simply disagree about that, which is fine. But if you can pinpoint which part in particular the disagreement hinges on, I'd be interested.
AI, language models, understanding & programming
@typeswitch
So _in principle_ I think it _could_ be capable of understanding semantics, as it is. Whether it _does_, in the absence of advanced mind reading techniques, we can only try to infer empirically. (I'd guess general impressions are that it doesn't, but haven't looked closely.)
Of course I agree that a model deliberately trained that way would be more likely to attain understanding (slash likely to attain deeper/better understanding).