**Eigil Rischel - abstr/acc** @ayegill@schelling.pt · Jan 23, 2023

Does anyone use "reinforcement learning from compiler feedback" to train LLMs for code generation? It seems like, e.g., Codex was just made by doing extra training on GitHub code.

**Eigil Rischel - abstr/acc** @ayegill@schelling.pt · 2023-01-23T19:08:32Z

What I'm imagining is that you generate a bunch of completions and penalize the ones that don't compile (or that fail some other statically performable check, like producing no syntax errors).
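A rough sketch of the reward signal described above, assuming Python as the target language and using the standard library's `ast.parse` as a stand-in for "the compiler". The names `compile_reward` and `score_completions` and the +1/-1 reward values are hypothetical, and the actual RL update that would consume these rewards (e.g. a policy-gradient step) is not shown.

```python
import ast


def compile_reward(source: str) -> float:
    """Return +1.0 if `source` parses as Python, else -1.0.

    The reward values are arbitrary placeholders for illustration.
    """
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError can be raised
        # for inputs such as source containing null bytes.
        return -1.0
    return 1.0


def score_completions(completions: list[str]) -> list[float]:
    """Score each sampled completion with the static check."""
    return [compile_reward(c) for c in completions]


# Two hand-written stand-ins for model samples: the second is missing a colon.
completions = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b)\n    return a + b\n",
]
rewards = score_completions(completions)
```

A parse check only catches syntax errors; a stricter static check (type checking, linting) could be swapped in for `compile_reward` without changing the rest.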

January 23, 2023 at 7:08 PM · Mastodon Twitter Crossposter
