• The last point is especially crucial in situations where such an agent starts recursively improving itself (e.g., by training new models)
@niplav It seems like GPT-4-based AutoGPT is just too weak an optimizer to confidently extrapolate bounds from? Though it admittedly should be SOME evidence that a thing that can pass the bar exam is nevertheless basically hopeless when tasked to act as an agent.
The last points in particular might be ameliorated by literally just appending "and don't optimize too hard" and "let yourself be shut down by a human" to the prompt? Something like the sketch below.
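For concreteness, a minimal sketch of what "literally just appending" would mean (all names here are hypothetical; this assumes an AutoGPT-style setup where the agent's system prompt is a plain string, which is not how any particular framework necessarily exposes it):

```python
# Hypothetical sketch: bolt soft safety clauses onto an agent's system prompt.
# None of these identifiers come from AutoGPT itself.

SAFETY_CLAUSES = [
    "and don't optimize too hard",
    "let yourself be shut down by a human",
]


def with_safety_clauses(system_prompt: str) -> str:
    """Return the system prompt with the safety clauses appended."""
    return system_prompt.rstrip(".") + ", " + ", ".join(SAFETY_CLAUSES) + "."


base_prompt = "You are an autonomous agent. Pursue the user's goal."
print(with_safety_clauses(base_prompt))
# -> You are an autonomous agent. Pursue the user's goal, and don't optimize
#    too hard, let yourself be shut down by a human.
```

Of course, this only works to the extent that the underlying model actually follows such clauses, rather than treating them as one more soft constraint to trade off.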
Man, I feel confused, but assuming that language models aren't infested with inner optimizers, I'm now more hopeful?
Or am I missing something crucial here…