"to increase performance by 10% absolute, just take the majority-vote answer of several LM answers"
"to find hyperparams about twice as fast, start a bunch of networks training and after a while copy the weights of the one improving fastest. repeat"
https://www.deepmind.com/blog/population-based-training-of-neural-networks
"to reduce resource use by 50%(!), use a large model to do rejection sampling of small models' output"
https://arxiv.org/abs/2302.01318