Slow AF rn, but it'll soon be acceptably fast as we add more and more GEMM-capable hardware to phones.
---
RT @nonmayorpete
That is, until recently. Enter the LLaMA experiments.
Meta released a family of GPT-3-quality models to researchers starting at the end of February.
Since then, they've been leaked. And people have been pushing to get them running on smaller and smaller devices.
A quick run-through:
https://twitter.com/nonmayorpete/status/1640443506299908096