If you take a modern C++ or Rust compiler and think about optimizing it…
There’s lots of “low-hanging fruit” in the form of incrementality and parallelism.
Not truly “low-hanging” in the sense of being easy to implement; it’s actually extremely hard. But it’s easy to theorize about: it’s known to be *possible*, and capable of massive speedups in many cases.
But what about the rest? How much room is there for large speedups just by optimizing algorithms? To me that feels like much more of an unknown.
@regehr @comex I'd guess you've probably encountered this? But if not: https://github.com/Co-dfns/Co-dfns https://scholar.google.com/scholar?hl=en&as_sdt=0%2C26&q=%22A+data+parallel+compiler+hosted+on+the+gpu%22+Aaron+Hsu&btnG=
@glaebhoerl @comex don’t think so but will take a look
@glaebhoerl @regehr @comex Also see Compilation On The GPU? A Feasibility Study[1], which is in some ways more conventional than Co-dfns (it’s for a C-like language) and in some ways more ambitious (the parsing handles arbitrarily nested depth without performance compromise).
[1]: https://dl.acm.org/doi/pdf/10.1145/3528416.3530249
@raph @glaebhoerl @regehr That (‘Compilation on the GPU?’) is awesome. Of course, such a simple compiler would also be extremely fast on the CPU, and the kinds of language rules that make modern language compilers slow would be… a lot harder to parallelize. But even doing that much on the GPU is really cool.
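To make the “parsing nested structure in parallel” idea concrete, here is a minimal sketch (not from either paper, just an illustration) of the core primitive such parsers lean on: map each character to +1/−1/0 and take a prefix sum to get its nesting depth. `itertools.accumulate` stands in for a GPU scan, which runs in logarithmic depth; the point is that depth computation is a scan, not a sequential stack walk.

```python
# Sketch: bracket nesting depth via a parallel-friendly prefix sum.
# On a GPU this scan is the O(log n)-depth primitive; here we use
# itertools.accumulate as a sequential stand-in with the same result.
from itertools import accumulate

def nesting_depths(src: str) -> list[int]:
    """Depth of the structure *after* each character of src."""
    deltas = [1 if c == '(' else -1 if c == ')' else 0 for c in src]
    return list(accumulate(deltas))

depths = nesting_depths("(a(b)c)")
# A final depth of 0 (and no negative intermediate depth) means the
# brackets are balanced; equal-depth runs delimit sibling subtrees.
```

Once every token knows its depth, grouping tokens into a tree becomes a sort/segment problem, which is also data-parallel; that is roughly why “arbitrarily nested depth without performance compromise” is plausible on a GPU.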