This makes me want to instrument every division operation in a bunch of int-heavy programs and figure out what the average, median, and overall distribution of the quotients look like.
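A rough sketch of what that instrumentation could look like, in Python for brevity. Everything named here (`traced_div`, `quotient_counts`, `summarize`) is made up for illustration; a real experiment would more likely hook divisions at the compiler or binary level rather than swap in a wrapper by hand.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical global histogram: quotient value -> number of times it was seen.
quotient_counts = Counter()


def traced_div(a: int, b: int) -> int:
    """Integer-divide a by b and record the quotient.

    Note: Python's // floors, while C and CUDA truncate toward zero.
    The two agree whenever both operands are non-negative.
    """
    q = a // b
    quotient_counts[q] += 1
    return q


def summarize(counts: Counter) -> dict:
    """Return count, mean, median, and the raw histogram of recorded quotients."""
    values = list(counts.elements())  # expand the histogram back into samples
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "histogram": dict(counts),
    }
```

Replace the divisions in a workload with `traced_div(a, b)`, run it, then call `summarize(quotient_counts)` to see how the quotients are spread.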
---
RT @jperldev
Integer division is exceptionally slow on a GPU, so I've written a single-instruction, O(1) complexity integer division algorithm in CUDA. Please feel free to use it (with attribution).
https://twitter.com/jperldev/status/1639707857624141824