
GPUPrefixSums – state of the art GPU prefix sum algorithms

m-schuetz

That and https://github.com/b0nes164/GPUSorting have been a tremendous help for me, since CUB does not work nicely with the CUDA Driver API. The author is doing amazing work.

genpfault

almostgotcaught

This is missing the most important application (in today's world): extracting the non-zero elements from a sparse vector/matrix.

https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-co...
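For readers unfamiliar with the technique mentioned above, here is a minimal CPU-side sketch of how a prefix sum can drive non-zero extraction (often called stream compaction). It is not code from GPUPrefixSums or from the linked chapter; the names `exclusive_scan` and `compact_nonzero` are made up for illustration, and the sequential loops stand in for what would be parallel kernels on a GPU.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    if not values:
        return []
    inclusive = list(accumulate(values))
    return [0] + inclusive[:-1]


def compact_nonzero(vec):
    """Return (indices, values) of the non-zero entries of vec.

    Hypothetical helper: flags mark the survivors, and the exclusive
    scan of the flags gives each survivor its output slot.
    """
    if not vec:
        return [], []
    flags = [1 if x != 0 else 0 for x in vec]
    offsets = exclusive_scan(flags)
    count = offsets[-1] + flags[-1]  # total number of non-zeros
    indices = [0] * count
    values = [0] * count
    # Each iteration writes to a distinct slot, so on a GPU every i
    # could be handled by an independent thread.
    for i, x in enumerate(vec):
        if flags[i]:
            indices[offsets[i]] = i
            values[offsets[i]] = x
    return indices, values


if __name__ == "__main__":
    print(compact_nonzero([0, 3, 0, 0, 7, 1, 0]))
```

The point of the scan is that every surviving element learns its output position without coordinating with the others, which is what makes the scatter step parallel-friendly.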

merope14

Not even close. The most important application (in today's world) is radix sort.
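To make the connection to prefix sums concrete, below is a rough sequential sketch of a least-significant-digit radix sort in which each pass turns a digit histogram into scatter offsets with an exclusive scan. It assumes non-negative integer keys below 2**key_bits; the function name `lsd_radix_sort` and its parameters are illustrative only and are not taken from GPUSorting or CUB, whose GPU implementations are far more elaborate.

```python
from itertools import accumulate


def lsd_radix_sort(keys, bits_per_pass=4, key_bits=32):
    """Sort non-negative integers below 2**key_bits (assumed input range)."""
    radix = 1 << bits_per_pass
    mask = radix - 1
    keys = list(keys)
    for shift in range(0, key_bits, bits_per_pass):
        # 1. Histogram of the current digit.
        hist = [0] * radix
        for k in keys:
            hist[(k >> shift) & mask] += 1
        # 2. Exclusive prefix sum: first output slot for each digit value.
        offsets = [0] + list(accumulate(hist))[:-1]
        # 3. Stable scatter into the output buffer.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & mask
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys


if __name__ == "__main__":
    print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Step 2 is the prefix sum; on a GPU the histogram and scatter steps are parallelized as well, and the scan is what ties the per-digit counts to global output positions.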

WJW

What specific application do you have in mind in which radix sort is more important than matrix multiplication?