
This website aims to provide insight into the performance of GPUs used to speed up general-purpose GPU (GPGPU) algorithms.


The microbenchmarks (written in OpenCL) can be run with our Java app.


We invite everyone to store the results in our database. The database can be freely consulted.


The runtime of an OpenCL kernel can be estimated from its number of instructions and the performance characteristics of the chosen GPU.

With the microbenchmarks we measure specific GPU performance characteristics; each microbenchmark targets one specific characteristic.
The first point of interest is the peak computational or memory performance (measured in flops or bytes per second, respectively).
The compute intensity (number of instructions per byte read) of an algorithm determines whether the algorithm is compute-bound or memory-bound.
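As a rough illustration of this classification, the sketch below compares a kernel's compute intensity with the ratio of a GPU's peak compute rate to its peak memory bandwidth. It is a minimal model under stated assumptions, not the method used by the microbenchmarks: the function name, its parameters and the numbers in the usage note are hypothetical, and the model assumes that computation and memory transfers overlap perfectly.

```python
def classify_kernel(instructions, bytes_read, peak_ops_per_s, peak_bytes_per_s):
    """Classify a kernel and estimate its runtime with a simple max model.

    Assumption (not from the source): compute and memory traffic overlap
    perfectly, so the slower of the two determines the runtime.
    """
    # Compute intensity: instructions executed per byte read.
    intensity = instructions / bytes_read
    # Machine balance: instructions the GPU can issue per byte it can read.
    balance = peak_ops_per_s / peak_bytes_per_s

    compute_time = instructions / peak_ops_per_s
    memory_time = bytes_read / peak_bytes_per_s

    bound = "compute-bound" if intensity > balance else "memory-bound"
    return bound, max(compute_time, memory_time)
```

For a hypothetical GPU with a peak of 1e12 instructions per second and 1e11 bytes per second, a kernel that executes 1e9 instructions while reading 1e9 bytes has an intensity of 1 instruction per byte, below the machine balance of 10, so the sketch reports it as memory-bound.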
However, the peak computational or memory performance is often not achieved. Algorithms for which this happens are often called occupancy-bound or latency-bound, since the amount of concurrency is not large enough to hide all latencies. The result is an efficiency lower than 100%.
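The link between concurrency and efficiency can be sketched with Little's law, a common queueing model that is brought in here as an assumption and is not taken from the microbenchmarks themselves: the number of operations that must be in flight to sustain a given throughput equals latency times throughput. The function names and the numbers in the usage note are hypothetical.

```python
def required_concurrency(latency_cycles, peak_per_cycle):
    """Little's law: operations that must be in flight to reach peak throughput."""
    return latency_cycles * peak_per_cycle


def achieved_throughput(in_flight, latency_cycles, peak_per_cycle):
    """Throughput per cycle when only `in_flight` operations overlap.

    Assumption: every operation takes `latency_cycles` and nothing else
    limits the kernel, so throughput is capped either by the available
    concurrency or by the hardware peak.
    """
    return min(peak_per_cycle, in_flight / latency_cycles)


def efficiency(achieved, peak):
    """Fraction of the peak performance that is actually achieved."""
    return achieved / peak
```

With a hypothetical latency of 400 cycles and a peak of one operation per cycle, 400 operations must be in flight to reach the peak. A kernel that keeps only 64 operations in flight reaches 64 / 400 = 0.16 operations per cycle, an efficiency of 16%.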
