
This website provides insight into the performance of GPUs used to accelerate general-purpose GPU (GPGPU) algorithms.
The performance analysis is based on microbenchmarks and on the pipeline model, both developed by our team.


The microbenchmarks (written in OpenCL) can be run with our Java app.


We invite everyone to store their results in our database, which can be consulted freely.


The runtime of an OpenCL kernel can be estimated from its instruction count and the compute and memory throughput of the chosen GPU.


The pipeline model can be run with our Java app, which is linked to the database.

With the microbenchmarks we measure specific GPU performance characteristics; each microbenchmark targets one specific characteristic.
One is usually interested first in the peak computational or memory performance (measured in flop/s or bytes per second).
The compute intensity of an algorithm (the number of instructions per byte read) determines whether the algorithm is compute-bound or memory-bound.
Often, however, neither peak is reached. Such algorithms are called occupancy-bound or latency-bound: the amount of concurrency is not large enough to hide all latencies, resulting in an efficiency below 100%.
To understand this behavior, we are developing a semi-abstract model of the interaction between the different threads.
We model a core (compute unit) of a GPU as two pipelines, one for the computational subsystem and one for the memory subsystem; hence the name pipeline model.
