This website provides insight into the performance of GPUs
used to accelerate general-purpose GPU (GPGPU) algorithms.
The performance analysis is based on microbenchmarks and the pipeline model, both developed by our team.
With the microbenchmarks we measure specific GPU performance characteristics; each microbenchmark targets one specific characteristic.
Typically, one is first interested in the peak computational or memory performance (measured in flop/s or bytes per second).
The compute intensity of an algorithm (the number of instructions executed per byte read) determines whether the algorithm is compute-bound or memory-bound.
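This classification can be sketched in a few lines: compare the algorithm's compute intensity against the machine balance (peak compute rate divided by peak memory bandwidth). The peak numbers below are hypothetical placeholders, not measured values from any particular GPU.

```python
# Hypothetical peak numbers for illustration only.
PEAK_FLOPS = 10e12        # 10 Tflop/s (assumed)
PEAK_BANDWIDTH = 800e9    # 800 GB/s (assumed)

def classify(flops: float, bytes_read: float) -> str:
    """Compare the algorithm's compute intensity (flop per byte)
    against the machine balance (peak flop/s per peak byte/s)."""
    intensity = flops / bytes_read
    machine_balance = PEAK_FLOPS / PEAK_BANDWIDTH
    return "compute-bound" if intensity > machine_balance else "memory-bound"

# A streaming kernel doing 1 flop per 8 bytes read is memory-bound here:
print(classify(flops=1.0, bytes_read=8.0))
```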
Often, however, neither peak performance is achieved. Such algorithms are called occupancy-bound or latency-bound, since the amount of concurrency is not large enough to hide all latencies; this results in an efficiency below 100%.
To understand this behavior, we are developing a semi-abstract model of the interaction between the different threads.
We model a core (compute unit) of a GPU as two pipelines: one for the computational subsystem and one for the memory subsystem, hence the name pipeline model.
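As a very rough illustration of this two-pipeline view, one can estimate a kernel's execution time from a throughput term and a latency term per pipeline, with the slower pipeline dominating. This is only a sketch of the idea; the function, its parameters, and the latency-hiding formula below are illustrative assumptions, not the actual pipeline model.

```python
def kernel_time(n_ops: float, op_throughput: float, op_latency: float,
                n_loads: float, load_throughput: float, load_latency: float,
                concurrency: float) -> float:
    """Illustrative two-pipeline estimate (all parameters hypothetical):
    with enough concurrent threads the throughput term dominates each
    pipeline; with too few, latencies are exposed (latency-bound)."""
    compute = max(n_ops / op_throughput, op_latency * n_ops / concurrency)
    memory = max(n_loads / load_throughput, load_latency * n_loads / concurrency)
    # The slower of the two pipelines determines the overall time.
    return max(compute, memory)

# With high concurrency the kernel runs near the throughput limit;
# with low concurrency, exposed latencies make it much slower:
fast = kernel_time(1000, 100, 4, 100, 10, 400, concurrency=1000)
slow = kernel_time(1000, 100, 4, 100, 10, 400, concurrency=10)
```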