[Mesa-dev] Require micro-benchmarks for performance optimization oriented patches

Timothy Arceri t_arceri at yahoo.com.au
Fri Nov 21 02:51:57 PST 2014

On Thu, 2014-11-20 at 18:46 +0200, Eero Tamminen wrote:
> Hi,
> > Honestly, I think I'm okay with our usual metrics like:
> > - Increased FPS in a game or benchmark
> > - Reduced number of instructions or memory accesses in
> >   a shader program
> > - Reduced memory consumption
> > - Significant cycle reduction in callgrind or better generated code
> >   (ideally if it's a lot of code I'd like better justification)
> Profiling tools like callgrind are means for analyzing, not for
> measuring.
> The problem with profiler data is that the cost may have just
> been moved elsewhere, *and* grown:
> * Kcachegrind visualization for valgrind/callgrind data shows call
>    counts and relative performance.  If relative cost of a given
>    function has decreased, that still doesn't tell anything about:
>    - its absolute cost, i.e.
>    - whether cost just moved somewhere else instead of total cost
>      really decreasing

Sure, but Kcachegrind can group that relative performance by library, so
as long as you don't move the cost into another library (e.g. external
library calls) you can get an idea of whether the function has really
improved.

> * Callgrind reports instruction counts, not cycles.  While
>    they're a good indicator, it doesn't necessarily tell about
>    real performance (instruction count e.g. doesn't take into
>    account data or instruction cache misses)
> * Valgrind tracks only CPU utilization on the user-space.
>    It doesn't notice increased CPU utilization at kernel side.
> * Valgrind tracks only single process, it doesn't notice
>    increased CPU utilization in other processes (in graphics
>    perf, X server side is sometimes relevant)
> * Valgrind doesn't track GPU utilization.   Change may have
>    moved more load there.
> -> Looking just at Callgrind data is NOT enough, there must
> also be some real measurement data.
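For what it's worth, callgrind can at least approximate cycles if you enable its cache and branch simulation; Kcachegrind then shows a cycle-estimation event (CEst) alongside raw instruction counts (Ir). A sketch, using glxgears only as a stand-in target:

```shell
# Collect data with cache/branch simulation so Kcachegrind can show a
# cycle estimate (CEst) instead of bare instruction counts (Ir).
valgrind --tool=callgrind --cache-sim=yes --branch-sim=yes ./glxgears

# Print absolute, inclusive event counts per function; absolute numbers
# make it easier to spot cost that merely moved rather than disappeared.
callgrind_annotate --inclusive=yes callgrind.out.<pid>
```

This still doesn't cover kernel time, other processes, or the GPU, of course; it only sharpens the instruction-count picture.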

Sure, there are no perfect profiling tools; a similar list could be
drawn up for the downsides of using perf/oprofile.

I don't think anyone is suggesting Callgrind data should be blindly
accepted, but isn't that also the point of code review: using one's
knowledge to decide whether a change seems reasonable? This includes
looking at the generated code.

> As to measurements...
> Even large performance improvements fail to show up with
> the wrong benchmark.  One needs to know whether test-case
> performance is bound by what you were trying to optimize.
> If there's no such case known, simplest may be just to write
> a micro-benchmark for the change (preferably two, one for best
> and one for the worst case).

I did some quick googling; it looks like there have been a couple of
attempts at an open source OpenGL benchmarking suite in the past [1][2],
but neither got very far. Maybe this would make a good student project,
at least to set up the infrastructure to be built upon.

[1] http://globs.sourceforge.net/
[2] https://code.google.com/p/freedmark/

>      - Eero
> PS. while analyzing memory usage isn't harder, measuring that
> is, because memory usage can be shared (both in user-space and
> on kernel buffers), and there's a large impact difference on
> whether memory is clean or dirty (unless your problem is running
> out of 32-bit address space when clean memory is as much of a problem).
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
