[Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

Marek Olšák maraeo at gmail.com
Tue Oct 18 16:10:58 UTC 2016


On Tue, Oct 18, 2016 at 3:55 PM, Eero Tamminen
<eero.t.tamminen at intel.com> wrote:
> Hi,
>
> On 18.10.2016 16:25, Jan Ziak wrote:
>>
>> On Tue, Oct 18, 2016 at 3:12 PM, Nicolai Hähnle <nhaehnle at gmail.com>
>> wrote:
>>>
>>> On 18.10.2016 15:07, Jan Ziak wrote:
>>>>
>>>> On Tue Oct 18 09:29:59 UTC 2016, Eero Tamminen wrote:
>>>>>
>>>>> On 18.10.2016 01:07, Jan Ziak wrote:
>>>>>>
>>>>>> - The total number of executed instructions goes down from 64.184 to
>>>>>> 63.797
>>>>>>   giga-instructions when Mesa is compiled with "gcc -O0 ..."
>>>>>
>>>>>
>>>>> Please don't make performance-related decisions based on data from
>>>>> code compiled with optimizations disabled.  Use -O2 or -O3 (or even
>>>>> better, check both).
>>>>
>>>>
>>>> Options -O2 and -O3 interfere with profiling tools.
>>>>
>>>> I will try using -Og the next time.
>>>
>>>
>>> Just stop and use proper profiling tools like perf that can work with
>>> optimized binaries.
>
>
> Valgrind/callgrind/cachegrind also work fine with optimized binaries.
>
> All profiling tools lie, at least a bit. It's better to know their strengths
> and weaknesses so that one knows which ones complement each other. Perf, for
> example, is good at finding hotspots, while Valgrind (callgrind) is more
> reliable in telling how they get called.
>
> One may also need a GCC version from this decade.  Really old GCC versions
> didn't include all the debug info needed for debugging optimized binaries.

Regarding C++ templates, the GLSL compiler doesn't use them. If u_vector
(Dave Airlie?) provides the same functionality as your array, I
suggest we use u_vector instead. If you can't use u_vector, you should
ask for approval from the GLSL compiler leads (e.g. Ian Romanick or
Kenneth Graunke) to use C++ templates.


I'll repeat some stuff about profiling here but also explain my perspective.

Never profile with -O0 or with function inlining disabled. Mesa uses
-g -O2 with --enable-debug, so that's what you should use too. Don't
use any other -O* variants.

The only profiling tools reporting correct results are perf and
sysprof (both use the same mechanism). If you don't enable dwarf in
perf (sysprof can't use dwarf at all), you have to build Mesa with
-fno-omit-frame-pointer to see call trees. The only reason you would
want to enable dwarf-based call trees is when you want to see libc
calls. Without dwarf, libc calls won't be displayed or counted as part
of call trees. For Mesa developers who do profiling often,
-fno-omit-frame-pointer should be your default.
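
To make the build and perf setup above concrete, here is a minimal
sketch of a stand-alone profiling target. Everything in it is
hypothetical (the file name demo.c and the functions mix, hash_words
and run_workload are made up for illustration and are not Mesa code).
The commands in the header comment are one way to use the flags
mentioned above, not the only one.

```c
/*
 * demo.c: hypothetical profiling target (not Mesa code).
 *
 * Optimized build with debug info and frame pointers:
 *   gcc -g -O2 -fno-omit-frame-pointer -c demo.c
 *
 * After linking it into a program that calls run_workload():
 *   perf record -g ./demo                  (frame-pointer call trees)
 *   perf record --call-graph dwarf ./demo  (dwarf-based call trees)
 *   perf report
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One FNV-style mixing step. At -O2 this will most likely be inlined,
 * so it may not show up as its own frame in the call tree. */
static uint32_t mix(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 16777619u;
    return h;
}

/* Hash an array of 32-bit words; an empty array returns the seed. */
uint32_t hash_words(const uint32_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++)
        h = mix(h, words[i]);
    return h;
}

/* Hash a small buffer `rounds` times so the profiler has work to sample. */
uint32_t run_workload(unsigned rounds)
{
    uint32_t words[256];
    for (size_t i = 0; i < 256; i++)
        words[i] = (uint32_t)i;

    uint32_t acc = 0;
    for (unsigned r = 0; r < rounds; r++) {
        words[0] = r;              /* vary the input each round */
        acc ^= hash_words(words, 256);
    }
    return acc;
}
```

Call run_workload() from a main() with a large round count so that
perf collects enough samples. Keep -O2 and inlining enabled, as
described above. The flattened call tree you get is what the
optimized code really does.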

Callgrind counts calls (that one you can trust), but the reported time
is incorrect, because it uses its own virtual model of a CPU. Avoid it
if you want to measure time spent in functions.
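
As a rough illustration of why a modelled cost and measured time can
disagree, here is a hypothetical sketch (the names grid, fill_grid,
sum_row_major and sum_col_major are made up). Both functions perform
the same additions over the same data, but they walk memory in
different orders. On real hardware the wall time can differ because
of cache behavior, and a cost estimate based on a simulated CPU need
not reflect that difference accurately. Take it as an assumption to
check on your own machine, not a measured result.

```c
/*
 * Hypothetical comparison target (not Mesa code).
 *
 * With a main() that calls these functions, one could compare:
 *   valgrind --tool=callgrind ./demo
 *   callgrind_annotate callgrind.out.<pid>
 * against:
 *   perf record -g ./demo
 *   perf report
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIM 1024

static uint32_t grid[DIM][DIM];

/* Fill the grid with 0, 1, 2, ... in row-major order. */
void fill_grid(void)
{
    for (size_t i = 0; i < DIM; i++)
        for (size_t j = 0; j < DIM; j++)
            grid[i][j] = (uint32_t)(i * DIM + j);
}

/* Sequential access: consecutive addresses in the inner loop. */
uint64_t sum_row_major(void)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < DIM; i++)
        for (size_t j = 0; j < DIM; j++)
            sum += grid[i][j];
    return sum;
}

/* Strided access: the inner loop jumps DIM elements each step. */
uint64_t sum_col_major(void)
{
    uint64_t sum = 0;
    for (size_t j = 0; j < DIM; j++)
        for (size_t i = 0; i < DIM; i++)
            sum += grid[i][j];
    return sum;
}
```

The call counts for both functions are identical, which is the part
of the Callgrind output you can trust. For the time spent in each, use
the perf numbers.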

Marek

