[Mesa-dev] [PATCH] glsl: optimize list handling in opt_dead_code

Jan Ziak 0xe2.0x9a.0x9b at gmail.com
Fri Oct 28 07:03:34 UTC 2016

I would like to mention that the response Ian posted was, and still is, a nice read.

I am not sure what to write in response. Let me just note that adding
formalisms to Mesa would lead to both higher performance and higher
safety. If you do not want C++ in Mesa, maybe it would be acceptable
for Mesa developers to generate C code by processing a higher-level
description of OpenGL calls and state transitions.



On Wed, Oct 19, 2016 at 12:21 AM, Ian Romanick <idr at freedesktop.org> wrote:
> On 10/18/2016 10:12 AM, Jan Ziak wrote:
>>> Regarding C++ templates, the compiler doesn't use them. If u_vector
>>> (Dave Airlie?) provides the same functionality as your array, I
>>> suggest we use u_vector instead.
>> Let me repeat what you just wrote, because it is unbelievable: You are
>> advising the use of non-templated collection types in C++ code.
> Are you able to find any templates anywhere in the GLSL compiler?  I
> don't think his statement was ambiguous.
>>> If you can't use u_vector, you should
>>> ask for approval from GLSL compiler leads (e.g. Ian Romanick or
>>> Kenneth Graunke) to use C++ templates.
>> - You are talking about coding rules some Mesa developers agreed upon
>> and didn't bother writing down for other developers to read
> It was mostly written down, but it's not documented in the code base.
> It seems impossible to even get current, de facto practices documented.
> It's one of the few things in Mesa that really does get bike shedded.
> Before the current GLSL compiler, there was no C++ in Mesa at all.
> While developing the compiler, I found that I was re-implementing
> numerous C++ features by hand in C.  It felt pretty insane.  Why am I
> filling out all of these virtual function tables by hand?
> At the same time, I also observed that almost 100% of shipping,
> production-quality compilers were implemented using C++.  The single
> exception was GCC.  The need for GCC to bootstrap on minimal, sometimes
> dire, C compilers was the one thing keeping C++ out of the GCC code
> base.  It wasn't even that long ago that core parts of GCC had to
> support pre-C89 compilers.  As far as I am aware, they have since
> started using C++ too.  Who am I to be so bold as to declare that
> everyone shipping a C compiler is wrong?
> In light of that, I opened a discussion about using C++ in the compiler.
> Especially at that time (2008-ish), nobody working on Mesa was
> particularly skilled at C++.  I had used it some, and, in the mid-90's,
> had some really, really bad experiences with the implementations and
> side-effects of various language features.  I still have nightmares
> about trying to use templates in GCC 2.4.2.  There are quite a few C++
> features that are really easy to misuse.  There are also a lot of
> subtleties in the language that very few people really understand.
> I don't mean this in a pejorative way, but there was and continues to be
> a lot of FUD around C++.  I think a lot of this comes from the "Old
> Woman Who Swallowed a Fly" nature of solving C++ development problems.
> You have a problem.  The only way to solve that problem is to use
> another language feature that you may or may not understand how to use
> safely.  You use that feature to solve your problem.  Use of that
> feature presents a new problem.  The only way to solve the new problem
> is to use yet another language feature that you may or may not
> understand how to use safely.  Pretty soon nobody knows how anything in
> the code works.
> After quite a bit of discussion on the mesa-dev list, on #dri-devel, and
> face-to-face at XDC, we decided to use C++ with some restrictions.  The
> main restriction was that C++ would be limited to the GLSL compiler
> stack.  The other restrictions were roughly similar to the embedded C++
> subset.
>     - No exceptions.
>     - No RTTI.
>     - No multiple inheritance.
>     - No operator overloading.  It could be argued that our use of
>       placement new deviates from this.  In the previous metaphor, I
>       think this was either the spider or the bird.
>     - No templates.
> There are other restrictions (e.g., no STL) that come as natural
> consequences of these.
> Our goal was that any existing Mesa developer should be able to read any
> piece of new C++ code and know what it was doing.
> I feel like, due to our collective ignorance about the language, we may
> have been slightly too restrictive.  It seems like we could have used
> templates in some very, very restricted ways to enable things like
> iterators that would have saved typing, encouraged refactoring, and made
> the code more understandable.  Instead we have a proliferation of
> foreach macros (or callbacks), and every data structure is a linked
> list.  It's difficult to say whether it would have made things strictly
> better or led us to swallow a bird, a cat, a dog...
> I also feel like that ship has sailed.  When NIR was implemented using
> pure C, going so far as to re-invent constructors using macros, the
> chances of using more C++ faded substantially.  If, and that's a really,
> really big if, additional C++ were to be used, it would have to be
> preceded by patches to docs/devinfo.html that documented:
>     - What features were to be used.
>     - Why use of those features benefit the code base.  Specifically,
>       why use of the new feature is substantially better than a
>       different implementation that does not use the feature.
>     - Any restrictions on the use of those features.
> Such a discussion may produce additional alternatives.
>> - I am not willing to use u_vector in C++ code
> Here's the thing... Mesa is a big code base.  Maintenance is a big deal.
>  Fixing bugs, refactoring code, and tuning performance account for most
> of the time people spend working on Mesa.  If a tool exists that fits a
> need, it should be used.  Having multiple implementations of similar,
> basic functionality is a hassle for everyone involved.  A lot of work
> has been done over the last couple years to move things up into
> src/util.  Duplicate implementations of hash tables, sets, math
> functions, and other things have all been reduced.  This is a good
> trend, and it should continue.
> I've never seen any of the u_vector code or interfaces, but here is what
> I know.  If you re-invent u_vector now with a single user, it just means
> that someone will come along and refactor your code to use it later.  Is
> fast_list really substantially better than the thing that already has
> users?  If it is, why can those ideas not be applied to u_vector to make
> it better for the existing users?
>>> I'll repeat some stuff about profiling here but also explain my perspective.
>> So far (which may be a year or so), there is no indication that you
>> are better at optimizing code than me.
>>> Never profile with -O0 or disabled function inlining.
>> Seriously?
> If you don't let the compiler do its job, you can really only measure
> the O() of your algorithm.  The data about malloc and free calls is
> useful.  You can't really draw many conclusions about the real
> performance with -O0.  It's pretty fundamental to the process.
>>> Mesa uses -g -O2
>>> with --enable-debug, so that's what you should use too. Don't use any
>>> other -O* variants.
> At least at one time Fedora built with
>     -O2 -g -pipe -fstack-protector-strong --param=ssp-buffer-size=4 \
>     -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -grecord-gcc-switches \
>     -m64 -mtune=generic
> or
>     -O2 -g -pipe -fstack-protector-strong --param=ssp-buffer-size=4 \
>     -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -grecord-gcc-switches \
>     -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables
> Those are what I use for performance testing.
>> What if I find a case where -O2 prevents me from easily seeing
>> information necessary to optimize the source code?
> Then you have rediscovered the Heisenberg uncertainty principle.  It's
> one of the things that makes real performance work really hard.  By and
> large, the difficult performance problems require you to infer things
> from various bits of collected data rather than directly observing them.
>>> The only profiling tools reporting correct results are perf and
>>> sysprof.
>> I used perf on Metro 2033 Redux and saw do_dead_code() there. Then I
>> used callgrind to see some more code.
>>> (both use the same mechanism) If you don't enable dwarf in
>>> perf (also sysprof can't use dwarf), you have to build Mesa with
>>> -fno-omit-frame-pointer to see call trees. The only reason you would
>>> want to enable dwarf-based call trees is when you want to see libc
>>> calls; without dwarf, libc calls won't be displayed or counted as part of call
>>> trees. For Mesa developers who do profiling often,
>>> -fno-omit-frame-pointer should be your default.
>>> Callgrind counts calls (that one you can trust), but the reported time
>>> is incorrect,
>> Are you nuts? You cannot seriously be assuming that I didn't know about that.
>>> because it uses its own virtual model of a CPU. Avoid it
>>> if you want to measure time spent in functions.
>> I will *NOT* avoid callgrind because I know how to use it to optimize code.
>>> Marek
>> As usual, I would like to notify reviewers & mergers of this patch that I
>> am not willing to wait months to learn whether the code will be merged
>> or rejected.
>> If it isn't merged by Thursday (2016-oct-20), I will mark it as
>> rejected (rejected based on personal rather than scientific grounds).
>> Jan
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
