[Mesa-dev] shader-db, and justifying an i965 compiler optimization.
idr at freedesktop.org
Wed May 18 00:00:09 PDT 2011
On 05/18/2011 05:22 AM, Eric Anholt wrote:
> One of the pain points of working on compiler optimizations has been
> justifying them -- sometimes I come up with something I think is
> useful and spend a day or two on it, but the value doesn't show up as
> fps in the application that suggested the optimization to me. Then I
> wonder if this transformation of the code is paying off in general,
> and thus if I should push it. If I don't push it, I end up bringing
> that patch out on every application I look at that it could affect, to
> see if now I finally have justification to get it out of a private
> At a conference this week, we heard about how another team is
> using a database of (assembly) shaders, which they run through their
> compiler and count resulting instructions for testing purposes. This
> sounded like a fun idea, so I threw one together. Patch #1 is good in
This is one of those ideas that seems so obvious after you hear about it
that you can't believe you hadn't thought of it years ago. This seems
like something we'd want in piglit, but I'm not sure how that would look.
The first problem is, obviously, using INTEL_DEBUG=wm to get the
instruction counts won't work. :) Perhaps we could extend some of the
existing assembly program queries (e.g.,
GL_PROGRAM_NATIVE_INSTRUCTIONS_ARB) to GLSL. That would help even if we
didn't incorporate this into piglit.
The other problem is what the test would report for a result. Hmm...
> general (hey, link errors, finally!), but also means that a quick hack
> to glslparsertest makes it link a passing compile shader and therefore
> generate assembly that gets dumped under INTEL_DEBUG=wm. Patch #2 I
> used for automatic scraping of shaders in every application I could
> find on my system at the time. The open-source ones I pushed to:
> And finally, patch #3 is something I built before but couldn't really
> justify until now. However, given that it reduced fragment shader
> instructions 0.3% across 831 shaders (affecting 52 of them including
> yofrankie, warsow, norsetto, and gstreamer) and didn't increase
> instructions anywhere, I'm a lot happier now.
We'll probably want to be able to disable this once we have some sort of
common subexpression elimination (CSE) on the low-level IR. This sort of
optimization can cause problems
for CSE in cases where the same register is a source and a destination.
Imagine something like:

    z = sqrt(x) + y;
    z = z * w;
    q = sqrt(x) + y;

If the result of the first 'sqrt(x) + y' is written directly to z, the
value is "gone" when the second 'sqrt(x) + y' is executed. If that
result is written to a temporary register that is then copied to z, the
value is still around at the second instance.
Since we don't have any CSE, this doesn't matter now. However, it's
something to keep in mind.
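
To make the concern concrete, here is a minimal sketch of a local CSE pass
over straight-line three-address code. It is only an illustration under
stated assumptions: the tuple-based instruction format, the register names
t0/t1/t2, and the function local_cse are all hypothetical and are not taken
from the i965 backend or any Mesa code.

```python
def local_cse(instrs):
    """Local CSE with copy propagation over (dest, op, srcs) tuples.

    Returns (rewritten instructions, number of expressions reused).
    Hypothetical toy IR; not the i965 backend's representation.
    """
    available = {}  # (op, srcs) -> register currently holding that value
    copy_of = {}    # register -> register it is a plain copy of
    out = []
    reused = 0
    for dest, op, srcs in instrs:
        # Look through copies so equal values get equal keys.
        srcs = tuple(copy_of.get(s, s) for s in srcs)
        key = (op, srcs)
        holder = available.get(key)
        if holder is not None:
            # The value already lives in a register: copy instead of recompute.
            op, srcs = "mov", (holder,)
            reused += 1
        # Writing dest kills every value stored in dest or computed from dest.
        available = {k: r for k, r in available.items()
                     if r != dest and dest not in k[1]}
        copy_of = {c: o for c, o in copy_of.items()
                   if c != dest and o != dest}
        if op == "mov":
            if srcs[0] != dest:
                copy_of[dest] = srcs[0]
        elif dest not in srcs:
            available[key] = dest
        out.append((dest, op, srcs))
    return out, reused


# 'sqrt(x) + y' written directly to z, which is then overwritten.
DIRECT = [
    ("t0", "sqrt", ("x",)),
    ("z", "add", ("t0", "y")),
    ("z", "mul", ("z", "w")),
    ("t1", "sqrt", ("x",)),
    ("q", "add", ("t1", "y")),
]

# Same computation, but the sum goes to a temporary that is copied to z.
VIA_TEMP = [
    ("t0", "sqrt", ("x",)),
    ("t2", "add", ("t0", "y")),
    ("z", "mov", ("t2",)),
    ("z", "mul", ("z", "w")),
    ("t1", "sqrt", ("x",)),
    ("q", "add", ("t1", "y")),
]

if __name__ == "__main__":
    for name, prog in (("direct", DIRECT), ("via temp", VIA_TEMP)):
        rewritten, reused = local_cse(prog)
        print(name, "reused:", reused, "last:", rewritten[-1])
```

In this toy model the direct form lets the pass reuse only the sqrt, because
the sum held in z is killed by 'z = z * w'. The temporary form lets it reuse
both the sqrt and the sum, so q becomes a plain copy of t2.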
> Hopefully we hook up EXT_timer_query to apitrace soon so I can do more
> targeted optimizations and need this less :) In the meantime, I hope
> this can prove useful to others -- if you want to contribute
> appropriately-licensed shaders to the database so we track those, or
> if you want to make the analysis work on your hardware backend, feel