[Mesa-dev] r600g: status of my work on the shader optimization

Tom Stellard tom at stellard.net
Fri Feb 15 06:31:23 PST 2013


On Fri, Feb 15, 2013 at 03:00:24PM +0400, Vadim Girlin wrote:
> On 02/14/2013 02:42 PM, Christian König wrote:
> >Hi Vadim,
> >
> >nice work, I think you've made quite some progress here, but on the other
> >hand it should be clear that the LLVM backend is the future and we
> >should concentrate on that.
> 
> "LLVM backend is the future" is a pretty abstract argument. I prefer
> to operate with real facts. After a year of LLVM backend development
> what are the real benefits for the users? What are the real use
> cases where the users might prefer LLVM backend? To me this
> situation looks like the use of LLVM requires a lot more time and
> development efforts than the custom solution, despite the initial
> expectations. Maybe you are right and the LLVM backend will become
> the best alternative for users sometime in the future, but I only
> have today's results to go on:
> 
> Heaven 3.0, all settings high/enabled, 1280x720, HD5750:
>   default backend : 20.0 fps
>   llvm backend    : 18.8 fps
>   r600-sb         : 38.0 fps
> 

Hi Vadim,

A month or so ago you wrote an initial machine scheduler implementation
for the R600 LLVM backend and said it had a big impact on performance.
When you tested the LLVM backend in these tests, did you have that patch
applied?

-Tom


> When I'm looking at these results, the benefits of LLVM-based
> solution are not very clear to me.
> 
> I'm not trying to persuade anyone, just wanted to explain why I
> decided to switch back to work on the non-LLVM solution.
> 
> Anyway, it's absolutely not a problem for me if this branch never
> makes it into mesa, I was ready for this before I started. One of
> the goals of this branch was just to show that the use of LLVM is
> possibly not the best way to compile GL shaders for r600g. And
> another goal, of course, is to get better performance with r600g
> *today*, not in the future.
> 
> Vadim
> 
> >
> >To sum it up I'm not sure what we should do with this branch :)
> >
> >As Dragomir already wrote, even if the code won't be used much, the
> >know-how you gained while coding it will stay; believe me that this is
> >of far more value than the code itself.
> >
> >Christian.
> >
> >Am 14.02.2013 11:10, schrieb Dragomir Ivanov:
> >>Greetings,
> >>I hope that, even if your work will be short-lived, e.g. until LLVM
> >>bytecode compiler takes off, the know-how is still very useful.
> >>
> >>
> >>On Thu, Feb 14, 2013 at 4:04 AM, Vadim Girlin <vadimgirlin at gmail.com> wrote:
> >>
> >>    Hi,
> >>
> >>    Last month I finally found the time to work on the rewrite of my
> >>    previous shader optimization branch, now it's mostly done in terms
> >>    of the correctness of produced code and feature support (at least
> >>    on evergreen), though it's still a work in progress in terms of
> >>    the efficiency of generated shader code and the efficiency of the
> >>    backend itself.
> >>
> >>    I spent some time last year studying the LLVM infrastructure and
> >>    R600 LLVM backend and trying to improve it, but after all I came
> >>    to the conclusion that for me it might be easier to implement all
> >>    that I wanted in the custom backend. This allows for a simpler
> >>    and more efficient implementation - e.g. I don't have to deal with
> >>    CFGs because in fact we have structured code, so it's possible to
> >>    use simpler and more efficient algorithms.
> >>
> >>    Currently the branch has no regressions with piglit's
> >>    quick-driver.tests on evergreen (it doesn't rely on the fallback
> >>    to unoptimized code for the shaders with relative addressing and
> >>    other cases unlike the previous branch), and so far I don't see
> >>    any rendering issues with the apps that I used for testing -
> >>    Lightsmark 2008, Unigine Heaven 3.0 and some others. There are
> >>    also some performance improvements with the gpu-bound apps.
> >>
> >>    I tried to keep in mind the differences between chip classes, so I
> >>    hope it should only require minor fixes to make it work on
> >>    non-evergreen chips, but I doubt that it will work out of the box
> >>    - support for some non-evergreen hw-specific features is still
> >>    missing, e.g. I'm sure that indirect addressing currently won't
> >>    work on R6xx, though basic tests might work in theory. Fixing this
> >>    shouldn't require a lot of work though.
> >>
> >>    The branch can be found in my freedesktop repo:
> >>
> >>    http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-sb
> >>
> >>    Regarding the differences from the previous branch - there are
> >>    some additional optimizations, e.g. global value numbering with
> >>    some basic support for constant folding (not all instructions are
> >>    currently handled, but it's easy to extend), global code motion
> >>    that can hoist invariant code out of the loops etc. Some
> >>    optimizations that were implemented in the previous branch are not
> >>    implemented in the new branch (yet), e.g. propagation of modifiers
> >>    (I'm not even sure if it has any noticeable effect on performance).
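To make the value numbering and constant folding mentioned in the quoted paragraph above more concrete, here is a minimal sketch in Python. It is not the code from the r600-sb branch: the tuple-based mini-IR, the opcode names and the `value_number` helper are all hypothetical, and the sketch only handles a straight-line sequence in SSA form (each name assigned once), whereas the branch works globally across structured control flow.

```python
import operator

# Hypothetical mini-IR: (dest, op, src_a, src_b).
# Sources are either names (str) or integer constants.
FOLDABLE = {"ADD": operator.add, "MUL": operator.mul}
COMMUTATIVE = {"ADD", "MUL"}


def value_number(instructions):
    """Replace redundant computations with MOVs; fold constant operands."""
    known = {}      # name -> constant, or the name that already holds the value
    available = {}  # (op, a, b) -> name holding the result of that expression
    out = []
    for dest, op, a, b in instructions:
        # Substitute operands by what we already know about them.
        a = known.get(a, a)
        b = known.get(b, b)
        # Constant folding: both operands are constants and the op is foldable.
        if isinstance(a, int) and isinstance(b, int) and op in FOLDABLE:
            value = FOLDABLE[op](a, b)
            known[dest] = value
            out.append((dest, "MOV", value, None))
            continue
        # Canonical operand order so that a*b and b*a get the same key.
        key_a, key_b = sorted((a, b), key=repr) if op in COMMUTATIVE else (a, b)
        key = (op, key_a, key_b)
        if key in available:
            # Same expression was computed before: reuse its result.
            known[dest] = available[key]
            out.append((dest, "MOV", available[key], None))
        else:
            available[key] = dest
            out.append((dest, op, a, b))
    return out


program = [
    ("t0", "ADD", 2, 3),
    ("t1", "MUL", "x", "t0"),
    ("t2", "MUL", "t0", "x"),
    ("t3", "ADD", "t1", "t2"),
]
optimized = value_number(program)
```

In this example `t0` folds to the constant 5, and `t2` becomes a copy of `t1` because both compute the same product. A later dead-code or copy-propagation step, not shown here, would remove the leftover MOVs.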
> >>
> >>    Unlike the previous branch, there is support for indirect
> >>    addressing on registers - currently it uses my previously posted
> >>    patch (that was not very welcome) for obtaining the information
> >>    about addressable register ranges, but it's not required and can
> >>    be dropped, I just used that patch for testing. Without that
> >>    information opportunities for optimization are limited though, and
> >>    perhaps it makes sense to not try to optimize the shaders with
> >>    indirect gpr addressing at all and rely on the old backend until
> >>    we'll have the proper solution to pass that information to the
> >>    drivers.
> >>
> >>    There is also initial support for ALU predication, but it's not
> >>    complete and currently unused. I'm not sure whether predication
> >>    support will have a significant enough effect on performance to
> >>    justify more complex and expensive algorithms for the register
> >>    allocator and scheduler. I'll probably look into it later; I
> >>    consider it a low priority. In the case of predicated source code
> >>    (from the LLVM backend) the predication is eliminated using
> >>    speculative execution and conditional moves, the same as with the
> >>    simple if-conversion pass that is also implemented.
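As an illustration of the if-conversion idea described in the quoted paragraph above (speculative execution plus conditional moves), here is a minimal sketch in Python. It is an assumption-laden toy, not the pass from the branch: the `("if", cond, then_ops, else_ops)` node shape, the `CNDMOV` opcode name and the `make_fresh` / `if_convert` helpers are hypothetical, and the sketch assumes every operation in both branches is free of side effects so that it is safe to execute unconditionally.

```python
import itertools


def make_fresh():
    """Return a generator of unique temporary names (hypothetical helper)."""
    counter = itertools.count()
    return lambda base: f"{base}.{next(counter)}"


def if_convert(node, fresh):
    """Flatten ("if", cond, then_ops, else_ops) into straight-line code.

    Each op is (dest, opcode, src_a, src_b). Both branches are executed
    speculatively into temporaries, then one conditional move per
    destination selects the result: (dest, "CNDMOV", cond, then_val, else_val).
    """
    _, cond, then_ops, else_ops = node
    out = []
    selected = {}  # dest -> [value if cond is true, value if cond is false]
    for side, ops in enumerate((then_ops, else_ops)):
        renamed = {}  # names redefined inside this branch -> their temporaries
        for dest, op, a, b in ops:
            tmp = fresh(dest)
            out.append((tmp, op, renamed.get(a, a), renamed.get(b, b)))
            renamed[dest] = tmp
        for dest, tmp in renamed.items():
            # A branch that does not write dest keeps the old value of dest.
            selected.setdefault(dest, [dest, dest])[side] = tmp
    for dest, (then_val, else_val) in selected.items():
        out.append((dest, "CNDMOV", cond, then_val, else_val))
    return out


node = (
    "if", "c",
    [("r", "ADD", "a", "b")],
    [("r", "MUL", "a", "b"), ("s", "ADD", "r", 1)],
)
flat = if_convert(node, make_fresh())
```

Whether such a conversion pays off depends on how much speculative work is added, which is why a real pass would normally apply it only to small branches.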
> >>
> >>    The branch currently uses as source the bytecode built by the old
> >>    backend (that may also come from LLVM backend) and some additional
> >>    information (about inputs etc), final bytecode is built by the new
> >>    builder in the branch. Building two versions of the bytecode
> >>    doesn't look very efficient, but currently it simplifies
> >>    debugging. I'm planning to implement translation from TGSI
> >>    directly to my representation, it should simplify the translator
> >>    and make it possible to drop unnecessary intermediate passes.
> >>
> >>    Some old and new environment variables can be used to control the
> >>    behavior of this backend:
> >>
> >>    R600_SB - 0 - disable the new backend completely, 1 - enable
> >>    (default)
> >>
> >>    R600_SB_USE_NEW_BYTECODE - 0 - disable use of the produced
> >>    bytecode (useful if you only want to look at the dump of the
> >>    optimized shader without passing it to hw), 1 - enable (default)
> >>
> >>    R600_DUMP_SHADERS - will also dump the disassembly of the
> >>    optimized shader after the original bytecode (if the backend is
> >>    not disabled with R600_SB=0).
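The way the three variables in the quoted list above interact can be sketched as follows. This is only a model of the described behavior in Python, not the driver's code: the `env_flag` and `backend_config` helpers are hypothetical, treating any value other than "0" as enabled is an assumption, and so is the default of "off" for R600_DUMP_SHADERS when it is unset (the message does not state it).

```python
import os


def env_flag(name, default, environ=None):
    """Read a 0/1 style variable; unset means `default` (assumed parsing)."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip() != "0"


def backend_config(environ=None):
    """Model how the variables from the message combine."""
    use_sb = env_flag("R600_SB", True, environ)  # enabled by default
    return {
        # R600_SB=0 disables the new backend completely.
        "optimize": use_sb,
        # The optimized bytecode is only passed to hw if both are enabled.
        "use_new_bytecode": use_sb
        and env_flag("R600_SB_USE_NEW_BYTECODE", True, environ),
        # The optimized shader is dumped only if the backend actually ran.
        "dump_optimized": use_sb
        and env_flag("R600_DUMP_SHADERS", False, environ),
    }


# Inspect the optimized shader without passing it to the hardware.
inspect_only = backend_config(
    {"R600_SB_USE_NEW_BYTECODE": "0", "R600_DUMP_SHADERS": "1"}
)
```

With the settings in the last call the backend still runs and its output is dumped, but the original bytecode is what reaches the hardware, which matches the debugging use case mentioned in the message.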
> >>
> >>    Produced shader code is not ideal - e.g. you may notice
> >>    unnecessary MOVs inserted before DOT4 instructions, it's a known
> >>    issue and I'm going to look into it - this may require rework of
> >>    the regalloc/scheduler. I had to sacrifice some features to make
> >>    it work correctly with Heaven first, so that now I can try to
> >>    improve it while being able to test for regressions.
> >>
> >>    Also probably there are some issues with the cleanness of the code
> >>    - I had to rework some parts a few times while fixing all
> >>    problems, so there is possibly unused code and other remnants of
> >>    the previous versions. Anyway, I still consider it as a work in
> >>    progress and some things are going to be reworked.
> >>
> >>    I'm not sure what will be the destiny of this branch, taking into
> >>    account that we also have an actively developed LLVM backend that
> >>    is required for OpenCL anyway. Your opinions are welcome.
> >>
> >>    Vadim
> >>    _______________________________________________
> >>    mesa-dev mailing list
> >>    mesa-dev at lists.freedesktop.org
> >><mailto:mesa-dev at lists.freedesktop.org>
> >>    http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >>

