[Mesa-dev] r600g: status of the r600-sb branch
Vadim Girlin
vadimgirlin at gmail.com
Fri Apr 19 07:48:43 PDT 2013
Hi,
In the previous status update I said that the r600-sb branch was not
ready to be merged yet, but recently I've done some cleanups and
reworks, and though I haven't finished everything that I planned
initially, I think it's now in a better state and may be considered for
merging.
I'm interested to know whether people think that merging the r600-sb
branch makes sense at all. I'll try to explain here why it makes sense
to me.
Although I understand that the development of the llvm backend is a
primary goal for the r600g developers, it's a complicated process and
may require quite some time to achieve good results regarding
shader/compiler performance. At the same time, this branch already
works and provides good results in many cases. That's why I think it
makes sense to merge this branch as a non-default backend, at least as
a temporary solution for shader performance problems. We can always get
rid of it if it becomes too much of a maintenance burden, or when the
llvm backend catches up in terms of shader performance and compilation
speed/overhead.
Regarding the support and maintenance of this code, I'll try to do my
best to fix possible issues, and so far there are no known unfixed
issues. I tested it with many apps on evergreen, and fixed all issues
with other chips that were reported to me on the list or privately
after the last status announcement. There are no piglit regressions on
evergreen when this branch is used with either the default or the llvm
backend.
This code was intentionally separated as much as possible from the
other parts of the driver: basically there are just two functions used
from r600g, and the shader code is passed to/from r600-sb as hardware
bytecode that is not going to change. I think it won't require any
modifications at all to stay in sync with most changes in r600g.
Some work might be required, though, if we want to add support for new
hw features that are currently unused, e.g. geometry shaders, new
instruction types for compute shaders, etc., but I think I'll be able
to catch up when they are implemented in the driver and in the default
or llvm backend. E.g. this branch already works for me on evergreen
with some simple OpenCL kernels, including bfgminer, where it increases
performance of the kernel compiled with the llvm backend by more than
20% for me.
Besides the performance benefits, I think an alternative backend might
also help with debugging of the default or llvm backend; in some cases
it has helped me by exposing bugs that are not very obvious otherwise.
E.g. it may be hard to compare the dumps from the default and llvm
backends to spot a regression because they are too different, but after
processing both shaders with r600-sb the code is usually transformed
into a more common form, which often makes it easier to compare and
find the differences in shader logic.
One additional feature that might help with llvm backend debugging is
the disassembler that works on the hardware bytecode instead of the
internal r600g bytecode structs. This results in more readable shader
dumps for instructions passed in native hw encoding from the llvm
backend. I think this can also help to catch more potential bugs
related to bytecode building in r600g/llvm. Currently r600-sb uses its
bytecode disassembler for all shader dumps, including fetch shaders,
even when optimization is not enabled. Basically it can replace
r600_bytecode_disasm and related code completely.
Below are some quick benchmarks for shader performance and compilation
time, to demonstrate that currently r600-sb might provide better
performance for users, at least in some cases.
As an example of shaders with good optimization opportunities I used an
application that computes and renders atmospheric scattering effects;
it was mentioned in the previous thread:
http://lists.freedesktop.org/archives/mesa-dev/2013-February/034682.html
Here are the current results for that app (Main.noprecompute, frames
per second) with the default backend, default backend + r600-sb, and
llvm backend:
def     def+sb  llvm
240     590     248
Another quick benchmark is OpenCL kernel performance with bfgminer
(megahash/s):
llvm    llvm+sb
68      87
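As a quick sanity check on the numbers above, the relative gains work out to roughly 2.5x for the scattering demo and just under 30% for the bfgminer kernel (plain arithmetic on the reported figures, nothing more):

```python
# Relative speedups computed from the benchmark numbers reported above.
atmosphere_fps = {"def": 240, "def+sb": 590, "llvm": 248}
bfgminer_mhs = {"llvm": 68, "llvm+sb": 87}

gl_speedup = atmosphere_fps["def+sb"] / atmosphere_fps["def"]         # ~2.46x
cl_gain = (bfgminer_mhs["llvm+sb"] / bfgminer_mhs["llvm"] - 1) * 100  # ~28%

print(f"GL: {gl_speedup:.2f}x, CL: +{cl_gain:.0f}%")
```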
One more benchmark is for compilation speed/overhead. I used two piglit
tests: the first compiles a lot of shaders (IIRC more than a thousand),
the second compiles a few huge shaders. The result is the test run time
in seconds; this includes more than just the compilation time, but it
still shows the difference:
                    def     def+sb  llvm
tfb max-varyings    10      14      53
fp-long-alu         0.17    0.38    0.68
This is especially important for GL apps, because longer compilation
time results in more significant freezes in games etc. As for the
quality of the compiled code in this test, the llvm backend is of
course already able to produce better code in some cases, but e.g. for
the longest shader from the fp-long-alu test both backends optimize it
down to two alu instructions.
Of course this branch won't magically make all applications faster;
many older apps are not really limited by shader performance at all,
but I think it might improve performance for many relatively modern
applications/engines, e.g. applications based on the Unigine and Source
engines.
The branch itself can be found here:
http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-sb
You might prefer to browse new files in a tree instead of reading a huge
patch:
http://cgit.freedesktop.org/~vadimg/mesa/tree/src/gallium/drivers/r600/sb?h=r600-sb
If you'd like to test it: currently the optimization for GL shaders is
enabled by default and can be disabled with R600_SB=0. Optimization for
compute shaders is not enabled by default because it's still very
limited and experimental; it can be enabled with R600_SB_CL=1.
Disassembly of the optimized shaders is printed with
R600_DUMP_SHADERS=2.
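For example, a test run might look like this (just a sketch; the application names are placeholders for whatever you want to test):

```shell
# GL shader optimization is on by default; disable it to compare
# against the plain default backend:
R600_SB=0 some_gl_app

# Enable the experimental compute shader optimization and print the
# disassembly of the optimized shaders:
R600_SB_CL=1 R600_DUMP_SHADERS=2 some_cl_app
```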
If you think that merging the branch makes sense, any
comments/suggestions about what is required to prepare the branch for
merging are welcome.
Vadim