[Mesa-dev] r600g: status of my work on the shader optimization
vadimgirlin at gmail.com
Mon Feb 25 05:45:39 PST 2013
On 02/24/2013 11:25 PM, Vadim Girlin wrote:
> On 02/24/2013 11:01 PM, Henri Verbeet wrote:
>> On 24 February 2013 17:07, Vadim Girlin <vadimgirlin at gmail.com> wrote:
>>> If you'd like to help me with debugging the issues on cayman, then
>>> read the "regression debugging" section in the r600-sb branch notes
>>>  (or
>>> possibly more up-to-date source here - ), it explains how to find the
>>> exact shader that causes regression. After locating the guilty
>>> shader, you
>>> only need to prepend
>>> R600_DUMP_SHADERS=2 R600_SB_DUMP=3
>>> to the command line to produce the full dump for that shader, then
>>> send it to me, and I'll do my best to fix the issue.
>> I briefly looked at it yesterday, but ran out of time. I did find out
>> that all of the failures except one (in loop_index_test()) go away if
>> I skip the if-conversion pass. I have the impression that the
>> PRED_SET* instructions like PRED_SETNE_INT don't behave quite the way
>> the ISA docs claim they do wrt. the value written to the destination
>> register, but instead seem to behave more like the regular SET*
>> instructions in that regard. I didn't properly test this, so it's more
>> of a theory at this point, and may just be wrong. Of course we don't
>> really want to use the PRED_SET* instructions to begin with if we're
>> not going to actually use the predicate, so we'll probably just want
>> to convert them to SET* instructions anyway.
> Yeah, I see now, I missed the fact that PRED_SETcc instructions on
> cayman can't be used as simple ALU instructions without side effects,
> they always update exec_mask.
Well, it seems I misunderstood the docs, in fact PRED_SET instructions
can be used when UPDATE_EXEC_MASK=0. I just thought that
ALU_WORD1_OP2_EXECUTE_MASK encoding is always used with them, and in the
description of this encoding value 0 for UPDATE_EXEC_MASK is marked as
"reserved", so I thought that they should always have UPDATE_EXEC_MASK=1.
Now I think you are right and probably the cause is that the value
written to dst register is not what I expect. Anyway, I think the best
solution is still the same as you mentioned - get rid of PRED_SETcc
instructions when the update of exec mask or predicate is not required.
>> Somewhat related to that, wined3d generates GLSL of the form "dst =
>> (src0 < src1) ? 1.0 : 0.0;" for the D3D "slt" instruction (and similar
>> code for instructions like sge, etc.). The TGSI for that looks pretty
>> awful, first doing a SLT, converting the result to an integer, and
>> then branching on that to assign either 1.0 or 0.0 again. The
>> if-conversion pass is fairly helpful there in the sense that it at
>> least gets rid of the branches, but you still end up with a sequence
>> like SETGE, FLT_TO_INT, PRED_SETNE_INT, CNDE_INT, while all that's
>> really needed is the SETGE. That's probably best addressed in either
>> the GLSL compiler or the GLSL -> TGSI stage though.
> I just implemented the simplest way of if-conversion for now, but the
> first thing that I'm going to add to the (currently empty) peephole pass
> is the optimization of SETcc, PRED_SETcc, etc, including the conversions
> between float/int bools, to get rid of all unnecessary operations.
>> Unfortunately I won't be able to test with that system again until at
>> least Thursday, so it'll be a while before I can actually do anything
>> about it.
> Well, I see the problem now, so there is no need for any special testing
> until I'll implement something to fix it.
> Anyway, you helped me to understand what's wrong, thanks.
More information about the mesa-dev