[Mesa-dev] r600g: status of my work on the shader optimization

Vadim Girlin vadimgirlin at gmail.com
Sun Feb 24 11:25:18 PST 2013


On 02/24/2013 11:01 PM, Henri Verbeet wrote:
> On 24 February 2013 17:07, Vadim Girlin <vadimgirlin at gmail.com> wrote:
>> If you'd like to help me with debugging the issues on cayman, then please
>> read the "regression debugging" section in the r600-sb branch notes [1] (or
>> possibly more up-to-date source here - [2]), it explains how to find the
>> exact shader that causes regression. After locating the guilty shader, you
>> only need to prepend
>>
>>      R600_DUMP_SHADERS=2 R600_SB_DUMP=3
>>
>> to the command line to produce the full dump for that shader, then please
>> send it to me, and I'll do my best to fix the issue.
>>
> I briefly looked at it yesterday, but ran out of time. I did find out
> that all of the failures except one (in loop_index_test()) go away if
> I skip the if-conversion pass. I have the impression that the
> PRED_SET* instructions like PRED_SETNE_INT don't behave quite the way
> the ISA docs claim they do wrt. the value written to the destination
> register, but instead seem to behave more like the regular SET*
> instructions in that regard. I didn't properly test this, so it's more
> of a theory at this point, and may just be wrong. Of course we don't
> really want to use the PRED_SET* instructions to begin with if we're
> not going to actually use the predicate, so we'll probably just want
> to convert them to SET* instructions anyway.

Yeah, I see now, I missed the fact that PRED_SETcc instructions on 
cayman can't be used as simple ALU instructions without side effects, 
they always update exec_mask.

> Somewhat related to that, wined3d generates GLSL of the form "dst =
> (src0 < src1) ? 1.0 : 0.0;" for the D3D "slt" instruction (and similar
> code for instructions like sge, etc.). The TGSI for that looks pretty
> awful, first doing a SLT, converting the result to an integer, and
> then branching on that to assign either 1.0 or 0.0 again. The
> if-conversion pass is fairly helpful there in the sense that it at
> least gets rid of the branches, but you still end up with a sequence
> like SETGE, FLT_TO_INT, PRED_SETNE_INT, CNDE_INT, while all that's
> really needed is the SETGE. That's probably best addressed in either
> the GLSL compiler or the GLSL -> TGSI stage though.
>

I just implemented the simplest way of if-conversion for now, but the 
first thing that I'm going to add to the (currently empty) peephole pass 
is the optimization of SETcc, PRED_SETcc, etc, including the conversions 
between float/int bools, to get rid of all unnecessary operations.

> Unfortunately I won't be able to test with that system again until at
> least Thursday, so it'll be a while before I can actually do anything
> about it.

Well, I see the problem now, so there is no need for any special testing 
until I'll implement something to fix it.

Anyway, you helped me to understand what's wrong, thanks.

Vadim


More information about the mesa-dev mailing list