[Mesa-dev] r600g: status of my work on the shader optimization
Vadim Girlin
vadimgirlin at gmail.com
Tue Feb 26 06:05:37 PST 2013
On 02/24/2013 11:01 PM, Henri Verbeet wrote:
> On 24 February 2013 17:07, Vadim Girlin <vadimgirlin at gmail.com> wrote:
>> If you'd like to help me with debugging the issues on cayman, then please
>> read the "regression debugging" section in the r600-sb branch notes [1] (or
>> possibly more up-to-date source here - [2]), it explains how to find the
>> exact shader that causes regression. After locating the guilty shader, you
>> only need to prepend
>>
>> R600_DUMP_SHADERS=2 R600_SB_DUMP=3
>>
>> to the command line to produce the full dump for that shader, then please
>> send it to me, and I'll do my best to fix the issue.
>>
> I briefly looked at it yesterday, but ran out of time. I did find out
> that all of the failures except one (in loop_index_test()) go away if
> I skip the if-conversion pass. I have the impression that the
> PRED_SET* instructions like PRED_SETNE_INT don't behave quite the way
> the ISA docs claim they do wrt. the value written to the destination
> register, but instead seem to behave more like the regular SET*
> instructions in that regard. I didn't properly test this, so it's more
> of a theory at this point, and may just be wrong. Of course we don't
> really want to use the PRED_SET* instructions to begin with if we're
> not going to actually use the predicate, so we'll probably just want
> to convert them to SET* instructions anyway.
>
I've pushed some changes, it doesn't use the result of PRED_SET anymore.
> Somewhat related to that, wined3d generates GLSL of the form "dst =
> (src0 < src1) ? 1.0 : 0.0;" for the D3D "slt" instruction (and similar
> code for instructions like sge, etc.). The TGSI for that looks pretty
> awful, first doing a SLT, converting the result to an integer, and
> then branching on that to assign either 1.0 or 0.0 again. The
> if-conversion pass is fairly helpful there in the sense that it at
> least gets rid of the branches, but you still end up with a sequence
> like SETGE, FLT_TO_INT, PRED_SETNE_INT, CNDE_INT, while all that's
> really needed is the SETGE. That's probably best addressed in either
> the GLSL compiler or the GLSL -> TGSI stage though.
I think this will now become SETGE_DX10, CNDE_INT, I'll probably improve
the handling of such cases to leave just a SETGE.
Vadim
>
> Unfortunately I won't be able to test with that system again until at
> least Thursday, so it'll be a while before I can actually do anything
> about it.
>
More information about the mesa-dev
mailing list