[Mesa-dev] Suboptimal code generation

Ilia Mirkin imirkin at alum.mit.edu
Fri Nov 14 11:39:32 PST 2014


On Fri, Nov 14, 2014 at 1:38 PM, Henri Verbeet <hverbeet at gmail.com> wrote:
> On 14 November 2014 18:50, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> I can't speak for the radeon guys, but I know I sure would love to see
>> any reports of poor code being generated by nouveau in response to
>> legitimate-seeming TGSI (or GLSL). In some cases, a simple
>> optimization can be added to take care of it, and I'd definitely
>> appreciate the extra pair of eyeballs on driver-generated code :)
>>
>> The report can be as simple as "here is the TGSI snippet, take a look
>> at how crappy the code it generates is". At least for nouveau, I can
>> feed that directly into a compiler that can target any of the relevant
>> backends.
>>
>> [Note, r600g didn't have an optimizer enabled until ~1y ago; not sure
>> if your analysis was with or without sb.]
>>
> It was with sb, but probably before TGSI got FSLT/FSGE/etc.
>
> For reference, what currently happens for r600g is something like this:
>
> D3D:
>     cnd r[0], r[0].w, c[1], c[2]
>
> GLSL:
>     R0.xyzw = (R0.w > 0.5 ? ps_c[1].xyzw : ps_c[2].xyzw);
>
> TGSI:
>     FSLT TEMP[0].x, IMM[0].xxxx, TEMP[0].xxxx
>     UIF TEMP[0].xxxx :0
>       MOV TEMP[0], CONST[1]
>     ELSE :0
>       MOV TEMP[0], CONST[2]
>     ENDIF
>
> R600:
>     SETGE_DX10         T0.x,  0.5, T0.x
>     CNDE_INT           R0.x,  T0.x, KC0[1].x, KC0[2].x
>     CNDE_INT           R0.y,  T0.x, KC0[1].y, KC0[2].y
>     CNDE_INT           R0.z,  T0.x, KC0[1].z, KC0[2].z
>     CNDE_INT           R0.w,  T0.x, KC0[1].w, KC0[2].w
>
> While ideally that would just be 4 CNDGE's, that's better than what I
> remember. IIRC there used to be a bunch of int/float conversions as
> well.
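To make the comparison in the quoted listing concrete, here is a minimal Python sketch that models the TGSI snippet and the R600 output and checks that they select the same constant. The opcode semantics are assumptions based on my reading of the listings: FSLT and SETGE_DX10 produce an all-ones integer mask when true and 0 when false, and CNDE_INT picks its second operand when the first is 0. The function names are hypothetical, and NaN inputs are deliberately not considered.

```python
TRUE_MASK = 0xFFFFFFFF  # assumed "true" value for FSLT / SETGE_DX10


def tgsi_path(temp0_x, imm0_x, const):
    """Model of the quoted TGSI: FSLT followed by UIF/ELSE/ENDIF."""
    # FSLT TEMP[0].x, IMM[0].xxxx, TEMP[0].xxxx  ->  mask = (imm < temp) ? ~0 : 0
    cond = TRUE_MASK if imm0_x < temp0_x else 0
    # UIF TEMP[0].xxxx: take the branch when the integer mask is non-zero
    if cond != 0:
        return list(const[1])  # MOV TEMP[0], CONST[1]
    return list(const[2])      # MOV TEMP[0], CONST[2]


def r600_path(t0_x, kc0):
    """Model of the quoted R600 code: one SETGE_DX10 plus four CNDE_INT."""
    # SETGE_DX10 T0.x, 0.5, T0.x  ->  mask = (0.5 >= x) ? ~0 : 0
    mask = TRUE_MASK if 0.5 >= t0_x else 0
    # CNDE_INT R0.c, T0.x, KC0[1].c, KC0[2].c  ->  (mask == 0) ? KC0[1].c : KC0[2].c
    return [kc0[1][c] if mask == 0 else kc0[2][c] for c in range(4)]


# Hypothetical constant buffer contents, only for illustration.
CONSTS = {1: [1.0, 1.0, 1.0, 1.0], 2: [2.0, 2.0, 2.0, 2.0]}

for x in (-1.0, 0.0, 0.5, 0.75, 10.0):
    assert tgsi_path(x, 0.5, CONSTS) == r600_path(x, CONSTS)
```

Under these assumptions the branch has already been flattened into per-component selects; what remains is the separate compare instruction feeding integer selects.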

In the future, a full TGSI program would be preferred, since then it
can just be fed in directly. For this one (with a few assumptions baked
in about the immediate, where TEMP[0] comes from, etc.), targeting nvc0
(GF100) gives:

00000000: fff01c06 06000000     ld b32 $r0 a[0x0] 0x0 unk39
00000008: fc01dc00 220e0000     set $p0 0x1 gt f32 $r0 0x0
00000010: 43f000c6 14000000     $p0 ld b128 $r0q c0[0x10]
00000018: 83f020c6 14000000     (not $p0) ld b128 $r0q c0[0x20]
00000020: 03f01c66 0a7e0000     st b128 a[0x0] $r0:$r1:$r2:$r3 0x0 unk39

Which seems pretty reasonable.
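
For readers unfamiliar with the disassembly, the following Python sketch models the nvc0 listing as a predicate set followed by two oppositely predicated 128-bit constant loads. It is an illustration under assumptions, not the compiler's own logic: `c0` is modeled as a flat list of 32-bit floats addressed by byte offset, the compare immediate is a parameter defaulting to the 0x0 shown in the dump, and the helper names are hypothetical.

```python
def load_b128(cbuf, byte_offset):
    """Load four consecutive 32-bit values (128 bits) from a constant buffer."""
    first = byte_offset // 4  # each modeled element is 4 bytes wide
    return cbuf[first:first + 4]


def nvc0_path(r0, c0, imm=0.0):
    """Model of the listing: set $p0, then two predicated ld b128."""
    # set $p0 ... gt f32 $r0 <imm>  ->  p0 = (r0 > imm)
    p0 = r0 > imm
    regs = [r0, None, None, None]  # $r0..$r3; only $r0 is defined on entry
    # $p0 ld b128 $r0q c0[0x10]: executes only when p0 is true
    if p0:
        regs = load_b128(c0, 0x10)
    # (not $p0) ld b128 $r0q c0[0x20]: executes only when p0 is false
    if not p0:
        regs = load_b128(c0, 0x20)
    return regs


# Hypothetical constant buffer: element i holds float(i), so with 16-byte
# vec4 slots c0[0x10] corresponds to CONST[1] and c0[0x20] to CONST[2].
c0 = [float(i) for i in range(16)]
print(nvc0_path(1.0, c0))   # the load from c0[0x10] is the one that executes
print(nvc0_path(-1.0, c0))  # the load from c0[0x20] is the one that executes
```

Exactly one of the two loads takes effect for any input, so no branch is needed and all four components are written by a single instruction.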

  -ilia