[Lima] [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

Wed Jan 2 16:49:48 UTC 2019

On Wed, Jan 2, 2019 at 11:17 AM Jason Ekstrand <jason at jlekstrand.net> wrote:
>
> On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>
>> Have a look at the first 4 patches in the series from Jonathan Marek
>> to address some of these issues:
>>
>> https://patchwork.freedesktop.org/series/54295/
>>
>> Not sure exactly what state that work is in, but I've added Jonathan
>> to CC, perhaps he can provide an update.
>>
>> Cheers,
>>
>>   -ilia
>>
>> On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu <yuq825 at gmail.com> wrote:
>> >
>> > Hi guys,
>> >
>> > I found the problem with this test fragment shader when lima development:
>> > uniform int color;
>> > void main() {
>> >     if (color > 1)
>> >         gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
>> >     else
>> >         gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
>> > }
>> >
>> > nir_print_shader output:
>> > impl main {
>> >         block block_0:
>> >         /* preds: */
>> >         vec1 32 ssa_0 = load_const (0x00000001 /* 0.000000 */)
>> >         vec4 32 ssa_1 = load_const (0x3f800000 /* 1.000000 */,
>> > 0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /*
>> > 1.000000 */)
>> >         vec4 32 ssa_2 = load_const (0x00000000 /* 0.000000 */,
>> > 0x3f800000 /* 1.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /*
>> > 1.000000 */)
>> >         vec1 32 ssa_3 = load_const (0x00000000 /* 0.000000 */)
>> >         vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, 0) /*
>> > base=0 */ /* range=1 */ /* component=0 */   /* color */
>> >         vec1 32 ssa_5 = slt ssa_0, ssa_4
>> >         vec1 32 ssa_6 = fnot ssa_5
>> >         vec4 32 ssa_7 = bcsel ssa_6.xxxx, ssa_2, ssa_1
>> >         intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* base=0 */
>> > /* wrmask=xyzw */ /* component=0 */       /* gl_FragColor */
>> >         /* succs: block_1 */
>> >         block block_1:
>> > }
>> >
>> > ssa0 is not converted to float when glsl to nir. I see glsl_to_nir.cpp
>> > will create flt/ilt/ult
>> > based on source type for gpu support native integer, but for gpu not
>> > support native
>> > integer, just create slt for all source type. And in
>> > nir_lower_constant_initializers,
>> > there's also no type conversion for integer constant.
>
>
> This is a generally sticky issue.  In NIR, we have no concept of types on SSA values which has proven perfectly reasonable and actually very powerful in a world where integers are supported natively.  Unfortunately, it causes significant problems for float-only architectures.  There are two general possible solutions:
>
>  1. convert all integers to floats in glsl_to_nir and prog_to_nir and adjust various lowering/optimization passes to handle nir_compiler_options::supports_native_integers == false.
>
>  2. Just allow integers all the way through until you get very close to the end and then lower integers to floats at the last possible moment.
>
> Both of these come with significant issues.  With the first approach, there are potentially a lot of passes that will need to be adjusted and it's not 100% clear what to do with UBO offsets and indirect addressing; fortunately, you should be able to disable most of those optimizations to get going so it shouldn't be too bad.  The second would be less invasive to NIR because it doesn't require modifying as many passes.  However, doing such a lowering would be very tricky to get right primarily because of constants.  With everything else, you can just sort of assume that inputs are floats (you lowered, right?) and lower to produce a float output.  With constants, however, you don't know whether or not it's an integer that needs lowering.  We could, in theory, add an extra bit to load_const to solve this problem but there are also significant problems with doing that so it's not clear it's a good idea.
>
> I think the patches from Marek (as indicated by ilia) attempt the first approach.  If we can do it practically, my suspicion is that the first will work better than the second.  However, it will take some experimentation in order to actually determine that.

Just a handful of thoughts, which I suspect Jason is already well
aware of, but I'll point them out anyways:

 - On nv30, uniform indirect addressing consumes a float value
directly. Other architectures have a dedicated "address register" used
for such things, which needs to be loaded with a float (but is
internally stored in an opaque manner). However direct addressing
takes an integer, and indirect addressing may be able to take a "bias"
integer offset.
 - All uniform values should be coming in as float, even if they say
"int" or "bool" on them -- this is what mesa core already does.
 - Bools (consumed by branches, etc) are awkward everywhere. I think
the recent rework to make them 1-bit generics enables the awkwardness
to be passed on to the backend where it belongs, which means everyone
can just handle it however they want.

The main point I'm trying to make is that there are only a handful of
practical architectures that don't support integers, and they tend to
be very limited in their featureset (ES 2.0, GL 2.1, DX9) -- no UBOs,
no TBOs, no stream-out, no bitwise ops, no shifts.

Cheers,

  -ilia