[Mesa-dev] [PATCH 09/12] nir: Add lowering support for packing opcodes.

Mon Feb 1 18:38:42 UTC 2016

On Thu, Jan 28, 2016 at 11:44 PM, Iago Toral <itoral at igalia.com> wrote:
> On Thu, 2016-01-28 at 09:21 -0800, Matt Turner wrote:
>> On Thu, Jan 28, 2016 at 12:32 AM, Iago Toral <itoral at igalia.com> wrote:
>> > On Mon, 2016-01-25 at 15:18 -0800, Matt Turner wrote:
> (...)
>> >> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
>> >> index b761b54..56b0f5e 100644
>> >> --- a/src/glsl/nir/nir_opt_algebraic.py
>> >> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> >> @@ -258,6 +258,26 @@ optimizations = [
>> >>     (('extract_uword', a, b),
>> >>      ('iand', ('ushr', a, ('imul', b, 16)), 0xffff),
>> >>      'options->lower_extract_word'),
>> >> +
>> >> +    (('pack_unorm_2x16', 'v'),
>> >> +     ('pack_uvec2_to_uint',
>> >> +        ('f2u', ('fround_even', ('fmul', ('fsat', 'v'), 65535.0)))),
>> >> +     'options->lower_pack_unorm_2x16'),
>> >> +
>> >> +    (('pack_unorm_4x8', 'v'),
>> >> +     ('pack_uvec4_to_uint',
>> >> +        ('f2u', ('fround_even', ('fmul', ('fsat', 'v'), 255.0)))),
>> >> +     'options->lower_pack_unorm_4x8'),
>> >> +
>> >> +    (('pack_snorm_2x16', 'v'),
>> >> +     ('pack_uvec2_to_uint',
>> >> +        ('f2i', ('fround_even', ('fmul', ('fmin', 1.0, ('fmax', -1.0, 'v')), 32767.0)))),
>> >> +     'options->lower_pack_snorm_2x16'),
>> >> +
>> >> +    (('pack_snorm_4x8', 'v'),
>> >> +     ('pack_uvec4_to_uint',
>> >> +        ('f2i', ('fround_even', ('fmul', ('fmin', 1.0, ('fmax', -1.0, 'v')), 127.0)))),
>> >> +     'options->lower_pack_snorm_4x8'),
>> >
>> > I think the pack_snorm_* opcodes need a i2u conversion at the end.
>> > That's what the GLSL IR lowering is doing and also what the spec [1]
>> > seems to indicate:
>>
>> Right, but since NIR operands are typeless, there's nothing to do (NIR
>> doesn't even have i2u/u2i).
>
> I suppose that since these pack the incoming vector components into an
> uint it does not really matter in the end, since that won't affect the
> bits involved. Anyway, why not use f2u instead of f2i, seems like that
> would represent the semantics expected more accurately.

Changing it to f2u causes the fs-packsnorm2x16, fs-packsnorm4x8, and
vs-packsnorm2x16 tests to fail (presumably not vs-packsnorm4x8 because
of how it's directly implemented in the i965/vec4 backend).

This is the assembly diff of f2i -> f2u of the vs-packsnorm2x16 test:

-mov(8)          g16<1>.xyD      g15<4,4,1>.xyyyF
+mov(8)          g16<1>.xyUD     g15<4,4,1>.xyyyF
 shl(8)          g18<1>.xUD      g16<4,4,1>.yUD  0x00000010UD
 and(8)          g19<1>.xUD      g16<4,4,1>.xUD  0x0000ffffUD
 or(8)           g17<1>.xUD      g18<4,4,1>.xUD  g19<4,4,1>.xUD

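I believe that negative floating-point values are converted into 0 if
the destination type is UD -- the 2.4.1 Float to Integer section in
the IVB PRM seems to confirm that behavior. With f2u, any negative
snorm component would then be clamped to 0 before the shl/and/or
packing, which would explain the test failures.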
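
To illustrate the difference, here is a rough Python model of the
pack_snorm_2x16 lowering with either conversion. It is only a sketch:
the helper names (f2i, f2u, pack_snorm_2x16) are made up for this
example, and the f2u model assumes that a UD destination clamps
negative inputs to 0, as described above.

```python
def f2i(x):
    """Model of float -> signed 32-bit (D destination).

    Returns the raw 32-bit pattern, so negative values keep their
    two's-complement bits.
    """
    v = int(x)  # input is already rounded, so this is exact
    v = max(-2**31, min(2**31 - 1, v))
    return v & 0xFFFFFFFF


def f2u(x):
    """Model of float -> unsigned 32-bit (UD destination).

    Assumption: negative inputs are clamped to 0.
    """
    v = int(x)
    return max(0, min(2**32 - 1, v))


def pack_snorm_2x16(v, convert):
    """Clamp to [-1, 1], scale by 32767, round to even, convert, pack."""
    lanes = []
    for c in v:
        clamped = min(1.0, max(-1.0, c))
        # Python's round() rounds halves to even, like fround_even.
        rounded = round(clamped * 32767.0)
        lanes.append(convert(float(rounded)))
    # Mirrors the shl / and / or sequence in the assembly.
    hi = (lanes[1] << 16) & 0xFFFFFFFF
    lo = lanes[0] & 0x0000FFFF
    return hi | lo


if __name__ == "__main__":
    v = (-1.0, 1.0)
    print(hex(pack_snorm_2x16(v, f2i)))
    print(hex(pack_snorm_2x16(v, f2u)))
```

For the input (-1.0, 1.0) this model gives 0x7fff8001 with f2i and
0x7fff0000 with f2u: the low 16 bits of the negative component are
lost in the second case.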