[Mesa-dev] [PATCH 3/9] glsl, nir: Make ir_quadop_bitfield_insert a vectorized operation.
Ilia Mirkin
imirkin at alum.mit.edu
Mon Jan 11 15:25:44 PST 2016
On Mon, Jan 11, 2016 at 6:18 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> On Mon, Jan 11, 2016 at 6:13 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> On Mon, Jan 11, 2016 at 2:57 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>> On Mon, Jan 11, 2016 at 5:48 PM, Matt Turner <mattst88 at gmail.com> wrote:
>>>> From: Kenneth Graunke <kenneth at whitecape.org>
>>>>
>>>> We would like to be able to combine
>>>>
>>>> result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
>>>> result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
>>>> result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
>>>> result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);
>>>>
>>>> into a single ivec4 bitfieldInsert operation. This should be possible
>>>> with most drivers.
>>>>
>>>> This patch changes the offset and bits parameters from scalar ints
>>>> to ivecN or uvecN. The type of all four operands will be the same,
>>>> for simplicity.
>>>>
>>>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>>>> Reviewed-by: Matt Turner <mattst88 at gmail.com>
>>>> ---
>>>> [mattst88] v2:
>>>> Use swizzle(expr, SWIZZLE_XXXX, type->vector_elements) instead of
>>>> swizzle_for_size(expr, type->vector_elements). The latter does not
>>>> provide the wanted operation of expanding a scalar.
>>>>
>>>> Expand arguments to bitfield_insert() in ldexp() lowering passes.
>>>>
>>>> src/glsl/builtin_functions.cpp | 8 +++++++-
>>>> src/glsl/ir.h | 1 -
>>>> src/glsl/ir_constant_expression.cpp | 6 +++---
>>>> src/glsl/ir_validate.cpp | 5 +++--
>>>> src/glsl/lower_instructions.cpp | 8 ++++----
>>>> src/glsl/nir/nir_opcodes.py | 4 ++--
>>>> 6 files changed, 19 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
>>>> index 602852a..38383bc 100644
>>>> --- a/src/glsl/builtin_functions.cpp
>>>> +++ b/src/glsl/builtin_functions.cpp
>>>> @@ -4902,13 +4902,19 @@ builtin_builder::_bitfieldExtract(const glsl_type *type)
>>>> ir_function_signature *
>>>> builtin_builder::_bitfieldInsert(const glsl_type *type)
>>>> {
>>>> + bool is_uint = type->base_type == GLSL_TYPE_UINT;
>>>> ir_variable *base = in_var(type, "base");
>>>> ir_variable *insert = in_var(type, "insert");
>>>> ir_variable *offset = in_var(glsl_type::int_type, "offset");
>>>> ir_variable *bits = in_var(glsl_type::int_type, "bits");
>>>> MAKE_SIG(type, gpu_shader5_or_es31, 4, base, insert, offset, bits);
>>>>
>>>> - body.emit(ret(bitfield_insert(base, insert, offset, bits)));
>>>> + operand cast_offset = is_uint ? i2u(offset) : operand(offset);
>>>> + operand cast_bits = is_uint ? i2u(bits) : operand(bits);
>>>> +
>>>> + body.emit(ret(bitfield_insert(base, insert,
>>>> + swizzle(cast_offset, SWIZZLE_XXXX, type->vector_elements),
>>>> + swizzle(cast_bits, SWIZZLE_XXXX, type->vector_elements))));
>>>>
>>>> return sig;
>>>> }
>>>> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
>>>> index a2eb508..9af2fc1 100644
>>>> --- a/src/glsl/ir.h
>>>> +++ b/src/glsl/ir.h
>>>> @@ -1710,7 +1710,6 @@ public:
>>>> operation == ir_triop_vector_insert ||
>>>> operation == ir_quadop_vector ||
>>>> /* TODO: these can't currently be vectorized */
>>>> - operation == ir_quadop_bitfield_insert ||
>>>> operation == ir_triop_bitfield_extract;
>>>> }
>>>>
>>>> diff --git a/src/glsl/ir_constant_expression.cpp b/src/glsl/ir_constant_expression.cpp
>>>> index 38b6dd5..f5b5bd8 100644
>>>> --- a/src/glsl/ir_constant_expression.cpp
>>>> +++ b/src/glsl/ir_constant_expression.cpp
>>>> @@ -1710,10 +1710,10 @@ ir_expression::constant_expression_value(struct hash_table *variable_context)
>>>> }
>>>>
>>>> case ir_quadop_bitfield_insert: {
>>>> - int offset = op[2]->value.i[0];
>>>> - int bits = op[3]->value.i[0];
>>>> -
>>>> for (unsigned c = 0; c < components; c++) {
>>>> + int offset = op[2]->value.i[c];
>>>> + int bits = op[3]->value.i[c];
>>>> +
>>>> if (bits == 0)
>>>> data.u[c] = op[0]->value.u[c];
>>>> else if (offset < 0 || bits < 0)
>>>> diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
>>>> index a4d6182..fea9b76 100644
>>>> --- a/src/glsl/ir_validate.cpp
>>>> +++ b/src/glsl/ir_validate.cpp
>>>> @@ -647,10 +647,11 @@ ir_validate::visit_leave(ir_expression *ir)
>>>> break;
>>>>
>>>> case ir_quadop_bitfield_insert:
>>>> + assert(ir->type->is_integer());
>>>> assert(ir->operands[0]->type == ir->type);
>>>> assert(ir->operands[1]->type == ir->type);
>>>> - assert(ir->operands[2]->type == glsl_type::int_type);
>>>> - assert(ir->operands[3]->type == glsl_type::int_type);
>>>> + assert(ir->operands[2]->type == ir->type);
>>>> + assert(ir->operands[3]->type == ir->type);
>>>
>>> This will break lower_packing_builtins::pack_uvec2_to_uint and the
>>> such. (In other news, I highly encourage you to set
>>> LOWER_PACK_USE_BFI/BFE).
>>
>> I tried setting those flags, and it doesn't break, and I get bfi/bfe
>> instructions in the assembly.
>
> Did you have assertions enabled?
>
> if (op_mask & LOWER_PACK_USE_BFI) {
> return bitfield_insert(bit_and(swizzle_x(u), constant(0xffffu)),
> swizzle_y(u),
> constant(16),
> constant(16));
>
> I don't see how that wouldn't trigger the problem. The constants are
> int while ir->type is uint...
>
>>
>> Can you reproduce the problem you're noting?
>
> The above was from code inspection. But I'll double-check with actual
> compiled code :)
>
> -ilia
Here you go:
(gdb) r
Starting program: /home/ilia/src/piglit/bin/shader_runner
generated_tests/spec/arb_shading_language_packing/execution/built-in-functions/fs-packUnorm4x8.shader_test
-fbo -auto
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
shader_runner: ir_validate.cpp:654: virtual ir_visitor_status
{anonymous}::ir_validate::visit_leave(ir_expression*): Assertion
`ir->operands[2]->type == ir->type' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff55dd3b7 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff55dd3b7 in raise () from /lib64/libc.so.6
#1 0x00007ffff55de74a in abort () from /lib64/libc.so.6
#2 0x00007ffff55d63ed in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff55d64a2 in __assert_fail () from /lib64/libc.so.6
#4 0x00007fffeff82f1c in (anonymous
namespace)::ir_validate::visit_leave (this=<optimized out>,
ir=<optimized out>)
at ir_validate.cpp:654
#5 0x00007fffeff7e32a in ir_expression::accept (this=0x70e9d0,
v=0x7fffffffd960) at ir_hv_accept.cpp:141
#6 0x00007fffeff7e32a in ir_expression::accept (this=0x72be90,
v=0x7fffffffd960) at ir_hv_accept.cpp:141
#7 0x00007fffeff7e696 in ir_assignment::accept (this=0x73a010,
v=0x7fffffffd960) at ir_hv_accept.cpp:304
#8 0x00007fffeff7e0c6 in visit_list_elements
(v=v at entry=0x7fffffffd960, l=l at entry=0x6f9908,
statement_list=statement_list at entry=true) at ir_hv_accept.cpp:55
#9 0x00007fffeff7e227 in ir_function_signature::accept
(this=0x6f98c0, v=0x7fffffffd960) at ir_hv_accept.cpp:115
#10 0x00007fffeff7e0c6 in visit_list_elements
(v=v at entry=0x7fffffffd960, l=l at entry=0x7070a8,
statement_list=statement_list at entry=false) at ir_hv_accept.cpp:55
#11 0x00007fffeff7e29e in ir_function::accept (this=0x707080,
v=0x7fffffffd960) at ir_hv_accept.cpp:127
#12 0x00007fffeff7e0c6 in visit_list_elements
(v=v at entry=0x7fffffffd960, l=l at entry=0x70f440,
statement_list=statement_list at entry=true) at ir_hv_accept.cpp:55
#13 0x00007fffeff7defa in ir_hierarchical_visitor::run
(this=this at entry=0x7fffffffd960,
instructions=instructions at entry=0x70f440) at ir_hierarchical_visitor.cpp:364
#14 0x00007fffeff8421b in validate_ir_tree
(instructions=instructions at entry=0x70f440) at ir_validate.cpp:924
#15 0x00007fffefede348 in st_link_shader (ctx=0x66c2d0, prog=0x6fa9e0)
at state_tracker/st_glsl_to_tgsi.cpp:5923
#16 0x00007fffefef5d66 in _mesa_glsl_link_shader (ctx=0x66c2d0,
prog=0x6fa9e0) at program/ir_to_mesa.cpp:2962
#17 0x00007fffefe1adca in link_program (ctx=0x66c2d0,
program=<optimized out>) at main/shaderapi.c:1047
#18 0x00007ffff7acbdb2 in stub_glLinkProgram (program=3) at
/home/ilia/src/piglit/tests/util/piglit-dispatch-gen.c:32599
#19 0x000000000040776d in link_and_use_shaders () at
/home/ilia/src/piglit/tests/shaders/shader_runner.c:1042
#20 0x000000000040df52 in piglit_init (argc=2, argv=0x7fffffffdcf8)
at /home/ilia/src/piglit/tests/shaders/shader_runner.c:3270
#21 0x00007ffff7b3a75e in run_test (gl_fw=0x616010, argc=2, argv=0x7fffffffdcf8)
at /home/ilia/src/piglit/tests/util/piglit-framework-gl/piglit_fbo_framework.c:50
#22 0x00007ffff7b20ac5 in piglit_gl_test_run (argc=2,
argv=0x7fffffffdcf8, config=0x7fffffffdbb0)
at /home/ilia/src/piglit/tests/util/piglit-framework-gl.c:199
#23 0x0000000000405b60 in main (argc=2, argv=0x7fffffffdcf8) at
/home/ilia/src/piglit/tests/shaders/shader_runner.c:54
(gdb) up
#1 0x00007ffff55de74a in abort () from /lib64/libc.so.6
(gdb)
#2 0x00007ffff55d63ed in __assert_fail_base () from /lib64/libc.so.6
(gdb)
#3 0x00007ffff55d64a2 in __assert_fail () from /lib64/libc.so.6
(gdb)
#4 0x00007fffeff82f1c in (anonymous
namespace)::ir_validate::visit_leave (this=<optimized out>,
ir=<optimized out>)
at ir_validate.cpp:654
654 assert(ir->operands[2]->type == ir->type);
(gdb) l
649
650 case ir_quadop_bitfield_insert:
651 assert(ir->type->is_integer());
652 assert(ir->operands[0]->type == ir->type);
653 assert(ir->operands[1]->type == ir->type);
654 assert(ir->operands[2]->type == ir->type);
655 assert(ir->operands[3]->type == ir->type);
656 break;
657
658 case ir_quadop_vector:
More information about the mesa-dev
mailing list