[Mesa-dev] [PATCH] glsl_to_tgsi: indirect array information
Vadim Girlin
vadimgirlin at gmail.com
Tue Jan 22 16:21:16 PST 2013
On 01/23/2013 03:59 AM, Vincent Lejeune wrote:
>
>
> ----- Mail original -----
>> De : Vadim Girlin <vadimgirlin at gmail.com>
>> À : Christoph Bumiller <e0425955 at student.tuwien.ac.at>
>> Cc : mesa-dev at lists.freedesktop.org
>> Envoyé le : Mercredi 23 janvier 2013 0h44
>> Objet : Re: [Mesa-dev] [PATCH] glsl_to_tgsi: indirect array information
>>
>> On 01/22/2013 10:59 PM, Christoph Bumiller wrote:
>>> On 21.01.2013 21:10, Vadim Girlin wrote:
>>>> Provide the information about indirectly addressable arrays (ranges of
>> temps) in
>>>> the shader to the drivers. TGSI representation itself isn't
>> modified, array
>>>> information is passed as an additional data in the pipe_shader_state,
>> so the
>>>> drivers can use it as a hint for optimization.
>>>> ---
>>>>
>>>> It's far from being an ideal solution, but I saw the discussions
>> about that
>>>> problem starting from 2009 IIRC, and we still have no solution (neither
>> good
>>>> nor bad) despite the years passed. I hope we can use this not very
>> intrusive
>>>> approach until we get something better.
>>>>
>>>
>>> I'd rather not have any hacks in the interface, let alone ones that
>>> solve the problem only partially (you still won't know which array is
>>> accessed by a particular instruction, which is important for
>>> optimization and essential in some cases for making INPUT/OUTPUT arrays
>>> work), and not just because it reduces the pressure on people to
>>> implement a proper solution.
>>>
>>> With this, you just get to know which range of TEMPs are indirectly
>>> addressed and which ones are not, and you can do the same by simply
>>> creating multiple declarations of TEMPs, one for each array, and adding
>>> a single bit of info to tgsi_declaration (which has 7 bits of padding
>>> anyway, so ample space), which is a lot less ugly, and doesn't suffer
>>> from an arbitrary limit, and doesn't require any modification of
>> drivers
>>> either.
>>>
>>
>> Array accessed by any indirect operand can be identified by the
>> immediate offset, e.g. TEMP[ADDR[0].x+1] implies the array starting from
>> 1, thus we can find it's entry in the information provided by this patch
>> to get the addressable range for every indirect operand. If I'm not
>> missing something, glsl_to_tgsi accumulates all other parts of the
>> offset in the address register before the indirect access. If I'm wrong,
>> we can fix it to ensure such behavior.
>
> I'm not sure about that ; when I worked on indirect addressing of const memory,
> I discovered when tracking vp/fo regression that the immediate offset is the result of
> glsl_to_tgsi constant propagation and not related to the underlying array.
> This means that the dynamic index can be negative, which is not always
> desirable depending on the hw. (In R600 case, const fetch instruction does not
> support negative index. MOVA inst does).
>
> For instance, the following pseudo code snippet is fine for an index value of -4 :
>
> uniform int index;
>
> float array[4];
> float data = array[6 + index];
>
> and is lowered to
> MOV TEMP[0] TEMP[ADDR[0].x + 6];
>
I tried the following shader:
> uniform int index;
>
> void main()
> {
> float array[4] = float[4](0.1, 0.2, 0.3, 0.4);
> float data = array[6 + index];
> gl_FragColor = vec4(data, 1.0, 0.0, 1.0);
> }
Resulting TGSI:
> --------------------------------------------------------------
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL OUT[0], COLOR
> DCL CONST[0]
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> DCL TEMP[2], LOCAL
> DCL TEMP[3], LOCAL
> DCL TEMP[4], LOCAL
> DCL TEMP[5], LOCAL
> DCL TEMP[6], LOCAL
> DCL TEMP[7], LOCAL
> DCL ADDR[0]
> IMM[0] FLT32 { 0.1000, 0.2000, 0.3000, 0.4000}
> IMM[1] FLT32 { 1.0000, 0.0000, 0.0000, 0.0000}
> IMM[2] INT32 {6, 0, 0, 0}
> 0: MOV TEMP[1].yzw, IMM[1].yxyx
> 1: MOV TEMP[2], IMM[0].xxxx
> 2: MOV TEMP[3], IMM[0].yyyy
> 3: MOV TEMP[4], IMM[0].zzzz
> 4: MOV TEMP[5], IMM[0].wwww
> 5: UADD TEMP[6].x, IMM[2].xxxx, CONST[0].xxxx
> 6: UARL ADDR[0].x, TEMP[6].xxxx
> 7: MOV TEMP[1].x, TEMP[ADDR[0].x+2].xxxx
> 8: MOV_SAT OUT[0], TEMP[1]
> 9: END
> --------------------------------------------------------------
Also I tried the following:
> uniform float array[4];
> uniform int index;
>
> void main()
> {
> float data = array[6 + index];
> gl_FragColor = vec4(data, 1.0, 0.0, 1.0);
> }
Resulting TGSI:
> --------------------------------------------------------------
> FRAG
> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
> DCL OUT[0], COLOR
> DCL CONST[0..4]
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> DCL ADDR[0]
> IMM[0] FLT32 { 1.0000, 0.0000, 0.0000, 0.0000}
> IMM[1] INT32 {6, 0, 0, 0}
> 0: MOV TEMP[0].yzw, IMM[0].yxyx
> 1: UADD TEMP[1].x, IMM[1].xxxx, CONST[0].xxxx
> 2: UARL ADDR[0].x, TEMP[1].xxxx
> 3: MOV TEMP[0].x, CONST[ADDR[0].x+1].xxxx
> 4: MOV_SAT OUT[0], TEMP[0]
> 5: END
> --------------------------------------------------------------
So far immediate offset in the indirect operand is always equal to the
start offset of the array. Could you provide some more complete example
that demonstrates the problem, please.
Vadim
> I didn't test your patch atm, but I think you may have to fix glsl_to_tgsi.
> Otherwise I'm in favor of implementing something not optimal but far
> better that what we have currently.
>
> Vincent
>
>>
>> I'll be perfectly OK with any other solution, as long as it's a really
>> working (already implemented) solution that I can use today, not just
>> some abstract ideas in the discussions. This patch isn't perfect and can
>> be improved, but it already works for me. I'll be very happy to use any
>> other solution from you or anyone else.
>>
>> Vadim
>>
>>>> src/gallium/auxiliary/tgsi/tgsi_ureg.c | 2 ++
>>>> src/gallium/include/pipe/p_state.h | 13 +++++++
>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 57
>> +++++++++++++++++++++++++-----
>>>> src/mesa/state_tracker/st_program.c | 4 +++
>>>> src/mesa/state_tracker/st_program.h | 1 +
>>>> 5 files changed, 69 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
>> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
>>>> index 3c2a923..61db431 100644
>>>> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
>>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
>>>> @@ -1658,6 +1658,8 @@ void *ureg_create_shader( struct ureg_program
>> *ureg,
>>>> else
>>>> memset(&state.stream_output, 0,
>> sizeof(state.stream_output));
>>>>
>>>> + memset(&state.array_info, 0, sizeof(state.array_info));
>>>> +
>>>> if (ureg->processor == TGSI_PROCESSOR_VERTEX)
>>>> return pipe->create_vs_state( pipe, &state );
>>>> else
>>>> diff --git a/src/gallium/include/pipe/p_state.h
>> b/src/gallium/include/pipe/p_state.h
>>>> index ab49cab..4490f2e 100644
>>>> --- a/src/gallium/include/pipe/p_state.h
>>>> +++ b/src/gallium/include/pipe/p_state.h
>>>> @@ -65,6 +65,8 @@ extern "C" {
>>>> #define PIPE_MAX_TEXTURE_LEVELS 16
>>>> #define PIPE_MAX_SO_BUFFERS 4
>>>>
>>>> +#define PIPE_MAX_INDIRECT_ARRAYS 16
>>>> +
>>>>
>>>> struct pipe_reference
>>>> {
>>>> @@ -205,11 +207,22 @@ struct pipe_stream_output_info
>>>> } output[PIPE_MAX_SHADER_OUTPUTS];
>>>> };
>>>>
>>>> +struct pipe_shader_indirect_array {
>>>> + unsigned index:16;
>>>> + unsigned size:16;
>>>> +};
>>>> +
>>>> +struct pipe_shader_array_info
>>>> +{
>>>> + struct pipe_shader_indirect_array arrays[PIPE_MAX_INDIRECT_ARRAYS];
>>>> + unsigned num_arrays;
>>>> +};
>>>>
>>>> struct pipe_shader_state
>>>> {
>>>> const struct tgsi_token *tokens;
>>>> struct pipe_stream_output_info stream_output;
>>>> + struct pipe_shader_array_info array_info;
>>>> };
>>>>
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index 1d96e90..4ded1be 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -110,6 +110,7 @@ public:
>>>> this->index2D = 0;
>>>> this->type = type ? type->base_type : GLSL_TYPE_ERROR;
>>>> this->reladdr = NULL;
>>>> + this->size = 0;
>>>> }
>>>>
>>>> st_src_reg(gl_register_file file, int index, int type)
>>>> @@ -121,6 +122,7 @@ public:
>>>> this->swizzle = SWIZZLE_XYZW;
>>>> this->negate = 0;
>>>> this->reladdr = NULL;
>>>> + this->size = 0;
>>>> }
>>>>
>>>> st_src_reg(gl_register_file file, int index, int type, int
>> index2D)
>>>> @@ -132,6 +134,7 @@ public:
>>>> this->swizzle = SWIZZLE_XYZW;
>>>> this->negate = 0;
>>>> this->reladdr = NULL;
>>>> + this->size = 0;
>>>> }
>>>>
>>>> st_src_reg()
>>>> @@ -143,6 +146,7 @@ public:
>>>> this->swizzle = 0;
>>>> this->negate = 0;
>>>> this->reladdr = NULL;
>>>> + this->size = 0;
>>>> }
>>>>
>>>> explicit st_src_reg(st_dst_reg reg);
>>>> @@ -155,6 +159,7 @@ public:
>>>> int type; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
>>>> /** Register index should be offset by the integer in this reg. */
>>>> st_src_reg *reladdr;
>>>> + int size;
>>>> };
>>>>
>>>> class st_dst_reg {
>>>> @@ -244,8 +249,9 @@ public:
>>>>
>>>> class variable_storage : public exec_node {
>>>> public:
>>>> - variable_storage(ir_variable *var, gl_register_file file, int
>> index)
>>>> - : file(file), index(index), var(var)
>>>> + variable_storage(ir_variable *var, gl_register_file file, int
>> index,
>>>> + int size = 0)
>>>> + : file(file), index(index), var(var), size(size)
>>>> {
>>>> /* empty */
>>>> }
>>>> @@ -253,6 +259,7 @@ public:
>>>> gl_register_file file;
>>>> int index;
>>>> ir_variable *var; /* variable that maps to this, if any */
>>>> + int size;
>>>> };
>>>>
>>>> class immediate_storage : public exec_node {
>>>> @@ -312,6 +319,7 @@ public:
>>>> struct gl_program *prog;
>>>> struct gl_shader_program *shader_program;
>>>> struct gl_shader_compiler_options *options;
>>>> + struct pipe_shader_array_info array_info;
>>>>
>>>> int next_temp;
>>>>
>>>> @@ -445,6 +453,8 @@ public:
>>>> void emit_block_mov(ir_assignment *ir, const struct glsl_type
>> *type,
>>>> st_dst_reg *l, st_src_reg *r);
>>>>
>>>> + void add_array_info(const st_src_reg *src);
>>>> +
>>>> void *mem_ctx;
>>>> };
>>>>
>>>> @@ -1004,7 +1014,8 @@ glsl_to_tgsi_visitor::get_temp(const glsl_type
>> *type)
>>>> src.file = PROGRAM_TEMPORARY;
>>>> src.index = next_temp;
>>>> src.reladdr = NULL;
>>>> - next_temp += type_size(type);
>>>> + src.size = type_size(type);
>>>> + next_temp += src.size;
>>>>
>>>> if (type->is_array() || type->is_record()) {
>>>> src.swizzle = SWIZZLE_NOOP;
>>>> @@ -1075,9 +1086,10 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
>>>> assert((int) ir->num_state_slots ==
>> type_size(ir->type));
>>>>
>>>> storage = new(mem_ctx) variable_storage(ir,
>> PROGRAM_TEMPORARY,
>>>> - this->next_temp);
>>>> + this->next_temp,
>>>> +
>> type_size(ir->type));
>>>> this->variables.push_tail(storage);
>>>> - this->next_temp += type_size(ir->type);
>>>> + this->next_temp += storage->size;
>>>>
>>>> dst = st_dst_reg(st_src_reg(PROGRAM_TEMPORARY,
>> storage->index,
>>>> native_integers ? ir->type->base_type :
>> GLSL_TYPE_FLOAT));
>>>> @@ -2029,10 +2041,10 @@
>> glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
>>>> case ir_var_auto:
>>>> case ir_var_temporary:
>>>> entry = new(mem_ctx) variable_storage(var,
>> PROGRAM_TEMPORARY,
>>>> - this->next_temp);
>>>> + this->next_temp,
>>>> +
>> type_size(var->type));
>>>> this->variables.push_tail(entry);
>>>> -
>>>> - next_temp += type_size(var->type);
>>>> + next_temp += entry->size;
>>>> break;
>>>> }
>>>>
>>>> @@ -2043,6 +2055,8 @@
>> glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
>>>> }
>>>>
>>>> this->result = st_src_reg(entry->file, entry->index,
>> var->type);
>>>> + this->result.size = entry->size;
>>>> +
>>>> if (!native_integers)
>>>> this->result.type = GLSL_TYPE_FLOAT;
>>>> }
>>>> @@ -2062,12 +2076,16 @@
>> glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
>>>> if (index) {
>>>> src.index += index->value.i[0] * element_size;
>>>> } else {
>>>> +
>>>> /* Variable index array dereference. It eats the
>> "vec4" of the
>>>> * base of the array and an index that offsets the TGSI
>> register
>>>> * index.
>>>> */
>>>> ir->array_index->accept(this);
>>>>
>>>> + if (src.size > 1 && src.file == PROGRAM_TEMPORARY)
>>>> + add_array_info(&src);
>>>> +
>>>> st_src_reg index_reg;
>>>>
>>>> if (element_size == 1) {
>>>> @@ -2969,6 +2987,7 @@ glsl_to_tgsi_visitor::glsl_to_tgsi_visitor()
>>>> prog = NULL;
>>>> shader_program = NULL;
>>>> options = NULL;
>>>> + array_info.num_arrays = 0;
>>>> }
>>>>
>>>> glsl_to_tgsi_visitor::~glsl_to_tgsi_visitor()
>>>> @@ -3771,6 +3790,23 @@ glsl_to_tgsi_visitor::renumber_registers(void)
>>>> this->next_temp = new_index;
>>>> }
>>>>
>>>> +void
>>>> +glsl_to_tgsi_visitor::add_array_info(const st_src_reg *src)
>>>> +{
>>>> + unsigned i;
>>>> +
>>>> + if (array_info.num_arrays == PIPE_MAX_INDIRECT_ARRAYS)
>>>> + return;
>>>> +
>>>> + for (i = 0; i < array_info.num_arrays; ++i) {
>>>> + if (array_info.arrays[i].index == (unsigned)src->index)
>>>> + return;
>>>> + }
>>>> + array_info.arrays[i].index = src->index;
>>>> + array_info.arrays[i].size = src->size;
>>>> + ++array_info.num_arrays;
>>>> +}
>>>> +
>>>> /**
>>>> * Returns a fragment program which implements the current pixel
>> transfer ops.
>>>> * Based on get_pixel_transfer_program in st_atom_pixeltransfer.c.
>>>> @@ -4798,6 +4834,11 @@ st_translate_program(
>>>> }
>>>>
>>>> if (program->indirect_addr_temps) {
>>>> + struct pipe_shader_array_info *ai =
>>>> + &((struct st_fragment_program*) proginfo)->array_info;
>>>> +
>>>> + memcpy(ai, &program->array_info, sizeof(struct
>> pipe_shader_array_info));
>>>> +
>>>> /* If temps are accessed with indirect addressing, declare
>> temporaries
>>>> * in sequential order. Else, we declare them on demand
>> elsewhere.
>>>> * (Note: the number of temporaries is equal to
>> program->next_temp)
>>>> diff --git a/src/mesa/state_tracker/st_program.c
>> b/src/mesa/state_tracker/st_program.c
>>>> index a9111b5..1370c76 100644
>>>> --- a/src/mesa/state_tracker/st_program.c
>>>> +++ b/src/mesa/state_tracker/st_program.c
>>>> @@ -739,6 +739,10 @@ st_translate_fragment_program(struct st_context
>> *st,
>>>> fs_output_semantic_index, FALSE,
>>>> key->clamp_color);
>>>>
>>>> + if (stfp->glsl_to_tgsi)
>>>> + memcpy(&variant->tgsi.array_info,
>> &stfp->array_info,
>>>> + sizeof(struct pipe_shader_array_info));
>>>> +
>>>> variant->tgsi.tokens = ureg_get_tokens( ureg, NULL );
>>>> ureg_destroy( ureg );
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_program.h
>> b/src/mesa/state_tracker/st_program.h
>>>> index 23a262c..d6b2b0f 100644
>>>> --- a/src/mesa/state_tracker/st_program.h
>>>> +++ b/src/mesa/state_tracker/st_program.h
>>>> @@ -90,6 +90,7 @@ struct st_fragment_program
>>>> {
>>>> struct gl_fragment_program Base;
>>>> struct glsl_to_tgsi_visitor* glsl_to_tgsi;
>>>> + struct pipe_shader_array_info array_info;
>>>>
>>>> struct st_fp_variant *variants;
>>>> };
>>>>
>>>
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
More information about the mesa-dev
mailing list