[Mesa-dev] [PATCH 05/14] glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields

Mon Oct 17 22:17:29 UTC 2016

On Mon, Oct 17, 2016 at 11:54 PM, Dave Airlie <airlied at gmail.com> wrote:
> On 18 October 2016 at 05:23, Marek Olšák <maraeo at gmail.com> wrote:
>> On Mon, Oct 17, 2016 at 4:44 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>>> Am 17.10.2016 um 15:39 schrieb Marek Olšák:
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> sizeof(glsl_to_tgsi_instruction): 464 -> 416
>>>> ---
>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 33 +++++++++++++++---------------
>>>>  1 file changed, 16 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index 78d9409..b3654fe 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -263,42 +263,41 @@ st_dst_reg::st_dst_reg(st_src_reg reg)
>>>>     this->index2D = reg.index2D;
>>>>     this->reladdr2 = reg.reladdr2;
>>>>     this->has_index2 = reg.has_index2;
>>>>     this->array_id = reg.array_id;
>>>>  }
>>>>
>>>>  class glsl_to_tgsi_instruction : public exec_node {
>>>>  public:
>>>>     DECLARE_RALLOC_CXX_OPERATORS(glsl_to_tgsi_instruction)
>>>>
>>>> -   unsigned op;
>>>>     st_dst_reg dst[2];
>>>>     st_src_reg src[4];
>>>> -   /** Pointer to the ir source this tree came from for debugging */
>>>> -   ir_instruction *ir;
>>>> -   GLboolean cond_update;
>>>> -   bool saturate;
>>>> -   bool is_64bit_expanded;
>>>>     st_src_reg sampler; /**< sampler register */
>>>> -   int sampler_base;
>>>> -   int sampler_array_size; /**< 1-based size of sampler array, 1 if not array */
>>>> -   int tex_target; /**< One of TEXTURE_*_INDEX */
>>>> -   glsl_base_type tex_type;
>>>> -   GLboolean tex_shadow;
>>>> -   unsigned image_format;
>>>> -
>>>>     st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET];
>>>> -   unsigned tex_offset_num_offset;
>>>> -   int dead_mask; /**< Used in dead code elimination */
>>>> -
>>>>     st_src_reg buffer; /**< buffer register */
>>>> -   unsigned buffer_access; /**< buffer access type */
>>>> +
>>>> +   /** Pointer to the ir source this tree came from for debugging */
>>>> +   ir_instruction *ir;
>>>> +
>>>> +   unsigned op:8; /**< TGSI opcode */
>>> Maybe should throw in some static assert somewhere that TGSI_OPCODE_LAST
>>> is <= 255.
>>> Given how close we're to the limit I wouldn't quite bet on it staying 8
>>> bits forever (though of course it would need some changes elsewhere too).
>>
>> I'm adding this:
>>
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index c5cd382..4ad5e2c 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -663,6 +663,9 @@ glsl_to_tgsi_visitor::emit_asm(ir_instruction *ir, unsigned op,
>>     }
>>     assert(num_reladdr == 0);
>>
>> +   /* inst->op has only 8 bits. */
>> +   STATIC_ASSERT(TGSI_OPCODE_LAST <= 255);
>> +
>>     inst->op = op;
>>     inst->info = tgsi_get_opcode_info(op);
>>     inst->dst[0] = dst;
>
> Just curious, does it make the object size much bigger? My guess
> is this would increase CPU usage in this area, which may be the opposite
> of what you want.

The motivation for the series was the high malloc call count for
glsl_to_tgsi_instruction, which I observed while working on the
GLSL code. In theory, shrinking that structure should reduce malloc
overhead and improve cache utilization.
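
For readers who have not seen the technique, here is a minimal sketch of
how bitfields shrink a structure and how a static assert guards the
8-bit opcode field. Everything in it is a hypothetical stand-in: the
struct names, EXAMPLE_OPCODE_LAST, make_inst() and all field widths
other than op:8 are assumptions for illustration, not the real Mesa
definitions.

```cpp
#include <cassert>
#include <cstddef>

/* Hypothetical stand-in for TGSI_OPCODE_LAST; the real value lives in Mesa. */
enum { EXAMPLE_OPCODE_LAST = 250 };

/* A few of the small fields stored as full-width members. */
struct inst_unpacked {
   unsigned op;
   bool saturate;
   bool is_64bit_expanded;
   int sampler_base;
   int sampler_array_size;
   int tex_target;
   unsigned dead_mask;
};

/* The same fields as bitfields. Widths other than op:8 are illustrative. */
struct inst_packed {
   unsigned op:8;
   unsigned saturate:1;
   unsigned is_64bit_expanded:1;
   unsigned sampler_base:5;
   unsigned sampler_array_size:6;
   unsigned tex_target:4;
   unsigned dead_mask:4;
};

/* Fails at compile time if the largest opcode no longer fits in 8 bits. */
static_assert(EXAMPLE_OPCODE_LAST <= 255, "opcode must fit in op:8");

/* Hypothetical helper: build a packed instruction from an opcode. */
inline inst_packed make_inst(unsigned op)
{
   assert(op <= EXAMPLE_OPCODE_LAST);
   inst_packed inst = {};
   inst.op = op;
   return inst;
}
```

Bitfield layout is implementation-defined, so the exact saving depends
on the compiler and ABI. On common ones the packed fields share a single
unsigned, which is where the reduction comes from.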

Results with shader-db are below. I ran each measurement twice to
confirm that the results are reproducible, and they are. Only the
first set of results is shown here.

Before:
real    0m58.046s
user    3m48.464s
sys    0m0.652s

After:
real    0m56.709s
user    3m43.296s
sys    0m0.604s
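
To put the numbers above in relative terms, the following sketch
converts the real and user readings to seconds and computes the
percentage reduction. The helper names are made up for this example.
Both figures come out at roughly 2.3%.

```cpp
#include <cassert>
#include <cstdio>

/* Convert a `time` reading of the form XmY.YYYs into seconds. */
static inline double to_seconds(int minutes, double seconds)
{
   return minutes * 60.0 + seconds;
}

/* Relative reduction from `before` to `after`, in percent. */
static inline double reduction_percent(double before, double after)
{
   assert(before > 0.0);
   return (before - after) / before * 100.0;
}

/* Print the relative change for the shader-db timings quoted above. */
static inline void print_shader_db_summary()
{
   const double real_pct = reduction_percent(to_seconds(0, 58.046),
                                             to_seconds(0, 56.709));
   const double user_pct = reduction_percent(to_seconds(3, 48.464),
                                             to_seconds(3, 43.296));

   std::printf("real: %.1f%% less time\n", real_pct);
   std::printf("user: %.1f%% less time\n", user_pct);
}
```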

Marek