[Mesa-dev] [PATCH] glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand

Roland Scheidegger sroland at vmware.com
Mon Aug 25 17:56:32 PDT 2014


Am 26.08.2014 01:58, schrieb Ian Romanick:
> On 08/24/2014 02:51 PM, Marek Olšák wrote:
>> On Sun, Aug 24, 2014 at 6:15 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>>> On Sun, Aug 24, 2014 at 7:44 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>
>>>> This fixes crashes if the number of temporaries is greater than 4096.
>>>>
>>>> Bugzilla: https://urldefense.proofpoint.com/v1/url?u=https://bugs.freedesktop.org/show_bug.cgi?id%3D66184&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0A&m=gfEn35xf0kjjI8n9WaXhKdVQTE32gR5jrqw6zN8dRdc%3D%0A&s=612b9875acfe89387a48f7234f1c43749cfb9b838f3d66f750cf0058335f3423
>>>> Cc: 10.2 10.3 mesa-stable at lists.freedesktop.org
>>>> ---
>>>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 45 ++++++++++++++++++------------
>>>>  1 file changed, 27 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> index 84bdc4f..9beecf8 100644
>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>>>> @@ -75,14 +75,6 @@ extern "C" {
>>>>                             (1 << PROGRAM_UNIFORM))
>>>>
>>>>  /**
>>>> - * Maximum number of temporary registers.
>>>> - *
>>>> - * It is too big for stack allocated arrays -- it will cause stack overflow on
>>>> - * Windows and likely Mac OS X.
>>>> - */
>>>> -#define MAX_TEMPS         4096
>>>> -
>>>> -/**
>>>>   * Maximum number of arrays
>>>>   */
>>>>  #define MAX_ARRAYS        256
>>>> @@ -3301,14 +3293,10 @@ get_src_arg_mask(st_dst_reg dst, st_src_reg src)
>>>>  void
>>>>  glsl_to_tgsi_visitor::simplify_cmp(void)
>>>>  {
>>>> -   unsigned *tempWrites;
>>>> +   int tempWritesSize = 0;
>>>> +   unsigned *tempWrites = NULL;
>>>>     unsigned outputWrites[MAX_PROGRAM_OUTPUTS];
>>>>
>>>> -   tempWrites = new unsigned[MAX_TEMPS];
>>>> -   if (!tempWrites) {
>>>> -      return;
>>>> -   }
>>>> -   memset(tempWrites, 0, sizeof(unsigned) * MAX_TEMPS);
>>>>     memset(outputWrites, 0, sizeof(outputWrites));
>>>>
>>>>     foreach_in_list(glsl_to_tgsi_instruction, inst, &this->instructions) {
>>>> @@ -3330,7 +3318,16 @@ glsl_to_tgsi_visitor::simplify_cmp(void)
>>>>           prevWriteMask = outputWrites[inst->dst.index];
>>>>           outputWrites[inst->dst.index] |= inst->dst.writemask;
>>>>        } else if (inst->dst.file == PROGRAM_TEMPORARY) {
>>>> -         assert(inst->dst.index < MAX_TEMPS);
>>>> +         if (inst->dst.index >= tempWritesSize) {
>>>> +            const int inc = 4096;
>>>> +
>>>> +            tempWrites = (unsigned*)
>>>> +                         realloc(tempWrites,
>>>> +                                 (tempWritesSize + inc) * sizeof(unsigned));
>>>
>>> This can fail, the old code dealt with new returning NULL. Not sure
>>> what the general policy for that is in mesa, but I have seen a lot of
>>> allocation checks. Probably makes sense here too.
>>>
>>>> +            memset(tempWrites + tempWritesSize, 0, inc * sizeof(unsigned));
>>>> +            tempWritesSize += inc;
>>>> +         }
>>>> +
>>>>           prevWriteMask = tempWrites[inst->dst.index];
>>>>           tempWrites[inst->dst.index] |= inst->dst.writemask;
>>>>        } else
>>>> @@ -3349,7 +3346,7 @@ glsl_to_tgsi_visitor::simplify_cmp(void)
>>>>        }
>>>>     }
>>>>
>>>> -   delete [] tempWrites;
>>>> +   free(tempWrites);
>>>>  }
>>>>
>>>>  /* Replaces all references to a temporary register index with another index. */
>>>> @@ -4158,7 +4155,9 @@ struct label {
>>>>  struct st_translate {
>>>>     struct ureg_program *ureg;
>>>>
>>>> -   struct ureg_dst temps[MAX_TEMPS];
>>>> +   unsigned temps_size;
>>>> +   struct ureg_dst *temps;
>>>> +
>>>>     struct ureg_dst arrays[MAX_ARRAYS];
>>>>     struct ureg_src *constants;
>>>>     struct ureg_src *immediates;
>>>> @@ -4299,7 +4298,16 @@ dst_register(struct st_translate *t,
>>>>        return ureg_dst_undef();
>>>>
>>>>     case PROGRAM_TEMPORARY:
>>>> -      assert(index < Elements(t->temps));
>>>> +      /* Allocate space for temporaries on demand. */
>>>> +      if (index >= t->temps_size) {
>>>> +         const int inc = 4096;
>>>> +
>>>> +         t->temps = (struct ureg_dst*)
>>>> +                    realloc(t->temps,
>>>> +                            (t->temps_size + inc) * sizeof(struct ureg_dst));
>>>
>>> Check alloc?
>>>
>>> With that fixed,
>>>
>>> Reviewed-by: Ilia Mirkin <imirkin at alum.mit.edu>
>>>
>>> BTW, should this in any way be related to PIPE_SHADER_CAP_MAX_TEMPS?
>>> Looking at what drivers set this to, it's fairly small values. e.g.
>>> nvc0 returns 128. I guess there's limited spilling space. OTOH, it can
>>> do a *lot* better than TGSI at storing stuff due to optimizations,
>>> register reuse, etc.
>>
>> I think the limit only matters for ARB program queries. It should be
>> ignored by GLSL, because most drivers do register allocation anyway.
> 
> Correct.  The old ARB program limits existed
> (GL_MAX_PROGRAM_TEMPORARIES_ARB vs.
> GL_MAX_PROGRAM_NATIVE_TEMPORARIES_ARB) so that people could write simple
> parsers with statically allocated arrays.  For GLSL, the limits are
> irrelevant.

Just because the limits are irrelevant for glsl doesn't mean they are
necessarily so for gallium.
It looks though like _most_ drivers report some native limit but they
don't care if you use more temps. At least softpipe though _will_ hit
asserts if you give it more (llvmpipe used to do that too but should no
longer). Though I guess that this should be classified as a bug then.
FWIW tgsi also has a hard limit for temps itself due to encoding but
it's luckily higher (the src/dst regs have 16 index bits, and if you use
temp arrays that's per array).

Roland



More information about the mesa-dev mailing list