[Mesa-dev] [PATCH] gallivm: fix improper clamping of vertex index when fetching gs inputs

Roland Scheidegger sroland at vmware.com
Thu Nov 8 16:48:44 UTC 2018


Am 08.11.18 um 14:59 schrieb Jose Fonseca:
> Good find.
> 
> On 08/11/2018 01:54, sroland at vmware.com wrote:
>> From: Roland Scheidegger <sroland at vmware.com>
>>
>> Because we only have one file_max for the (2d) gs input file, the value
>> actually represents the max of attrib and vertex index (although I'm
>> not entirely sure if we really want the max, since the max valid value
>> of the vertex dimension can be easily deduced from the input primitive).
> 
> Yes, perhaps we should have a 2nd file_max array for the 2nd axis.   But
> it would be a more invasive change.
If just for gs inputs, I can't see why we'd need it, since it can easily
be deduced from the input prim (we also have file_count, although that
seems just plain broken to me for gs inputs, as well as file_mask, which
also mixes vertex / attrib indexes, and looks useless in that form to me
as well).

> 
>> Thus in cases where the number of inputs is higher than the number of
>> vertices per prim, we did not properly clamp the vertex index, which
>> would result in out-of-bound fetches, potentially causing segfaults
>> (the segfaults seemed actually difficult to trigger, but valgrind
>> certainly wasn't happy). This might have happened even if the shader
>> did not actually try to fetch bogus vertices, if the fetching happened
>> in non-active conditional clauses.
>>
>> To fix simply use the correct max vertex index value (derived from
>> the input prim type) instead when clamping for this case.
>> ---
>>   .../auxiliary/gallivm/lp_bld_tgsi_soa.c       | 38 ++++++++++++++-----
>>   1 file changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> index 83d7dbea9a..0db81b31ad 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
>> @@ -41,6 +41,7 @@
>>   #include "util/u_debug.h"
>>   #include "util/u_math.h"
>>   #include "util/u_memory.h"
>> +#include "util/u_prim.h"
>>   #include "tgsi/tgsi_dump.h"
>>   #include "tgsi/tgsi_exec.h"
>>   #include "tgsi/tgsi_info.h"
>> @@ -1059,7 +1060,8 @@ emit_mask_scatter(struct
>> lp_build_tgsi_soa_context *bld,
>>   static LLVMValueRef
>>   get_indirect_index(struct lp_build_tgsi_soa_context *bld,
>>                      unsigned reg_file, unsigned reg_index,
>> -                   const struct tgsi_ind_register *indirect_reg)
>> +                   const struct tgsi_ind_register *indirect_reg,
>> +                   unsigned index_limit)
> 
> file_max array is signed, so let's make index_limit also signed, and
> assert here in this function that it is positive.  This would trap if
> some caller gets the limit for the wrong file.
> 
> Which reminds me -- SM4 allows to declare constant buffers with
> undeclared size.  What exactly would we get passed here in those
> circunstances?  In practice constant buffers can't be larger than 64KB,
> which means we could presume filemax is 4K registers.
For constant reg file, the clamping is skipped in this function here.
emit_fetch_constant does its own clamping, based on the actual size of
the bound buffers. (Although note the clamping is only done for indirect
constant buffer fetches, not for direct ones, which may not be safe.
Maybe should fix that as well? I think it was omitted for performance
reasons, and because unlike for indirect fetches the shader would be
somewhat bogus, but I guess it would still result in out-of-bound fetches.)


> 
>>   {
>>      LLVMBuilderRef builder = bld->bld_base.base.gallivm->builder;
>>      struct lp_build_context *uint_bld = &bld->bld_base.uint_bld;
>> @@ -1107,8 +1109,7 @@ get_indirect_index(struct
>> lp_build_tgsi_soa_context *bld,
>>       */
>>      if (reg_file != TGSI_FILE_CONSTANT) {
>>         max_index = lp_build_const_int_vec(bld->bld_base.base.gallivm,
>> -                                         uint_bld->type,
>> -                                        
>> bld->bld_base.info->file_max[reg_file]);
>> +                                         uint_bld->type, index_limit);
>>           assert(!uint_bld->type.sign);
>>         index = lp_build_min(uint_bld, index, max_index);
>> @@ -1224,7 +1225,8 @@ emit_fetch_constant(
>>         indirect_index = get_indirect_index(bld,
>>                                             reg->Register.File,
>>                                             reg->Register.Index,
>> -                                          &reg->Indirect);
>> +                                          &reg->Indirect,
>> +                                         
>> bld->bld_base.info->file_max[reg->Register.File]);
>>           /* All fetches are from the same constant buffer, so
>>          * we need to propagate the size to a vector to do a
>> @@ -1341,7 +1343,8 @@ emit_fetch_immediate(
>>            indirect_index = get_indirect_index(bld,
>>                                                reg->Register.File,
>>                                                reg->Register.Index,
>> -                                             &reg->Indirect);
>> +                                             &reg->Indirect,
>> +                                            
>> bld->bld_base.info->file_max[reg->Register.File]);
>>            /*
>>             * Unlike for other reg classes, adding pixel offsets is
>> unnecessary -
>>             * immediates are stored as full vectors (FIXME??? - might
>> be better
>> @@ -1414,7 +1417,8 @@ emit_fetch_input(
>>         indirect_index = get_indirect_index(bld,
>>                                             reg->Register.File,
>>                                             reg->Register.Index,
>> -                                          &reg->Indirect);
>> +                                          &reg->Indirect,
>> +                                         
>> bld->bld_base.info->file_max[reg->Register.File]);
>>           index_vec = get_soa_array_offsets(&bld_base->uint_bld,
>>                                           indirect_index,
>> @@ -1502,7 +1506,15 @@ emit_fetch_gs_input(
>>         attrib_index = get_indirect_index(bld,
>>                                           reg->Register.File,
>>                                           reg->Register.Index,
>> -                                        &reg->Indirect);
>> +                                        &reg->Indirect,
>> +   /*
>> +    * XXX: this is possibly not quite the right value, since file_max
>> may be
>> +    * larger than the max attrib index, due to it being the max of
>> declared
>> +    * inputs AND the max vertices per prim (which is 6 for tri adj).
>> +    * It should however be safe to use (since we always allocate
>> +    * PIPE_MAX_SHADER_INPUTS (80) for it, which is overallocated
>> quite a bit).
>> +    */
>> +                                       
>> info->file_max[reg->Register.File]);
>>      } else {
>>         attrib_index = lp_build_const_int32(gallivm,
>> reg->Register.Index);
>>      }
>> @@ -1511,7 +1523,11 @@ emit_fetch_gs_input(
>>         vertex_index = get_indirect_index(bld,
>>                                           reg->Register.File,
>>                                           reg->Dimension.Index,
>> -                                        &reg->DimIndirect);
>> +                                        &reg->DimIndirect,
>> +   /*
>> +    * A fixed 6 should do as well (which is what we allocate).
>> +    */
> 
> This mid-call comment makes code hard to read.  let's just put the
> `u_vertices_per_prim(info->properties[TGSI_PROPERTY_GS_INPUT_PRIM])` in
> a local variable before get_indirect_index call and comment there.
Ok, fixed.

Roland

> 
>> +                                       
>> u_vertices_per_prim(info->properties[TGSI_PROPERTY_GS_INPUT_PRIM]));
>>      } else {
>>         vertex_index = lp_build_const_int32(gallivm,
>> reg->Dimension.Index);
>>      }
>> @@ -1565,7 +1581,8 @@ emit_fetch_temporary(
>>         indirect_index = get_indirect_index(bld,
>>                                             reg->Register.File,
>>                                             reg->Register.Index,
>> -                                          &reg->Indirect);
>> +                                          &reg->Indirect,
>> +                                         
>> bld->bld_base.info->file_max[reg->Register.File]);
>>           index_vec = get_soa_array_offsets(&bld_base->uint_bld,
>>                                           indirect_index,
>> @@ -1785,7 +1802,8 @@ emit_store_chan(
>>         indirect_index = get_indirect_index(bld,
>>                                             reg->Register.File,
>>                                             reg->Register.Index,
>> -                                          &reg->Indirect);
>> +                                          &reg->Indirect,
>> +                                         
>> bld->bld_base.info->file_max[reg->Register.File]);
>>      } else {
>>         assert(reg->Register.Index <=
>>                               
>> bld_base->info->file_max[reg->Register.File]);
>>
> 
> Otherwise looks great.
> 
> Reviewed-by: Jose Fonseca <jfonseca at vmware.com>



More information about the mesa-dev mailing list