[Mesa-dev] [PATCH] tgsi/scan: use wrap-around shift behavior explicitly for file_mask

Roland Scheidegger sroland at vmware.com
Fri Mar 2 16:21:37 UTC 2018


Am 02.03.2018 um 08:41 schrieb Jose Fonseca:
> On 02/03/18 02:23, Brian Paul wrote:
>> On 03/01/2018 07:01 PM, sroland at vmware.com wrote:
>>> From: Roland Scheidegger <sroland at vmware.com>
>>>
>>> The comment said it will only represent the lowest 32 regs. This was
>>> not entirely true in practice, since at least on x86 you'll get
>>> masked shifts (unless the compiler could recognize it already and toss
>>> it out). It turns out this actually works out alright (presumably
>>> noone uses it for temp regs) when increasing max sampler views, so
>>> make that behavior explicit.
>>> Albeit it feels a bit hacky (but in any case, explicit behavior there
>>> is better than undefined behavior).
> 
> This effectively treats the file_mask as an simplistic bloom filter.  I
> suppose that as long as all users of this for sampler views are aware of
> this (and don't assume that if a shader with a single SRV on slot 32
> also uses slot 0) it might work.
Yes, I was a bit torn if I should make an explicit mask_srv variable
(using 2x64bit values).
Albeit in this case I would suggest still applying this patch, just to
make the undefined behavior there go away (as it definitely could be
different on some archs - even x86 sse does not mask shift counts...).
Albeit in this case could alternatively just skip the assignment if
count is larger than 31.


> 
>>> ---
>>>   src/gallium/auxiliary/tgsi/tgsi_scan.c     | 7 +++++--
>>>   src/gallium/drivers/llvmpipe/lp_state_fs.c | 7 ++++++-
>>>   src/gallium/drivers/swr/swr_shader.cpp     | 2 +-
>>>   3 files changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c
>>> b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>>> index c35eff2..0d229c9 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>>> @@ -585,8 +585,11 @@ scan_declaration(struct tgsi_shader_info *info,
>>>         int buffer;
>>>         unsigned index, target, type;
>>> -      /* only first 32 regs will appear in this bitfield */
>>> -      info->file_mask[file] |= (1 << reg);
>>> +      /*
>>> +       * only first 32 regs will appear in this bitfield, if larger
>>> +       * bits will wrap around.
>>> +       */
>>> +      info->file_mask[file] |= (1 << (reg & 31));
>>
>> Or, reg % 32 and let the compiler optimize it.
>>
>> Either way, Reviewed-by: Brian Paul <brianp at vmware.com>
> 
> I also admit that somehow reg & 31 made me pause, but reg % 32 seems
> obviously right.
I dunno, for me it's the opposite. I'm thinking "masked shift count",
which naturally is a bitmask. Maybe I'm thinking in assembly, what's
that weird remainder operation there :-).
FWIW there's also semi-practical reasons for using bitmasks explicitly -
should the value in question be defined as a signed number for some
reason (maybe just due to lazyness) then the compiler can't optimize it
that easily. Ok maybe it still would in this case (because negative
shift counts are undefined anyway, therfore it doesn't matter that the
remainder done with a "and" mask would produce incorrect numbers for
negative values).

Roland

> 
> 
> Reviewed-by: Jose Fonseca <jfonseca at vmware.com>
> 
> Jose



More information about the mesa-dev mailing list