[Mesa-dev] [PATCH] tgsi/scan: use wrap-around shift behavior explicitly for file_mask
Jose Fonseca
jfonseca at vmware.com
Fri Mar 2 07:41:35 UTC 2018
On 02/03/18 02:23, Brian Paul wrote:
> On 03/01/2018 07:01 PM, sroland at vmware.com wrote:
>> From: Roland Scheidegger <sroland at vmware.com>
>>
>> The comment said it will only represent the lowest 32 regs. This was
>> not entirely true in practice, since at least on x86 you'll get
>> masked shifts (unless the compiler could recognize it already and toss
>> it out). It turns out this actually works out alright (presumably
>> noone uses it for temp regs) when increasing max sampler views, so
>> make that behavior explicit.
>> Albeit it feels a bit hacky (but in any case, explicit behavior there
>> is better than undefined behavior).
This effectively treats the file_mask as an simplistic bloom filter. I
suppose that as long as all users of this for sampler views are aware of
this (and don't assume that if a shader with a single SRV on slot 32
also uses slot 0) it might work.
>> ---
>> src/gallium/auxiliary/tgsi/tgsi_scan.c | 7 +++++--
>> src/gallium/drivers/llvmpipe/lp_state_fs.c | 7 ++++++-
>> src/gallium/drivers/swr/swr_shader.cpp | 2 +-
>> 3 files changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> index c35eff2..0d229c9 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
>> @@ -585,8 +585,11 @@ scan_declaration(struct tgsi_shader_info *info,
>> int buffer;
>> unsigned index, target, type;
>> - /* only first 32 regs will appear in this bitfield */
>> - info->file_mask[file] |= (1 << reg);
>> + /*
>> + * only first 32 regs will appear in this bitfield, if larger
>> + * bits will wrap around.
>> + */
>> + info->file_mask[file] |= (1 << (reg & 31));
>
> Or, reg % 32 and let the compiler optimize it.
>
> Either way, Reviewed-by: Brian Paul <brianp at vmware.com>
I also admit that somehow reg & 31 made me pause, but reg % 32 seems
obviously right.
Reviewed-by: Jose Fonseca <jfonseca at vmware.com>
Jose
More information about the mesa-dev
mailing list