[Mesa-dev] [PATCH 05/20] i965/fs: Handle explicit flag destinations in flags_written()

Matt Turner mattst88 at gmail.com
Mon Jul 17 17:57:03 UTC 2017


On Fri, Jul 14, 2017 at 3:23 PM, Francisco Jerez <currojerez at riseup.net> wrote:
> Matt Turner <mattst88 at gmail.com> writes:
>
>> The implementations of the ARB_shader_group_vote intrinsics will
>> explicitly write the flag as the destination register.
>> ---
>>  src/intel/compiler/brw_fs.cpp | 12 ++++++++++--
>>  1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index 43b6e34204..97908a4563 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -890,9 +890,17 @@ fs_inst::flags_written() const
>>                              opcode != BRW_OPCODE_WHILE)) ||
>>         opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
>>        return flag_mask(this);
>> -   } else {
>> -      return 0;
>> +   } else if (dst.file == ARF) {
>> +      if (dst.nr == BRW_ARF_FLAG + 0 && dst.subnr == 0)
>> +         return 0b0001;
>> +      if (dst.nr == BRW_ARF_FLAG + 0 && dst.subnr == 1)
>> +         return 0b0010;
>> +      if (dst.nr == BRW_ARF_FLAG + 1 && dst.subnr == 0)
>> +         return 0b0100;
>> +      if (dst.nr == BRW_ARF_FLAG + 1 && dst.subnr == 1)
>> +         return 0b1000;
>
> This doesn't look right to me, it assumes that flag registers are 16b
> (they're 32b), and that you can only write a single byte of the flag
> register at a time, which is hardly ever the case.

I don't think I read the implementation of flag_mask(), just the
comment, and thought that it returned a bit per flag subregister.

>
> I think you want something along the lines of:
>
> | unsigned
> | bit_mask(unsigned n)
> | {
> |    return (n >= CHAR_BIT * sizeof(bit_mask(n)) ? ~0u : (1 << n) - 1);
> | }
> |
> | unsigned
> | flag_mask(const fs_reg &r, unsigned sz)
> | {
> |    if (r.file == ARF) {
> |       const unsigned start = (r.nr - BRW_ARF_FLAG) * 4 + r.subnr;
> |       const unsigned end = start + sz;
> |       return bit_mask(end) & ~bit_mask(start);
> |    } else {
> |       return 0;
> |    }
> | }
>
> [Yes, this should give you the right result (i.e. zero) for ARF
> registers outside the range of the (flag) ARF window of the 32-bit
> bitmask.]  Notice the similarity between this and the current
> flag_mask() overload.  This will allow you to easily re-use the same
> logic for both flags_read() and flags_written(), you just need to pass
> 'inst->src[i], inst->size_read(i)' and 'inst->dst, inst->size_written'
> as arguments respectively, ORing the results for each source in the
> former case.

Wow, thanks. That's fantastic. I've made this change (and also changed
the authorship on 05/20 and 12/20 to you! :)


More information about the mesa-dev mailing list