[Mesa-dev] [PATCH 4/5] intel/fs: Implement extended strides greater than 4 for IR source regions.

Francisco Jerez currojerez at riseup.net
Thu Feb 14 21:50:54 UTC 2019


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <currojerez at riseup.net>
> wrote:
>
>> Strides up to 32B can be implemented for the source regions of most
>> instructions by leveraging either the vertical or the horizontal
>> stride of the hardware Align1 region.  The main motivation for this is
>> that currently the lower_integer_multiplication() pass will happily
>> double the stride of one of the 32-bit sources, which can blow up if
>> the stride of the original source was already the maximum value
>> allowed by the hardware.
>>
>
> I thought this looked familiar so I did some digging...
>
> On Nov 2 of 2017, I wrote almost exactly this same patch which was
> committed on Nov 7 as e8c9e65185de3e821e1
> On Nov 14, Matt reverted it in a31d0382084c8aa8 because it wasn't needed
> anymore and he wasn't sure of its correctness.
>

That's funny, I wasn't aware of e8c9e65185de3e821e1 nor of its revert.
Change was certainly still needed on Nov 14 due to
lower_integer_multiplication().

> And here we are again....
>
> I still believe it to be correct so it is
>
> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
>
> My one major request is that you include some of the history of this change
> in the commit message.  As far as the patch itself goes, it's identical to
> mine except for the unneeded whitespace change and one additional assert
> which I believe to be a good addition.
>

Added a comment locally about your previous attempt to do the same thing.

> I've also CC'd matt in case he wants to throw in his $.02
>
> --Jason
>
> An alternative would be to use the regioning legalization pass in
>> order to lower such strides into the composition of multiple legal
>> strides, but that would be somewhat less efficient.
>>
>> This showed up as a regression from my commit cbea91eb57a501bebb1ca2
>> in Vulkan 1.1 CTS tests on CHV/BXT platforms, however it was really a
>> pre-existing problem that had affected conformance on other platforms
>> without native support for integer multiplication.  CHV/BXT were
>> getting around it because the code I removed in that commit had the
>> "fortunate" side effect of emitting narrower regions that didn't hit
>> the hardware stride limit after lowering.  Beyond fixing the
>> regression this fixes ~90 additional Vulkan 1.1 subgroup CTS tests on
>> ICL (that's why this patch is marked for inclusion in mesa-stable even
>> though the original regressing patch was not).
>>
>> Cc: mesa-stable at lists.freedesktop.org
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109328
>> Reported-by
>> <https://bugs.freedesktop.org/show_bug.cgi?id=109328Reported-by>: Mark
>> Janes <mark.a.janes at intel.com>
>> ---
>>  src/intel/compiler/brw_fs_generator.cpp | 14 +++++++++++---
>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs_generator.cpp
>> b/src/intel/compiler/brw_fs_generator.cpp
>> index 5fc6cf5f8cc..b169eacf15b 100644
>> --- a/src/intel/compiler/brw_fs_generator.cpp
>> +++ b/src/intel/compiler/brw_fs_generator.cpp
>> @@ -90,9 +90,17 @@ brw_reg_from_fs_reg(const struct gen_device_info
>> *devinfo, fs_inst *inst,
>>            *       different execution size when the number of components
>>            *       written to each destination GRF is not the same.
>>            */
>> -         const unsigned width = MIN2(reg_width, phys_width);
>> -         brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr,
>> 0);
>> -         brw_reg = stride(brw_reg, width * reg->stride, width,
>> reg->stride);
>> +         if (reg->stride > 4) {
>> +            assert(reg != &inst->dst);
>> +            assert(reg->stride * type_sz(reg->type) <= REG_SIZE);
>> +            brw_reg = brw_vecn_reg(1, brw_file_from_reg(reg), reg->nr, 0);
>> +            brw_reg = stride(brw_reg, reg->stride, 1, 0);
>> +
>>
>
> Extra whitespace?
>
>
>> +         } else {
>> +            const unsigned width = MIN2(reg_width, phys_width);
>> +            brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg),
>> reg->nr, 0);
>> +            brw_reg = stride(brw_reg, width * reg->stride, width,
>> reg->stride);
>> +         }
>>
>>           if (devinfo->gen == 7 && !devinfo->is_haswell) {
>>              /* From the IvyBridge PRM (EU Changes by Processor
>> Generation, page 13):
>> --
>> 2.19.2
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190214/0519fdac/attachment.sig>


More information about the mesa-dev mailing list