[Mesa-dev] [PATCH 2/5] intel/fs: Lower integer multiply correctly when destination stride equals 4.

Francisco Jerez currojerez at riseup.net
Fri Feb 15 03:59:08 UTC 2019


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <currojerez at riseup.net>
> wrote:
>
>> Because the "low" temporary needs to be accessed with word type and
>> twice the original stride, attempting to preserve the alignment of the
>> original destination can potentially lead to instructions with illegal
>> destination stride greater than four.  Because the CHV/BXT alignment
>> restrictions are now being enforced by the regioning lowering pass run
>> after lower_integer_multiplication(), there is no real need to
>> preserve the original strides anymore.
>>
>> Note that this bug can be reproduced on stable branches, but
>> back-porting would be non-trivial, because the fix relies on the
>> regioning lowering pass recently introduced.
>> ---
>>  src/intel/compiler/brw_fs.cpp | 6 ++----
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index f475b617df2..5768e0d6542 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -3962,13 +3962,11 @@ fs_visitor::lower_integer_multiplication()
>>                  regions_overlap(inst->dst, inst->size_written,
>>                                  inst->src[0], inst->size_read(0)) ||
>>                  regions_overlap(inst->dst, inst->size_written,
>> -                                inst->src[1], inst->size_read(1))) {
>> +                                inst->src[1], inst->size_read(1)) ||
>> +                inst->dst.stride >= 4) {
>>
>
> It would be nice to throw in a quick comment as to why we're adding a
> temporary when stride >= 4.
>

There seemed to be no pre-existing comment about any of the other
conditions to allocate a temporary, so I've added the following:

+ /* Get a new VGRF for the "low" 32x16-bit multiplication result if
+  * reusing the original destination is impossible due to hardware
+  * restrictions, source/destination overlap, or it being the null
+  * register.
+  */
                                      
>
>>                 needs_mov = true;
>> -               /* Get a new VGRF but keep the same stride as inst->dst */
>>                 low = fs_reg(VGRF, alloc.allocate(regs_written(inst)),
>>                              inst->dst.type);
>> -               low.stride = inst->dst.stride;
>> -               low.offset = inst->dst.offset % REG_SIZE;
>>              }
>>
>>              /* Get a new VGRF but keep the same stride as inst->dst */
>> --
>> 2.19.2
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190214/1f077738/attachment-0001.sig>


More information about the mesa-dev mailing list