[Mesa-dev] [PATCH 3/5] intel/fs: Cap dst-aligned region stride to maximum representable hstride value.

Francisco Jerez currojerez at riseup.net
Thu Feb 14 22:53:48 UTC 2019


Jason Ekstrand <jason at jlekstrand.net> writes:

> On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez <currojerez at riseup.net>
> wrote:
>
>> This is required in combination with the following commit, because
>> otherwise if a source region with an extended 8+ stride is present in
>> the instruction (which we're about to declare legal) we'll end up
>> emitting code that attempts to write to such a region, even though
>> strides greater than four are still illegal for the destination.
>> ---
>>  src/intel/compiler/brw_fs_lower_regioning.cpp | 20 ++++++++++++++-----
>>  1 file changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp
>> b/src/intel/compiler/brw_fs_lower_regioning.cpp
>> index 6a3c23892b4..b86e95ed9eb 100644
>> --- a/src/intel/compiler/brw_fs_lower_regioning.cpp
>> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp
>> @@ -71,15 +71,25 @@ namespace {
>>            !is_byte_raw_mov(inst)) {
>>           return get_exec_type_size(inst);
>>        } else {
>> -         unsigned stride = inst->dst.stride * type_sz(inst->dst.type);
>> +         /* Calculate the maximum byte stride and the minimum type size
>> across
>> +          * all source and destination operands.
>> +          */
>> +         unsigned max_stride = inst->dst.stride * type_sz(inst->dst.type);
>> +         unsigned min_size = type_sz(inst->dst.type);
>>
>>           for (unsigned i = 0; i < inst->sources; i++) {
>> -            if (!is_uniform(inst->src[i]) && !inst->is_control_source(i))
>> -               stride = MAX2(stride, inst->src[i].stride *
>> -                             type_sz(inst->src[i].type));
>> +            if (!is_uniform(inst->src[i]) && !inst->is_control_source(i))
>> {
>> +               max_stride = MAX2(max_stride, inst->src[i].stride *
>> +                                 type_sz(inst->src[i].type));
>> +               min_size = MIN2(min_size, type_sz(inst->src[i].type));
>> +            }
>>           }
>>
>> -         return stride;
>> +         /* Attempt to use the largest byte stride among all present
>> operands,
>> +          * but never exceed a stride of 4 since that would lead to
>> illegal
>> +          * destination regions during lowering.
>> +          */
>> +         return MIN2(max_stride, 4 * min_size);
>>
>
> Why not just fall back to tightly packed in this case?  I think I can
> answer my own question:  Because using something that's equal to one of the
> strides reduces the liklihood that we'll need a temporary.  If that's the
> correct answer, then maybe what we want is the maximum of all strides with
> stride_in_bytes <= 4 * type_sz?
>

We also want the result to be greater than or equal to the size of the
largest non-uniform, non-control source type, since packing a vector of
such a type into a temporary of lower byte stride than its size is
impossible.  This patch guarantees that as long as max_size <= 4 *
min_size, which is necessary for the lowering code that calls this
function to work at all.

It would be possible to preserve this guarantee while attempting to pick
one of the strides of the pre-existing sources as you say -- I would be
happy to review that change as a follow-up micro-optimization patch, but
there are some corner cases to consider I don't necessarily want to
bother with in the patch doing the functional change, for the sake of
bisectability.

It may make sense to add an assert here that max_size <= 4 * min_size
for the case such an instruction doesn't blow up already at the EU
validator (it doesn't look like the validator is currently enforcing the
lack of conversions between 8 and 64 bit types?), it will just involve
calculating max_size in addition to max_stride and min_size above.

> --Jason
>
>
>>        }
>>     }
>>
>> --
>> 2.19.2
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 227 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20190214/1141cc6d/attachment-0001.sig>


More information about the mesa-dev mailing list