[Mesa-dev] [PATCH 1/2] i965/fs: Add/use functions to convert to 3src_align1 vstride/hstride
Scott D Phillips
scott.d.phillips at intel.com
Tue Jan 9 21:22:34 UTC 2018
Matt Turner <mattst88 at gmail.com> writes:
> On Mon, Jan 8, 2018 at 5:01 PM, Scott D Phillips
> <scott.d.phillips at intel.com> wrote:
>> Matt Turner <mattst88 at gmail.com> writes:
>>
>>> Some cases weren't handled, such as stride 4 which is needed for 64-bit
>>> operations. Presumably fixes the assertion failure mentioned in commit
>>> 2d0457203871 (Revert "i965/fs: Use align1 mode on ternary instructions
>>> on Gen10+") but who can really say since the commit neglected to list
>>> any of them!
>>> ---
>>> src/intel/compiler/brw_eu_emit.c | 69 ++++++++++++++++++++++++----------------
>>> 1 file changed, 41 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
>>> index 85bb6a4cdd..c25d8d6eda 100644
>>> --- a/src/intel/compiler/brw_eu_emit.c
>>> +++ b/src/intel/compiler/brw_eu_emit.c
>>> @@ -673,6 +673,42 @@ get_3src_subreg_nr(struct brw_reg reg)
>>> return reg.subnr / 4;
>>> }
>>>
>>> +static enum gen10_align1_3src_vertical_stride
>>> +to_3src_align1_vstride(enum brw_vertical_stride vstride)
>>> +{
>>> + switch (vstride) {
>>> + case BRW_VERTICAL_STRIDE_0:
>>> + return BRW_ALIGN1_3SRC_VERTICAL_STRIDE_0;
>>> + case BRW_VERTICAL_STRIDE_2:
>>> + return BRW_ALIGN1_3SRC_VERTICAL_STRIDE_2;
>>> + case BRW_VERTICAL_STRIDE_4:
>>> + return BRW_ALIGN1_3SRC_VERTICAL_STRIDE_4;
>>> + case BRW_VERTICAL_STRIDE_8:
>>> + case BRW_VERTICAL_STRIDE_16:
>>> + return BRW_ALIGN1_3SRC_VERTICAL_STRIDE_8;
>>
>> What is the reasoning for vstride 16 to map to 8 here? Could that cause
>> problems?
>
> Good question. In an exec_size 16 instruction, a 16,16,1 region
> actually reads the same channels in the same order as an 8,8,1 region
> (another confusing thing about regioning is that there are effectively
> duplicates...). That's the most common region, and I seem to recall
> that in some cases the IR contains instructions with a 16,16,1 region.
> Other than in that duplicate case, I don't think we ever use vstride
> 16. As a result, they left it out of the hardware for align1 ternary
> instructions.
>
> If I can get some hardware to test with, it's probably a good idea for
> me to to double check which cases cause us to need to handle vstride
> 16 here. Maybe an assertion that it is the "duplicate" 16,16,1 region
> should be added.
Got it, with the assert or the double check on your todo list, patch is
Reviewed-by: Scott D Phillips <scott.d.phillips at intel.com>
More information about the mesa-dev
mailing list