[Mesa-dev] [PATCH 12/17] intel/compiler/fs: Implement ddy without using align16 for Gen11+

Matt Turner mattst88 at gmail.com
Mon Feb 26 19:39:00 UTC 2018


On Fri, Feb 23, 2018 at 4:42 PM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Tuesday, February 20, 2018 9:15:19 PM PST Matt Turner wrote:
>> Align16 is no more. We previously generated an align16 ADD instruction
>> to calculate DDY:
>>
>>    add(8) g11<1>F  -g10<4>.xyxyF  g10<4>.zwzwF  { align16 1Q };
>>
>> Without align16, we now implement it as two align1 instructions:
>>
>>    add(4) g11<2>F   -g10<4,2,0>F    g10.2<4,2,0>F  { align1 1N };
>>    add(4) g11.1<2>F -g10.1<4,2,0>F  g10.3<4,2,0>F  { align1 1N };
>> ---
>>  src/intel/compiler/brw_fs_generator.cpp | 70 ++++++++++++++++++++++++++-------
>>  1 file changed, 56 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
>> index 013d2c820a0..ffc46972420 100644
>> --- a/src/intel/compiler/brw_fs_generator.cpp
>> +++ b/src/intel/compiler/brw_fs_generator.cpp
>> @@ -1192,23 +1192,65 @@ fs_generator::generate_ddy(const fs_inst *inst,
>>  {
>>     if (inst->opcode == FS_OPCODE_DDY_FINE) {
>>        /* produce accurate derivatives */
>> -      struct brw_reg src0 = src;
>> -      struct brw_reg src1 = src;
>> +      if (devinfo->gen >= 11) {
>> +         struct brw_reg x = src;
>> +         struct brw_reg y = src;
>> +         struct brw_reg z = src;
>> +         struct brw_reg w = src;
>> +         struct brw_reg dst_e = dst;
>> +         struct brw_reg dst_o = dst;
>
> Maybe call these dst_even and dst_odd?
> Or perhaps dst_e = dst /* even channels */?
>
>> +
>> +         x.vstride = BRW_VERTICAL_STRIDE_4;
>> +         y.vstride = BRW_VERTICAL_STRIDE_4;
>> +         z.vstride = BRW_VERTICAL_STRIDE_4;
>> +         w.vstride = BRW_VERTICAL_STRIDE_4;
>> +
>> +         x.width = BRW_WIDTH_2;
>> +         y.width = BRW_WIDTH_2;
>> +         z.width = BRW_WIDTH_2;
>> +         w.width = BRW_WIDTH_2;
>> +
>> +         x.hstride = BRW_HORIZONTAL_STRIDE_0;
>> +         y.hstride = BRW_HORIZONTAL_STRIDE_0;
>> +         z.hstride = BRW_HORIZONTAL_STRIDE_0;
>> +         w.hstride = BRW_HORIZONTAL_STRIDE_0;
>
> If you like, you could drop some wordiness by doing:
>
>       struct brw_reg x = stride(src, 4, 2, 0);
>       struct brw_reg y = stride(src, 4, 2, 0);
>       struct brw_reg z = stride(src, 4, 2, 0);
>       struct brw_reg w = stride(src, 4, 2, 0);
>
>> +
>> +         x.subnr = 0 * sizeof(float);
>> +         y.subnr = 1 * sizeof(float);
>> +         z.subnr = 2 * sizeof(float);
>> +         w.subnr = 3 * sizeof(float);
>> +
>
> With or without any suggestions, this patch is:
> Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

More good suggestions. Thanks!
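
For anyone following the region arithmetic in the commit message, here is a rough host-side sketch (not Mesa code) of why the two align1 adds produce the same result as the old align16 add. It assumes the usual align1 region rule, where channel ch of a <vstride,width,hstride> region reads element subreg + (ch / width) * vstride + (ch % width) * hstride, and it models a SIMD8 float register as an array of 8 floats. The names Grf, region_index, add4_align1, ddy_align1 and ddy_align16 are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of one 8-float GRF (two 2x2 subspans).
using Grf = std::array<float, 8>;

// Element read by channel `ch` of a <vstride,width,hstride> region
// that starts at sub-register element `subreg` (assumed align1 rule).
static std::size_t region_index(std::size_t subreg, std::size_t vstride,
                                std::size_t width, std::size_t hstride,
                                std::size_t ch)
{
    assert(width > 0);
    return subreg + (ch / width) * vstride + (ch % width) * hstride;
}

// Models: add(4) dst.<dst_sub><2>F  -src.<a_sub><4,2,0>F  src.<b_sub><4,2,0>F
static void add4_align1(Grf &dst, std::size_t dst_sub, const Grf &src,
                        std::size_t a_sub, std::size_t b_sub)
{
    for (std::size_t ch = 0; ch < 4; ch++) {
        const float a = src[region_index(a_sub, 4, 2, 0, ch)];
        const float b = src[region_index(b_sub, 4, 2, 0, ch)];
        dst[dst_sub + ch * 2] = -a + b;   // destination stride of 2
    }
}

// The two align1 instructions from the commit message.
static Grf ddy_align1(const Grf &src)
{
    Grf dst{};
    add4_align1(dst, 0, src, 0, 2);   // even channels: z - x
    add4_align1(dst, 1, src, 1, 3);   // odd channels:  w - y
    return dst;
}

// Models: add(8) dst<1>F  -src<4>.xyxyF  src<4>.zwzwF  { align16 }
static Grf ddy_align16(const Grf &src)
{
    Grf dst{};
    for (std::size_t ch = 0; ch < 8; ch++) {
        const std::size_t base = (ch / 4) * 4;   // start of this 4-wide group
        const std::size_t comp = ch % 2;         // x,y,x,y pattern
        dst[ch] = -src[base + comp] + src[base + 2 + comp];
    }
    return dst;
}
```

With a width of 2 and a horizontal stride of 0, each source element is read twice, which is what replicates the per-column difference across both rows of the subspan.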

