[Mesa-dev] [PATCH 10/10] i965/fs: Combine pixel center calculation into one inst.

Matt Turner mattst88 at gmail.com
Fri Apr 17 11:58:45 PDT 2015


On Fri, Apr 17, 2015 at 11:56 AM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> On Tue, Apr 14, 2015 at 4:15 PM, Matt Turner <mattst88 at gmail.com> wrote:
>> The X and Y values come interleaved in g1 (.4-.11 inclusive), so we can
>> calculate them together with a single add(32) instruction on some
>> platforms like Broadwell and newer or in SIMD8 elsewhere.
>>
>> Note that I also moved the PIXEL_X/PIXEL_Y virtual opcodes from before
>> LINTERP to after it. That's because the writes_accumulator_implicitly()
>> function in backend_instruction tests for <= LINTERP for determining
>> whether the instruction indeed writes the accumulator implicitly. The
>> old FS_OPCODE_PIXEL_X/Y emitted ADD instructions, which did, but the new
>> opcodes just emit MOVs, which don't. It doesn't matter, since we don't
>> use these opcodes on Gen4/5 anymore, but in the case that we do...
>>
>> On Broadwell:
>> total instructions in shared programs: 7192355 -> 7186224 (-0.09%)
>> instructions in affected programs:     1190700 -> 1184569 (-0.51%)
>> helped:                                6131
>>
>> On Haswell:
>> total instructions in shared programs: 6155979 -> 6152800 (-0.05%)
>> instructions in affected programs:     652362 -> 649183 (-0.49%)
>> helped:                                3179
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h        |  2 +
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 ++++
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 71 ++++++++++++++++++--------
>>  3 files changed, 63 insertions(+), 20 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
>> index 1afa34e..f99da12 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -925,6 +925,8 @@ enum opcode {
>>     FS_OPCODE_DDY_FINE,
>>     FS_OPCODE_CINTERP,
>>     FS_OPCODE_LINTERP,
>> +   FS_OPCODE_PIXEL_X,
>> +   FS_OPCODE_PIXEL_Y,
>>     FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
>>     FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD_GEN7,
>>     FS_OPCODE_VARYING_PULL_CONSTANT_LOAD,
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index dba3286..7bc97a6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -1909,6 +1909,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>>        case FS_OPCODE_LINTERP:
>>          generate_linterp(inst, dst, src);
>>          break;
>> +      case FS_OPCODE_PIXEL_X:
>> +         assert(src[0].type == BRW_REGISTER_TYPE_UW);
>> +         src[0].subnr = 0 * type_sz(src[0].type);
>> +         brw_MOV(p, dst, stride(src[0], 8, 4, 1));
>
> Why do we need the special opcode?  Can we not do that stride with
> what we already have in the FS compiler?

Nope, we only have horizontal stride in fs_reg (the 'stride' member).


More information about the mesa-dev mailing list