[Mesa-dev] [PATCH 1/2] i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET.

Matt Turner mattst88 at gmail.com
Fri Sep 25 11:21:10 PDT 2015


On Fri, Sep 25, 2015 at 11:10 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, September 25, 2015 11:03:46 AM Matt Turner wrote:
>> On Fri, Sep 25, 2015 at 10:47 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> > GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special
>> > strides.  We can easily make the generator handle constant src[0]
>> > arguments by instead generating a MOV with the product of both operands.
>> >
>> > This isn't necessarily a win in and of itself - instead of a MUL, we
>> > generate a MOV, which should be basically the same cost.  However, we
>> > can probably avoid the earlier MOV to put src[0] into a register.
>> >
>> > shader-db statistics for geometry shaders only:
>> >
>> > total instructions in shared programs: 3207 -> 3173 (-1.06%)
>> > instructions in affected programs:     3207 -> 3173 (-1.06%)
>> > helped:                                11
>> >
>> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++++++
>> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp        | 9 +++++++--
>> >  2 files changed, 14 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > index 5b6444e..610caef 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > @@ -202,6 +202,13 @@ try_constant_propagate(const struct brw_device_info *devinfo,
>> >          return true;
>> >        }
>> >        break;
>> > +   case GS_OPCODE_SET_WRITE_OFFSET:
>> > +      /* This is just a multiply by a constant with special strides.
>> > +       * The generator will handle immediates in both arguments (generating
>> > +       * a single MOV of the product).  So feel free to propagate in src0.
>> > +       */
>> > +      inst->src[arg] = value;
>> > +      return true;
>> >
>> >     case BRW_OPCODE_CMP:
>> >        if (arg == 1) {
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > index e69c067..620167d 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > @@ -541,8 +541,13 @@ vec4_generator::generate_gs_set_write_offset(struct brw_reg dst,
>> >            src1.file == BRW_IMMEDIATE_VALUE &&
>> >            src1.type == BRW_REGISTER_TYPE_UD &&
>> >            src1.dw1.ud <= USHRT_MAX);
>> > -   brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
>> > -           retype(src1, BRW_REGISTER_TYPE_UW));
>> > +   if (src0.file == IMM) {
>> > +      brw_MOV(p, suboffset(stride(dst, 2, 2, 1), 3),
>> > +              brw_imm_ud(src0.dw1.ud * src1.dw1.ud));
>>
>> Alternatively, we could make opt_algebraic() constant-evaluate this at
>> a higher level. I'm not sure if that would help generate better code,
>> but it seems a little cleaner.
>
> I guess that's possible, but I'm not sure it would be cleaner.  We still
> need the crazy stride/suboffset on the resulting MOV.  Which I don't
> think we can represent in src_reg in general.  But since the destination
> is an MRF, we could do it at the brw_reg level.  But that seems ugly
> as well.

Ah, I missed that this was an align1 mul(2)/mov(2). Yeah, that's not
possible to handle at a higher level without adding another opcode.
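
For anyone reading along, here is a minimal standalone sketch of the folding the patch does in the generator. It is only a model, not Mesa code: the Operand, Emitted and Result types and the set_write_offset() helper are hypothetical names made up for illustration, and the sketch deliberately ignores the align1 stride/suboffset handling, which is the part that keeps this in the generator rather than in opt_algebraic().

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-in for a generator operand (not a Mesa type).
// For a non-immediate operand, `value` models the register's runtime contents.
struct Operand {
    bool is_immediate;
    uint32_t value;
};

// Which instruction the modeled generator would emit.
enum class Emitted { MOV, MUL };

struct Result {
    Emitted opcode;
    uint32_t written;  // value that ends up in the destination
};

// Models the decision in the patch: src1 must be an immediate that fits in
// 16 bits; if src0 is also an immediate, the product is known up front and a
// single MOV of that product replaces the MUL.
Result set_write_offset(const Operand &src0, const Operand &src1)
{
    assert(src1.is_immediate && src1.value <= USHRT_MAX);

    if (src0.is_immediate) {
        // Both operands are constants: fold the multiply when emitting code.
        return {Emitted::MOV, src0.value * src1.value};
    }

    // src0 lives in a register: the multiply has to happen at run time.
    return {Emitted::MUL, src0.value * src1.value};
}
```

In this model both paths write the same value to the destination; the only difference is which instruction gets emitted, which is why propagating an immediate into src0 can drop the earlier MOV without changing the result.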

Reviewed-by: Matt Turner <mattst88 at gmail.com>

