[Mesa-dev] [PATCH 01/37] i965/gs: Use single dispatch mode as fallback to dual object mode when possible.

Iago Toral Quiroga itoral at igalia.com
Wed Sep 3 23:28:36 PDT 2014


On mié, 2014-09-03 at 17:49 -0700, Jordan Justen wrote:
> On Thu, Aug 14, 2014 at 4:11 AM, Iago Toral Quiroga <itoral at igalia.com> wrote:
> > Currently, when a geometry shader can't use dual object mode we fall back to
> > dual instance mode, however, when invocations == 1, single dispatch mode is
> > more performant and equally efficient in terms of register pressure.
> >
> > Single dispatch mode requires that the driver can handle interleaving of
> > registers, but this is already supported (dual instance mode has the same
> > requirement).
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h           |  8 ++++---
> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 26 +++++++++++------------
> >  src/mesa/drivers/dri/i965/gen7_gs_state.c         |  4 +---
> >  src/mesa/drivers/dri/i965/gen8_gs_state.c         |  4 +---
> >  4 files changed, 20 insertions(+), 22 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index 1bbcf46..7439da1 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -587,10 +587,12 @@ struct brw_gs_prog_data
> >     int invocations;
> >
> >     /**
> > -    * True if the thread should be dispatched in DUAL_INSTANCE mode, false if
> > -    * it should be dispatched in DUAL_OBJECT mode.
> > +    * Dispatch mode, can be any of:
> > +    * GEN7_GS_DISPATCH_MODE_DUAL_OBJECT
> > +    * GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE
> > +    * GEN7_GS_DISPATCH_MODE_SINGLE
> >      */
> > -   bool dual_instanced_dispatch;
> > +   int dispatch_mode;
> >  };
> >
> >  /** Number of texture sampler units */
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > index b7995ad..c2a4892 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
> > @@ -101,10 +101,11 @@ vec4_gs_visitor::setup_payload()
> >  {
> >     int attribute_map[BRW_VARYING_SLOT_COUNT * MAX_GS_INPUT_VERTICES];
> >
> > -   /* If we are in dual instanced mode, then attributes are going to be
> > -    * interleaved, so one register contains two attribute slots.
> > +   /* If we are in dual instanced or single mode, then attributes are going
> > +    * to be interleaved, so one register contains two attribute slots.
> >      */
> > -   int attributes_per_reg = c->prog_data.dual_instanced_dispatch ? 2 : 1;
> > +   int attributes_per_reg =
> > +      c->prog_data.dispatch_mode == GEN7_GS_DISPATCH_MODE_DUAL_OBJECT ? 1 : 2;
> >
> >     /* If a geometry shader tries to read from an input that wasn't written by
> >      * the vertex shader, that produces undefined results, but it shouldn't
> > @@ -129,8 +130,7 @@ vec4_gs_visitor::setup_payload()
> >
> >     reg = setup_varying_inputs(reg, attribute_map, attributes_per_reg);
> >
> > -   lower_attributes_to_hw_regs(attribute_map,
> > -                               c->prog_data.dual_instanced_dispatch);
> > +   lower_attributes_to_hw_regs(attribute_map, attributes_per_reg > 1);
> >
> >     this->first_non_payload_grf = reg;
> >  }
> > @@ -640,7 +640,7 @@ brw_gs_emit(struct brw_context *brw,
> >      */
> >     if (c->prog_data.invocations <= 1 &&
> >         likely(!(INTEL_DEBUG & DEBUG_NO_DUAL_OBJECT_GS))) {
> > -      c->prog_data.dual_instanced_dispatch = false;
> > +      c->prog_data.dispatch_mode = GEN7_GS_DISPATCH_MODE_DUAL_OBJECT;
> >
> >        vec4_gs_visitor v(brw, c, prog, mem_ctx, true /* no_spills */);
> >        if (v.run()) {
> > @@ -652,15 +652,15 @@ brw_gs_emit(struct brw_context *brw,
> >
> >     /* Either we failed to compile in DUAL_OBJECT mode (probably because it
> >      * would have required spilling) or DUAL_OBJECT mode is disabled.  So fall
> > -    * back to DUAL_INSTANCED mode, which consumes fewer registers.
> > +    * back to DUAL_INSTANCED or SINGLE mode, which consumes fewer registers.
> >      *
> > -    * FIXME: In an ideal world we'd fall back to SINGLE mode, which would
> > -    * allow us to interleave general purpose registers (resulting in even less
> > -    * likelihood of spilling).  But at the moment, the vec4 generator and
> > -    * visitor classes don't have the infrastructure to interleave general
> > -    * purpose registers, so DUAL_INSTANCED is the best we can do.
> 
> I think Paul's note is still somewhat valid. Updates to the code gen
> could lead to improved register usage with single mode, right?
> 
> So, we'll use single mode, but with the same register usage as dual instance.

Oh, I got confused with this because I thought he was talking about the
inputs for some reason, which are interleaved in both cases. I don't
know why I got confused, since we clearly mentions the generator
classes. But I see now that he is talking about the output.

I'll keep the FIXME part of the comment.

> > +    * SINGLE mode is more performant when invocations == 1 and DUAL_INSTANCE
> > +    * mode is more performant when invocations > 1.
> 
> Maybe site the PRM?
> 
> From the Ivy Bridge PRM, Vol2 Part1 7.2.1.1 "3DSTATE_GS"
> "When InstanceCount=1 (one instance per object) software can decide
> which dispatch mode to use. DUAL_OBJECT mode would likely be the best
> choice for performance, followed by SINGLE mode."

Sure, I all add the reference.

So with these two changes, should I assume that the patch is reviewed by
you?

Iago

> -Jordan
> 
> >      */
> > -   c->prog_data.dual_instanced_dispatch = true;
> > +   if (c->prog_data.invocations <= 1)
> > +      c->prog_data.dispatch_mode = GEN7_GS_DISPATCH_MODE_SINGLE;
> > +   else
> > +      c->prog_data.dispatch_mode = GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE;
> >
> >     vec4_gs_visitor v(brw, c, prog, mem_ctx, false /* no_spills */);
> >     if (!v.run()) {
> > diff --git a/src/mesa/drivers/dri/i965/gen7_gs_state.c b/src/mesa/drivers/dri/i965/gen7_gs_state.c
> > index 93f48f6..b3b4ee6 100644
> > --- a/src/mesa/drivers/dri/i965/gen7_gs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen7_gs_state.c
> > @@ -145,9 +145,7 @@ upload_gs_state(struct brw_context *brw)
> >            GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT) |
> >           ((brw->gs.prog_data->invocations - 1) <<
> >            GEN7_GS_INSTANCE_CONTROL_SHIFT) |
> > -         (brw->gs.prog_data->dual_instanced_dispatch ?
> > -          GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE :
> > -          GEN7_GS_DISPATCH_MODE_DUAL_OBJECT) |
> > +         brw->gs.prog_data->dispatch_mode |
> >           GEN6_GS_STATISTICS_ENABLE |
> >           (brw->gs.prog_data->include_primitive_id ?
> >            GEN7_GS_INCLUDE_PRIMITIVE_ID : 0) |
> > diff --git a/src/mesa/drivers/dri/i965/gen8_gs_state.c b/src/mesa/drivers/dri/i965/gen8_gs_state.c
> > index 446edec..d8debbd 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_gs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_gs_state.c
> > @@ -83,9 +83,7 @@ gen8_upload_gs_state(struct brw_context *brw)
> >        OUT_BATCH(((brw->max_gs_threads / 2 - 1) << HSW_GS_MAX_THREADS_SHIFT) |
> >                  (brw->gs.prog_data->control_data_header_size_hwords <<
> >                   GEN7_GS_CONTROL_DATA_HEADER_SIZE_SHIFT) |
> > -                (brw->gs.prog_data->dual_instanced_dispatch ?
> > -                 GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE :
> > -                 GEN7_GS_DISPATCH_MODE_DUAL_OBJECT) |
> > +                brw->gs.prog_data->dispatch_mode |
> >                  GEN6_GS_STATISTICS_ENABLE |
> >                  (brw->gs.prog_data->include_primitive_id ?
> >                   GEN7_GS_INCLUDE_PRIMITIVE_ID : 0) |
> > --
> > 1.9.1
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 




More information about the mesa-dev mailing list