[Mesa-dev] [PATCH v2 3/4] i965/state: Create separate dirty state bits for each pipeline

Tue Mar 17 02:22:51 PDT 2015

On Monday, March 16, 2015 10:21:31 PM Kristian Høgsberg wrote:
> On Wed, Mar 11, 2015 at 11:53 AM, Jordan Justen
> <jordan.l.justen at intel.com> wrote:
> > When uploading state for a pipeline, we will save changed state for
> > the other pipelines.
> >
> > Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> 
> In reviewing this again, I realize I'm not completely comfortable with
> how this is done. The overall approach is great, I'm happy to see the
> dual pipeline flags contained in brw_state_upload.c, but I think we
> need to keep brw_upload_pipeline_state() a no-op in case it fails.

Given that brw_upload_state() has always done:

   struct brw_state_flags *state = &brw->state.dirty;

   state->mesa |= brw->NewGLState;
   brw->NewGLState = 0;

   state->brw |= ctx->NewDriverState;
   ctx->NewDriverState = 0;

I think there's a pretty good precedence for having it update the
driver's notion of what's dirty.  But you can still run it several
times back to back with no issues.

> I can't think of a case where it might bite us, but currently, the
> expectation is that if it fails, nothing changes and you can call it
> again after flushing the batch.  This patch changes that in two ways:
> 1) we merge the pipeline brw->state.dirty into state and clear the
> pipeline state and 2) we copy the dirty state into the other pipeline.
> Suppose we call with the render pipeline and fail, flush the batch,
> then call with the compute pipeline and clear the dirty flags [1]. If
> we then go back and try to emit render state again, there are no
> longer any dirty bits and we fail to emit render state.

I'm pretty comfortable with this.  Your concerns seem to revolve around
implicitly switching pipelines in the middle of state upload.  I don't
believe that can happen (and we'd better keep it that way!).

You're always uploading state for some particular operation - drawing
(on the render pipe), or glDispatchCompute (on the compute pipe).  When
processing an operation, we first save the batch state (bytes used for
commands, number of relocations).  Then we upload state.

If we run out of batch space(*), or refer to more objects than we can
fit in the aperture, we roll back to that saved state and flush the
batch, executing everything queued except the most recent operation.

We then re-upload state for that single failed operation in a fresh
batch.  There's no pipeline switch here - when we retry that same
operation, it will dispatch to the same pipe as before.  Notably,
there are no _mesa_update_state calls, so we can never hit resolves.
(We did those at the very beginning, before even beginning to process
the operation at hand!)  It's a pretty tight loop - 1. upload state,
2. check aperture space, 3. flush, go back to step 1.

> This may be far-fetched and, as I said, I don't think we can hit this.

So we agree :)

> But I'd rather not break the contract that brw_upload_pipeline_state()
> doesn't change state in case of failure - I know I'd hate to debug
> that.
> We can just introduce a brw_clear_pipeline_dirty_bits() function
> that will be called on successful state upload and merge the state
> bits to the other pipeline and then clear the selected pipeline dirty
> flags as well as brw->state.dirty.  And we need to use a local copy of
> brw->state.dirty in brw_upload_pipeline_state() for merging in the
> pipeline dirty flags so we don't modify brw->state.dirty on error.
> 
> [1] I don't see how we'd get into that, but with fast clear resolve
> using meta we end up running the render pipeline at unexpected places.

But not here - and frankly, doing resolves in the middle of state upload
would be crazy.  Resolves inherently need to set render state that the
actual draw will want to set, too.  You have to serialize them.

Which is what we do today - we handle resolves when Mesa first asks us
to draw or dispatch a compute workload, before we ever start processing
the operation itself.  The resolve will space-check-and-retry...but it
will get done.  Only then do we try the next thing, which will
space-check-and-retry independently.

Thanks for thinking about this carefully - it's easy to mess up, and we
really need to get it right.  I think we have, though.

> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h      |  1 +
> >  src/mesa/drivers/dri/i965/brw_state_upload.c | 42 ++++++++++++++++++++++------
> >  2 files changed, 35 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index 91b4054..e693f50 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1101,6 +1101,7 @@ struct brw_context
> >     GLuint NewGLState;
> >     struct {
> >        struct brw_state_flags dirty;
> > +      struct brw_state_flags pipelines[BRW_NUM_PIPELINES];
> >     } state;
> >
> >     struct brw_cache cache;
> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > index 4f21002..55a9050 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > @@ -586,15 +586,16 @@ brw_upload_programs(struct brw_context *brw)
> >     brw_upload_wm_prog(brw);
> >  }
> >
> > -/***********************************************************************
> > - * Emit all state:
> > - */
> > -void brw_upload_render_state(struct brw_context *brw)
> > +static inline void
> > +brw_upload_pipeline_state(struct brw_context *brw,
> > +                          enum brw_pipeline pipeline)
> >  {
> >     struct gl_context *ctx = &brw->ctx;
> >     struct brw_state_flags *state = &brw->state.dirty;
> >     int i;
> >     static int dirty_count = 0;
> > +   struct brw_state_flags *pipeline_state =
> > +      &brw->state.pipelines[pipeline];
> >
> >     state->mesa |= brw->NewGLState;
> >     brw->NewGLState = 0;
> > @@ -633,6 +634,12 @@ void brw_upload_render_state(struct brw_context *brw)
> >        brw->state.dirty.brw |= BRW_NEW_NUM_SAMPLES;
> >     }
> >
> > +   if ((pipeline_state->mesa | pipeline_state->brw) != 0) {
> > +      state->mesa |= pipeline_state->mesa;
> > +      state->brw |= pipeline_state->brw;
> > +      memset(pipeline_state, 0, sizeof(struct brw_state_flags));
> > +   }
> > +
> >     if ((state->mesa | state->brw) == 0)
> >        return;
> >
> > @@ -642,6 +649,10 @@ void brw_upload_render_state(struct brw_context *brw)
> >
> >     brw_upload_programs(brw);
> >
> > +   const struct brw_tracked_state *atoms =
> > +      brw_get_pipeline_atoms(brw, pipeline);
> > +   const int num_atoms = brw->num_atoms[pipeline];
> > +
> >     if (unlikely(INTEL_DEBUG)) {
> >        /* Debug version which enforces various sanity checks on the
> >         * state flags which are generated and checked to help ensure
> > @@ -651,8 +662,8 @@ void brw_upload_render_state(struct brw_context *brw)
> >        memset(&examined, 0, sizeof(examined));
> >        prev = *state;
> >
> > -      for (i = 0; i < brw->num_atoms[BRW_RENDER_PIPELINE]; i++) {
> > -        const struct brw_tracked_state *atom = &brw->render_atoms[i];
> > +      for (i = 0; i < num_atoms; i++) {
> > +        const struct brw_tracked_state *atom = &atoms[i];
> >          struct brw_state_flags generated;
> >
> >          if (check_state(state, &atom->dirty)) {
> > @@ -671,8 +682,8 @@ void brw_upload_render_state(struct brw_context *brw)
> >        }
> >     }
> >     else {
> > -      for (i = 0; i < brw->num_atoms[BRW_RENDER_PIPELINE]; i++) {
> > -        const struct brw_tracked_state *atom = &brw->render_atoms[i];
> > +      for (i = 0; i < num_atoms; i++) {
> > +        const struct brw_tracked_state *atom = &atoms[i];
> >
> >          if (check_state(state, &atom->dirty)) {
> >             atom->emit(brw);
> > @@ -691,8 +702,23 @@ void brw_upload_render_state(struct brw_context *brw)
> >          fprintf(stderr, "\n");
> >        }
> >     }
> > +
> > +   /* Save all dirty state into the other pipelines */
> > +   for (int i = 0; i < BRW_NUM_PIPELINES; i++) {
> > +      if (i != pipeline) {
> > +         brw->state.pipelines[i].mesa |= state->mesa;
> > +         brw->state.pipelines[i].brw |= state->brw;
> > +      }
> > +   }
> 
> When we merge this in at this point, state->mesa/brw includes dirty
> bits from the active pipelines pipeline flags. For example, suppose we
> do a bunch of render pipeline stuff, which queues up a lot of dirty
> flags for the compute pipeline. Then when we go to emit state for the
> compute pipeline we first merge all these dirty flags into
> brw->state.dirty, then merge that into the render pipeline flags. We
> only need to merge the new dirty bits, that is, the initial value of
> brw->state.dirty into the render pipeline flags. If we don't modify
> brw->state.dirty in this function and merge the flags in clear as
> described above, this problem is easy to fix.
> 
> Kristian
> 
> >  }
> >
> > +/***********************************************************************
> > + * Emit all state:
> > + */
> > +void brw_upload_render_state(struct brw_context *brw)
> > +{
> > +   brw_upload_pipeline_state(brw, BRW_RENDER_PIPELINE);
> > +}
> >
> >  /**
> >   * Clear dirty bits to account for the fact that the state emitted by
> > --
> > 2.1.4
> >
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150317/b7815020/attachment.sig>