[Mesa-dev] [PATCH 3/4] i965/state: Create separate dirty state bits for each pipeline
Jordan Justen
jordan.l.justen at intel.com
Tue Mar 10 16:32:19 PDT 2015
On 2015-03-10 16:11:08, Kristian Høgsberg wrote:
> On Tue, Mar 10, 2015 at 10:49 AM, Jordan Justen
> <jordan.l.justen at intel.com> wrote:
> > When uploading state for a pipeline, we will save changed state for
> > the other pipelines.
> >
> > Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> > ---
> > src/mesa/drivers/dri/i965/brw_context.h | 1 +
> > src/mesa/drivers/dri/i965/brw_state_upload.c | 41 ++++++++++++++++++++++------
> > 2 files changed, 34 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
> > index fd43b07..902de18 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1101,6 +1101,7 @@ struct brw_context
> > GLuint NewGLState;
> > struct {
> > struct brw_state_flags dirty;
> > + struct brw_state_flags pipelines[BRW_NUM_PIPELINES];
> > } state;
> >
> > struct brw_cache cache;
> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > index dbfdc92..6966f06 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > @@ -580,15 +580,16 @@ brw_upload_programs(struct brw_context *brw)
> > brw_upload_wm_prog(brw);
> > }
> >
> > -/***********************************************************************
> > - * Emit all state:
> > - */
> > -void brw_upload_render_state(struct brw_context *brw)
> > +static inline void
> > +brw_upload_pipeline_state(struct brw_context *brw,
> > + enum brw_pipeline pipeline)
> > {
> > struct gl_context *ctx = &brw->ctx;
> > struct brw_state_flags *state = &brw->state.dirty;
> > int i;
> > static int dirty_count = 0;
> > + struct brw_state_flags *pipeline_state =
> > + &brw->state.pipelines[pipeline];
> >
> > state->mesa |= brw->NewGLState;
> > brw->NewGLState = 0;
> > @@ -627,6 +628,12 @@ void brw_upload_render_state(struct brw_context *brw)
> > brw->state.dirty.brw |= BRW_NEW_NUM_SAMPLES;
> > }
> >
> > + if ((pipeline_state->mesa | pipeline_state->brw) != 0) {
> > + state->mesa |= pipeline_state->mesa;
> > + state->brw |= pipeline_state->brw;
> > + memset(pipeline_state, 0, sizeof(struct brw_state_flags));
> > + }
> > +
> > if ((state->mesa | state->brw) == 0)
> > return;
> >
> > @@ -636,6 +643,9 @@ void brw_upload_render_state(struct brw_context *brw)
> >
> > brw_upload_programs(brw);
> >
> > + const struct brw_tracked_state *atoms =
> > + brw_pipeline_first_atom(brw, pipeline);
>
> Instead of looping over num_atoms[] to determine the start of the
> compute pipeline atoms, I think it'd be simpler to just have the
> compute atoms be a separate array in struct brw_context.
I showed Ken an earlier version that had this done differently. In
that case, I stored the start/end positions in an array. Ken's
feedback was that it was a little overly complicated.
Regarding cpu time, since this function, and brw_pipeline_first_atom
are inline functions, I think the compiler can figure out that the
atoms for the render pipeline start at brw->atoms[0] without accessing
brw->num_atoms. For the compute version, I think it will only need to
access brw->num_atoms[0], and it ought to do both of these without
looping.
I should confirm in the generated code. :)
> > if (unlikely(INTEL_DEBUG)) {
> > /* Debug version which enforces various sanity checks on the
> > * state flags which are generated and checked to help ensure
> > @@ -645,8 +655,8 @@ void brw_upload_render_state(struct brw_context *brw)
> > memset(&examined, 0, sizeof(examined));
> > prev = *state;
> >
> > - for (i = 0; i < brw->num_atoms[BRW_RENDER_PIPELINE]; i++) {
> > - const struct brw_tracked_state *atom = &brw->atoms[i];
> > + for (i = 0; i < brw->num_atoms[pipeline]; i++) {
>
> This makes the loop condition a more complex expression. I don't
> think gcc computes and caches brw->num_atoms[pipeline] outside the
> loop so this adds a bit of code to this very hot loop. I'd try
> something like
Since this function becomes an inline function, I think this change
should still be a single read from the brw->num_atoms[] array. (So,
the same as when supporting a single pipeline with brw->num_atoms
being an int.)
Once again, I'll confirm.
-Jordan
More information about the mesa-dev
mailing list