[Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
Marathe, Yogesh
yogesh.marathe at intel.com
Thu Jul 20 18:30:46 UTC 2017
Francisco,
> -----Original Message-----
> From: mesa-dev [mailto:mesa-dev-bounces at lists.freedesktop.org] On Behalf
> Of Francisco Jerez
> Sent: Thursday, July 20, 2017 10:51 PM
> To: Muthukumar, Aravindan <aravindan.muthukumar at intel.com>; mesa-
> dev at lists.freedesktop.org
> Cc: Muthukumar, Aravindan <aravindan.muthukumar at intel.com>
> Subject: Re: [Mesa-dev] [PATCH V2] i965 : Optimize atom state flag checks
>
> aravindan.muthukumar at intel.com writes:
>
> > From: Aravindan Muthukumar <aravindan.muthukumar at intel.com>
> >
> > This patch improves CPI Rate(Cycles per Instruction) and branch
> > mispredict for i965. The function check_state() was showing CPI
> > retired rate.
> >
> > Performance stats with android:
> > CPI retired lowered by 28% (lower is better) Branch missprediction
> > lowered by 13% (lower is better) 3DMark improved by 2%
> >
> > The dissassembly doesn't show difference, although above results were
> > observed with patch.
> >
>
> How did you determine that your results are not just statistical noise?
No its not statistical noise. As commit msg mentions, we used metrics CPI retired rate,
utilization, branch miss predict as metrics, these can be measured using SEP on IA.
It essentially enables event based sampling and we can measure these through counters.
When we did the analysis of tests we were running, we found
brw_upload_pipeline_state->check_state functions having bad CPI rates and hence
we made changed there. The intention was always to reduce driver overhead, although
this is miniscule effort.
> Did you do some sort of significance testing? Which test, significance level and
> sample size did you use?
Sorry this is not something we have done, we tested on android functionality and
perf only. Performance benchmark 3dmark and overall stability of the android system
were used as tests. Kindly let us know if you have specific tests to be run and we would
be happy to run that.
What CPU did you get the numbers on?
Broxton.
>
> Thank you.
>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar at intel.com>
> > Signedd-off-by: Yogesh Marathe <yogesh.marathe at intel.com>
> > Tested-by: Asish <asish at intel.com>
> > ---
> >
> > Changes since V1:
> > - Removed memset() change
> > - Changed commit message as per review comments
> >
> > src/mesa/drivers/dri/i965/brw_defines.h | 4 ++++
> > src/mesa/drivers/dri/i965/brw_state_upload.c | 12 ++++++++----
> > 2 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index 2a8dbf8..8c9a510 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1687,3 +1687,7 @@ enum brw_pixel_shader_coverage_mask_mode { #
> > define CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE (1 << 4)
> >
> > #endif
> > +
> > +/* Checking the state of mesa and brw before emitting atoms */
> > +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw))
> > +
> > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > index acaa97e..1c8b969 100644
> > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > @@ -443,10 +443,8 @@ check_and_emit_atom(struct brw_context *brw,
> > struct brw_state_flags *state,
> > const struct brw_tracked_state *atom) {
> > - if (check_state(state, &atom->dirty)) {
> > atom->emit(brw);
> > merge_ctx_state(brw, state);
> > - }
> > }
> >
> > static inline void
> > @@ -541,7 +539,10 @@ brw_upload_pipeline_state(struct brw_context *brw,
> > const struct brw_tracked_state *atom = &atoms[i];
> > struct brw_state_flags generated;
> >
> > - check_and_emit_atom(brw, &state, atom);
> > + /* Checking the state and emitting atoms */
> > + if (CHECK_BRW_STATE(state, atom->dirty)) {
> > + check_and_emit_atom(brw, &state, atom);
> > + }
> >
> > accumulate_state(&examined, &atom->dirty);
> >
> > @@ -558,7 +559,10 @@ brw_upload_pipeline_state(struct brw_context *brw,
> > for (i = 0; i < num_atoms; i++) {
> > const struct brw_tracked_state *atom = &atoms[i];
> >
> > - check_and_emit_atom(brw, &state, atom);
> > + /* Checking the state and emitting atoms */
> > + if (CHECK_BRW_STATE(state, atom->dirty)) {
> > + check_and_emit_atom(brw, &state, atom);
> > + }
> > }
> > }
> >
> > --
> > 2.7.4
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
More information about the mesa-dev
mailing list