[Mesa-dev] [PATCH] i965 : Performance Improvement

Chris Wilson chris at chris-wilson.co.uk
Fri Jul 14 07:57:02 UTC 2017


Quoting aravindan.muthukumar at intel.com (2017-07-14 05:09:09)
> From: Aravindan M <aravindan.muthukumar at intel.com>
> 
> This patch improves the CPI (cycles per instruction) rate
> and CPU time utilization for i965. Performance analysis
> found that the functions check_state and
> brw_pipeline_state_finished had poor CPU utilization.
> 
> Change-Id: I17c7e719a16e222764217a0e67b4482748537b67
> Signed-off-by: Aravindan M <aravindan.muthukumar at intel.com>
> Reviewed-by: Yogesh M <yogesh.marathe at intel.com>
> Tested-by: Asish <asish at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h      |  3 +++
>  src/mesa/drivers/dri/i965/brw_state_upload.c | 14 +++++++++++---
>  2 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
> index a4794c6..60f88ca 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -1681,3 +1681,6 @@ enum brw_pixel_shader_coverage_mask_mode {
>  # define GEN8_L3CNTLREG_ALL_ALLOC_MASK     INTEL_MASK(31, 25)
>  
>  #endif
> +
> +/* Checking the state of mesa and brw before emitting atoms */
> +#define CHECK_BRW_STATE(a,b) ((a.mesa & b.mesa) | (a.brw & b.brw))
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index 5e82c1b..434decf 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -515,7 +515,10 @@ brw_upload_pipeline_state(struct brw_context *brw,
>          const struct brw_tracked_state *atom = &atoms[i];
>          struct brw_state_flags generated;
>  
> -         check_and_emit_atom(brw, &state, atom);
> +         /* Checking the state and emitting the atoms */
> +         if (CHECK_BRW_STATE(state, atom->dirty)) {
> +            check_and_emit_atom(brw, &state, atom);
> +         }
>  
>          accumulate_state(&examined, &atom->dirty);
>  
> @@ -532,7 +535,10 @@ brw_upload_pipeline_state(struct brw_context *brw,
>        for (i = 0; i < num_atoms; i++) {
>          const struct brw_tracked_state *atom = &atoms[i];
>  
> -         check_and_emit_atom(brw, &state, atom);
> +         /* Checking the state and emitting the atoms */
> +         if (CHECK_BRW_STATE(state, atom->dirty)) {
> +            check_and_emit_atom(brw, &state, atom);
> +         }
>        }
>     }
>  
> @@ -567,7 +573,9 @@ brw_pipeline_state_finished(struct brw_context *brw,
>           brw->state.pipelines[i].mesa |= brw->NewGLState;
>           brw->state.pipelines[i].brw |= brw->ctx.NewDriverState;
>        } else {
> -         memset(&brw->state.pipelines[i], 0, sizeof(struct brw_state_flags));
> +         /* Avoiding the memset with initialization */
> +         brw->state.pipelines[i].mesa = 0;
> +         brw->state.pipelines[i].brw = 0ull;

Is your compiler broken? It is neither inlining the simple function
check_and_emit_atom (which may be a candidate for always-inline instead
of the manual duplication), nor converting the fixed-size memset into a
few inline instructions.

Or are you optimising a debug build?
-Chris
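
To illustrate the two points raised in the reply, here is a minimal sketch, not code from the patch or from Mesa. The struct `state_flags`, the local `ALWAYS_INLINE` macro, and the helpers `state_is_dirty` and `clear_flags` are hypothetical stand-ins; the field widths are assumed from the `0` and `0ull` initializers in the diff. It shows a forced-inline helper in place of a hand-written macro, and a fixed-size memset that an optimizing compiler would typically expand into a few stores.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's flag struct: one 32-bit and one
 * 64-bit dirty mask (widths assumed, not taken from the Mesa headers). */
struct state_flags {
    uint32_t mesa;
    uint64_t brw;
};

/* Local definition for this sketch: force inlining on GCC/Clang and fall
 * back to a plain inline hint elsewhere. */
#if defined(__GNUC__)
#define ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE inline
#endif

/* Same test as the CHECK_BRW_STATE macro in the patch, written as a typed
 * function so the arguments are evaluated once and type-checked. */
static ALWAYS_INLINE int
state_is_dirty(const struct state_flags *a, const struct state_flags *b)
{
    return ((a->mesa & b->mesa) | (a->brw & b->brw)) != 0;
}

/* The size is a compile-time constant, so with optimization enabled the
 * compiler is free to replace this call with direct stores. Unlike
 * per-field assignment, it also zeroes any padding between the fields. */
static ALWAYS_INLINE void
clear_flags(struct state_flags *f)
{
    memset(f, 0, sizeof(*f));
}
```

Whether the compiler actually inlines the helper or expands the memset depends on the optimization level, which is why the question about a debug build matters: at -O0 neither transformation is normally applied.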

