[PATCH v4 1/9] drm/xe/xelpg: Move Wa_14016712196 to the invalidate path

Rodrigo Vivi rodrigo.vivi at intel.com
Mon Mar 31 18:54:43 UTC 2025


On Fri, Mar 28, 2025 at 04:35:28PM +0000, Tvrtko Ursulin wrote:
> According to i915, Wa_14016712196 needs to be emitted before a
> pipe control which contains a post sync operation.
> 
> Therefore move it from flush (no post sync) to invalidate (post sync).
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
> Fixes: 8c5fe7d88bc1 ("drm/xe: Add Wa_16021333562 and Wa_14016712196")
> Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
> Cc: Aradhya Bhatia <aradhya.bhatia at intel.com>
> Cc: Matt Roper <matthew.d.roper at intel.com>
> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> ---
> Please double check.

It looks like both options are possible.

We either insert a PIPE_CONTROL with "Depth Flush" post any state that
   will send an implicit depth flush and prior to any PIPE_CONTROL. This
   PIPE_CONTROL is not required if 3DPRIMITIVE, 3DMESH or PIPE_CONTROL that
   doesn't require end of the pipe drain is programmed prior to a PIPE_CONTROL
   that requires end of the pipe drain and hits this issue.

or

We insert this PIPE_CONTROL prior to the PIPE_CONTROL which will hit the
   issue. For timestamp, this could be a replicated PIPE_CONTROL as it will
   write the correct value the 2nd time. For post sync with write immediate,
   SW would have to allocate a dummy address.
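
As a rough illustration of the second option, which is also the shape the
patch gives emit_pipe_invalidate(), here is a standalone sketch. Everything
prefixed with sketch_/SKETCH_ is hypothetical: the flag values are
placeholders rather than the hardware encoding, the 4-dword command layout is
simplified, and needs_wa stands in for the XE_WA(gt, 14016712196) check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only, NOT the real encoding. */
#define SKETCH_PC_HEADER	0xC0DE0000u	/* stand-in command header */
#define SKETCH_PC_DEPTH_FLUSH	(1u << 0)
#define SKETCH_PC_POST_SYNC	(1u << 1)	/* any post sync operation */
#define SKETCH_PC_CS_STALL	(1u << 2)
#define SKETCH_PC_DWORDS	4		/* simplified command length */

/* Write one simplified pipe control at index i, return the next index. */
static int sketch_emit_pipe_control(uint32_t *dw, int i, int cap,
				    uint32_t flags, uint32_t addr)
{
	assert(i + SKETCH_PC_DWORDS <= cap);

	dw[i++] = SKETCH_PC_HEADER;
	dw[i++] = flags;
	dw[i++] = addr;
	dw[i++] = 0;

	return i;
}

/*
 * Emit a pipe control carrying a post sync operation. When the workaround
 * applies, a depth flush pipe control is emitted immediately before it.
 */
static int sketch_emit_post_sync(uint32_t *dw, int i, int cap, bool needs_wa,
				 uint32_t flags, uint32_t scratch_addr)
{
	if (needs_wa)
		i = sketch_emit_pipe_control(dw, i, cap,
					     SKETCH_PC_DEPTH_FLUSH,
					     scratch_addr);

	return sketch_emit_pipe_control(dw, i, cap,
					flags | SKETCH_PC_POST_SYNC,
					scratch_addr);
}
```

With needs_wa set, the buffer holds two commands back to back, the depth
flush first and the post sync one second; without it only the post sync
command is written.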

So, does changing the order actually help your case, or is it required for it?


> ---
>  drivers/gpu/drm/xe/xe_ring_ops.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
> index 917fc16de866..88591b7a7715 100644
> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> @@ -134,8 +134,9 @@ emit_pipe_control(u32 *dw, int i, u32 bit_group_0, u32 bit_group_1, u32 offset,
>  	return i;
>  }
>  
> -static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
> -				int i)
> +static int
> +emit_pipe_invalidate(struct xe_gt *gt, u32 mask_flags, bool invalidate_tlb,
> +		     u32 *dw, int i)
>  {
>  	u32 flags = PIPE_CONTROL_CS_STALL |
>  		PIPE_CONTROL_COMMAND_CACHE_INVALIDATE |
> @@ -152,6 +153,10 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>  
>  	flags &= ~mask_flags;
>  
> +	if (XE_WA(gt, 14016712196))
> +		i = emit_pipe_control(dw, i, 0, PIPE_CONTROL_DEPTH_CACHE_FLUSH,
> +				      LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
> +
>  	return emit_pipe_control(dw, i, 0, flags, LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
>  }
>  
> @@ -173,10 +178,6 @@ static int emit_render_cache_flush(struct xe_sched_job *job, u32 *dw, int i)
>  	bool lacks_render = !(gt->info.engine_mask & XE_HW_ENGINE_RCS_MASK);
>  	u32 flags;
>  
> -	if (XE_WA(gt, 14016712196))
> -		i = emit_pipe_control(dw, i, 0, PIPE_CONTROL_DEPTH_CACHE_FLUSH,
> -				      LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
> -
>  	flags = (PIPE_CONTROL_CS_STALL |
>  		 PIPE_CONTROL_TILE_CACHE_FLUSH |
>  		 PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH |
> @@ -361,7 +362,8 @@ static void __emit_job_gen12_render_compute(struct xe_sched_job *job,
>  		mask_flags = PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
>  	/* See __xe_pt_bind_vma() for a discussion on TLB invalidations. */
> -	i = emit_pipe_invalidate(mask_flags, job->ring_ops_flush_tlb, dw, i);
> +	i = emit_pipe_invalidate(gt, mask_flags, job->ring_ops_flush_tlb, dw,
> +				 i);
>  
>  	/* hsdes: 1809175790 */
>  	if (has_aux_ccs(xe))
> -- 
> 2.48.0
> 

