[PATCH v2] drm/xe/xe2: Use XE_CACHE_WB pat index

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Wed Jan 3 03:17:22 UTC 2024



> -----Original Message-----
> From: Roper, Matthew D <matthew.d.roper at intel.com>
> Sent: 03 January 2024 06:46
> To: Ghimiray, Himal Prasad <himal.prasad.ghimiray at intel.com>
> Cc: intel-xe at lists.freedesktop.org
> Subject: Re: [PATCH v2] drm/xe/xe2: Use XE_CACHE_WB pat index
> 
> On Tue, Jan 02, 2024 at 10:21:42PM +0530, Himal Prasad Ghimiray wrote:
> > Use XE_CACHE_WB pat index by default on xe2 platforms and use
> > XE_CACHE_NONE_COMPRESSION pat index only if buffer supports
> > compression and indirect access of flat ccs.
> >
> > Fixes the broken "xe_migrate_test kunit test" on LNL.
> 
> How/why does it fix it?  Explaining why this is the right thing to do is more
> important than just stating that a test changes from red to green.
> Does the WB vs UC even matter here or is it actually the coherency with the
> CPU cache that matters (meaning UC+1way would also work, but
> WB+NonCoh would fail)?

AFAIU, coherency with the CPU cache is the main thing that matters here; it isn't really about
WB vs UC. The PAT table entry associated with XE_CACHE_WB is coherent, whereas the one for
XE_CACHE_NONE is non-coherent. UC+1way would also work, but WB+NonCoh would fail; I verified this by testing with PAT indices 0 and 5 respectively.
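
To make the distinction concrete, here is a tiny userspace model. The table values below are
hypothetical, not the real xe2 entries from xe_pat.c (the index 0/5 attributes just follow the
"respectively" reading above); the point it models is that each PAT index carries both a memory
type (WB/UC) and a coherency mode, and only the coherency mode decides whether the migrate
path works on system memory:

/*
 * Hypothetical sketch, not the real xe2 PAT table from xe_pat.c.
 * Each PAT index carries a memory type (WB/UC) *and* a coherency
 * mode; the coherency mode is what matters for system-memory PTEs.
 */
#include <stdbool.h>
#include <stdio.h>

enum mem_type { MT_UC, MT_WB };
enum coh_mode { COH_NONE, COH_1WAY, COH_2WAY };

struct pat_entry {
	enum mem_type mt;
	enum coh_mode coh;
};

static const struct pat_entry pat_table[] = {
	[0] = { MT_UC, COH_1WAY },	/* UC + 1-way coherent: works */
	[5] = { MT_WB, COH_NONE },	/* WB + non-coherent: fails */
};

/* PTEs emit_pte() writes for system memory need a CPU-coherent entry. */
static bool pat_ok_for_sysmem(unsigned int idx)
{
	return pat_table[idx].coh >= COH_1WAY;
}

int main(void)
{
	printf("idx 0 (UC+1way):   %s\n", pat_ok_for_sysmem(0) ? "ok" : "broken");
	printf("idx 5 (WB+noncoh): %s\n", pat_ok_for_sysmem(5) ? "ok" : "broken");
	return 0;
}

In a model like this, switching emit_pte() from XE_CACHE_NONE to XE_CACHE_WB only helps because
the WB entry the driver picks happens to be a coherent one; the WB-ness itself is incidental.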

BR
Himal

> 
> 
> Matt
> 
> >
> > v2
> > - Fix the argument to emit_pte, pass the bool directly. (Thomas)
> >
> > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_migrate.c | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> > index adf1dab5eba2..e1a849d74826 100644
> > --- a/drivers/gpu/drm/xe/xe_migrate.c
> > +++ b/drivers/gpu/drm/xe/xe_migrate.c
> > @@ -439,7 +439,7 @@ static void emit_pte(struct xe_migrate *m,
> >  	/* Indirect access needs compression enabled uncached PAT index */
> >  	if (GRAPHICS_VERx100(xe) >= 2000)
> >  		pat_index = is_comp_pte ? xe->pat.idx[XE_CACHE_NONE_COMPRESSION] :
> > -					  xe->pat.idx[XE_CACHE_NONE];
> > +					  xe->pat.idx[XE_CACHE_WB];
> >  	else
> >  		pat_index = xe->pat.idx[XE_CACHE_WB];
> >
> > @@ -729,14 +729,14 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
> >  		}
> >
> >  		if (!src_is_vram)
> > -			emit_pte(m, bb, src_L0_pt, src_is_vram, true, &src_it, src_L0,
> > -				 src_bo);
> > +			emit_pte(m, bb, src_L0_pt, src_is_vram, copy_system_ccs,
> > +				 &src_it, src_L0, src_bo);
> >  		else
> >  			xe_res_next(&src_it, src_L0);
> >
> >  		if (!dst_is_vram)
> > -			emit_pte(m, bb, dst_L0_pt, dst_is_vram, true, &dst_it, src_L0,
> > -				 dst_bo);
> > +			emit_pte(m, bb, dst_L0_pt, dst_is_vram, copy_system_ccs,
> > +				 &dst_it, src_L0, dst_bo);
> >  		else
> >  			xe_res_next(&dst_it, src_L0);
> >
> > @@ -978,8 +978,8 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
> >  		size -= clear_L0;
> >  		/* Preemption is enabled again by the ring ops. */
> >  		if (!clear_vram) {
> > -			emit_pte(m, bb, clear_L0_pt, clear_vram, true, &src_it, clear_L0,
> > -				 bo);
> > +			emit_pte(m, bb, clear_L0_pt, clear_vram, clear_system_ccs,
> > +				 &src_it, clear_L0, bo);
> >  		} else {
> >  			xe_res_next(&src_it, clear_L0);
> >  		}
> > --
> > 2.25.1
> >
> 
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

