[PATCH] drm/xe: Use ttm_uncached for BO with NEEDS_UC flag

Tue Jun 18 12:38:01 UTC 2024

Hi,

On Mon, 2024-06-17 at 13:28 -0700, Matt Roper wrote:
> On Wed, Jun 12, 2024 at 08:03:24PM +0200, Michal Wajdeczko wrote:
> > Hi Thomas,
> > 
> > On 11.06.2024 14:47, Thomas Hellström wrote:
> > > Hi, Michal,
> > > 
> > > On Thu, 2024-06-06 at 21:56 +0200, Michal Wajdeczko wrote:
> > > > We should honor requested uncached mode also at the TTM layer.
> > > > Otherwise, we risk losing updates to the memory based
> > > > interrupts
> > > > source or status vectors, as those require uncached memory.
> > > > 
> > > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> > > > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > > > Cc: Matt Roper <matthew.d.roper at intel.com>
> > > > ---
> > > >  drivers/gpu/drm/xe/xe_bo.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > > > b/drivers/gpu/drm/xe/xe_bo.c
> > > > index 2bae01ce4e5b..2573cc118f29 100644
> > > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > > @@ -378,6 +378,9 @@ static struct ttm_tt
> > > > *xe_ttm_tt_create(struct
> > > > ttm_buffer_object *ttm_bo,
> > > >  	    (xe->info.graphics_verx100 >= 1270 && bo->flags &
> > > > XE_BO_FLAG_PAGETABLE))
> > > >  		caching = ttm_write_combined;
> > > >  
> > > > +	if (bo->flags & XE_BO_FLAG_NEEDS_UC)
> > > > +		caching = ttm_uncached;
> > > > +
> > > >  	err = ttm_tt_init(&tt->ttm, &bo->ttm, page_flags,
> > > > caching,
> > > > extra_pages);
> > > >  	if (err) {
> > > >  		kfree(tt);
> > > 
> > > To me the preferred method is to teach bo->cpu_caching about the
> > > uncached mode and then include it in the switch statement above.
> > > 
> > 
> > but bo->cpu_caching is currently documented as:
> > 
> > /**
> >  * @cpu_caching: CPU caching mode. Currently only used for
> > userspace
> >  * objects.
> >  */
> > 
> > and value 0 is implicitly reserved as kind of default, so
> > 'teaching'
> > would likely mean either extending uapi with something like:
> > 
> >   #define DRM_XE_GEM_CPU_CACHING_WB                      1
> >   #define DRM_XE_GEM_CPU_CACHING_WC                      2
> > + #define DRM_XE_GEM_CPU_CACHING_UC                      3
> > 
> > which will introduce lot of undesired right now code changes, or we
> > will
> > introduce internal only flag:
> > 
> > + #define XE_CPU_CACHING_UC                      ((u16)~0)
> > 
> > but that doesn't look like a clean solution.
> > 
> > 
> > OTOH, just above this new diff chunk, there is already a code that
> > updates caching mode outside the "switch statement above":
> > 
> > 	if ((!bo->cpu_caching && bo->flags & XE_BO_FLAG_SCANOUT)
> > ||
> > 	    (xe->info.graphics_verx100 >= 1270 &&
> > 	     bo->flags & XE_BO_FLAG_PAGETABLE))
> > 		caching = ttm_write_combined;
> > 
> > so maybe as a short term solution we can keep this patch as it's
> > doing
> > similar last resort stuff and return to 'preferred way' later:
> > 
> > 	if (!bo->cpu_caching && bo->flags & XE_BO_FLAG_NEEDS_UC)
> > 		caching = ttm_uncached;
> 
> Yeah, cpu_caching is a "uapi only" thing at the moment (and even then
> is
> only set in some situations).  Given the current design and
> assumptions
> of the code, maybe it would be more clear to add an assertion like
> this
> to help document why this is special?
> 
>         if (bo->flags & XE_BO_FLAG_NEEDS_UC) {
>                 /*
>                  * Valid only for internally-created buffers only,
> for
>                  * which cpu_caching is never initialized.
>                  */
>                 xe_assert(xe, bo->cpu_caching == 0);
>                 caching = ttm_uncached;
>         }
> 
> If we decide we want a more general redesign of cpu_caching behavior,
> that would probably be a separate change from the direct functional
> fix
> here.

I do think the change should have actually been done before the scanout
caching hack. We shouldn't be building special cases like this, but
rather fix what's missing.

Can't we make bo->cpu_caching valid also for kernel bos with a new enum
and do the translation in the ioctl?

/Thomas

> 
> 
> Matt
> 
> > 
> > Michal
>