[PATCH] drm/xe: Use ttm_uncached for BO with NEEDS_UC flag

Mon Jun 17 20:28:38 UTC 2024

On Wed, Jun 12, 2024 at 08:03:24PM +0200, Michal Wajdeczko wrote:
> Hi Thomas,
> 
> On 11.06.2024 14:47, Thomas Hellström wrote:
> > Hi, Michal,
> > 
> > On Thu, 2024-06-06 at 21:56 +0200, Michal Wajdeczko wrote:
> >> We should honor requested uncached mode also at the TTM layer.
> >> Otherwise, we risk losing updates to the memory based interrupts
> >> source or status vectors, as those require uncached memory.
> >>
> >> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> >> Cc: Matt Roper <matthew.d.roper at intel.com>
> >> ---
> >>  drivers/gpu/drm/xe/xe_bo.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> >> index 2bae01ce4e5b..2573cc118f29 100644
> >> --- a/drivers/gpu/drm/xe/xe_bo.c
> >> +++ b/drivers/gpu/drm/xe/xe_bo.c
> >> @@ -378,6 +378,9 @@ static struct ttm_tt *xe_ttm_tt_create(struct
> >> ttm_buffer_object *ttm_bo,
> >>  	    (xe->info.graphics_verx100 >= 1270 && bo->flags &
> >> XE_BO_FLAG_PAGETABLE))
> >>  		caching = ttm_write_combined;
> >>  
> >> +	if (bo->flags & XE_BO_FLAG_NEEDS_UC)
> >> +		caching = ttm_uncached;
> >> +
> >>  	err = ttm_tt_init(&tt->ttm, &bo->ttm, page_flags, caching,
> >> extra_pages);
> >>  	if (err) {
> >>  		kfree(tt);
> > 
> > To me the preferred method is to teach bo->cpu_caching about the
> > uncached mode and then include it in the switch statement above.
> > 
> 
> but bo->cpu_caching is currently documented as:
> 
> /**
>  * @cpu_caching: CPU caching mode. Currently only used for userspace
>  * objects.
>  */
> 
> and value 0 is implicitly reserved as kind of default, so 'teaching'
> would likely mean either extending uapi with something like:
> 
>   #define DRM_XE_GEM_CPU_CACHING_WB                      1
>   #define DRM_XE_GEM_CPU_CACHING_WC                      2
> + #define DRM_XE_GEM_CPU_CACHING_UC                      3
> 
> which will introduce lot of undesired right now code changes, or we will
> introduce internal only flag:
> 
> + #define XE_CPU_CACHING_UC                      ((u16)~0)
> 
> but that doesn't look like a clean solution.
> 
> 
> OTOH, just above this new diff chunk, there is already a code that
> updates caching mode outside the "switch statement above":
> 
> 	if ((!bo->cpu_caching && bo->flags & XE_BO_FLAG_SCANOUT) ||
> 	    (xe->info.graphics_verx100 >= 1270 &&
> 	     bo->flags & XE_BO_FLAG_PAGETABLE))
> 		caching = ttm_write_combined;
> 
> so maybe as a short term solution we can keep this patch as it's doing
> similar last resort stuff and return to 'preferred way' later:
> 
> 	if (!bo->cpu_caching && bo->flags & XE_BO_FLAG_NEEDS_UC)
> 		caching = ttm_uncached;

Yeah, cpu_caching is a "uapi only" thing at the moment (and even then is
only set in some situations).  Given the current design and assumptions
of the code, maybe it would be more clear to add an assertion like this
to help document why this is special?

        if (bo->flags & XE_BO_FLAG_NEEDS_UC) {
                /*
                 * Valid only for internally-created buffers only, for
                 * which cpu_caching is never initialized.
                 */
                xe_assert(xe, bo->cpu_caching == 0);
                caching = ttm_uncached;
        }

If we decide we want a more general redesign of cpu_caching behavior,
that would probably be a separate change from the direct functional fix
here.

Matt

> 
> Michal

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation