[PATCH] drm/xe: Use ttm_uncached for BO with NEEDS_UC flag
Thomas Hellström
thomas.hellstrom at linux.intel.com
Tue Jun 18 18:29:24 UTC 2024
Hi, Matt
On Tue, 2024-06-18 at 09:43 -0700, Matt Roper wrote:
> On Tue, Jun 18, 2024 at 02:38:01PM +0200, Thomas Hellström wrote:
> > Hi,
> >
> > On Mon, 2024-06-17 at 13:28 -0700, Matt Roper wrote:
> > > On Wed, Jun 12, 2024 at 08:03:24PM +0200, Michal Wajdeczko wrote:
> > > > Hi Thomas,
> > > >
> > > > On 11.06.2024 14:47, Thomas Hellström wrote:
> > > > > Hi, Michal,
> > > > >
> > > > > On Thu, 2024-06-06 at 21:56 +0200, Michal Wajdeczko wrote:
> > > > > > We should honor requested uncached mode also at the TTM
> > > > > > layer.
> > > > > > Otherwise, we risk losing updates to the memory based
> > > > > > interrupts
> > > > > > source or status vectors, as those require uncached memory.
> > > > > >
> > > > > > Signed-off-by: Michal Wajdeczko
> > > > > > <michal.wajdeczko at intel.com>
> > > > > > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > > > > > Cc: Matt Roper <matthew.d.roper at intel.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/xe/xe_bo.c | 3 +++
> > > > > > 1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c
> > > > > > b/drivers/gpu/drm/xe/xe_bo.c
> > > > > > index 2bae01ce4e5b..2573cc118f29 100644
> > > > > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > > > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > > > > @@ -378,6 +378,9 @@ static struct ttm_tt
> > > > > > *xe_ttm_tt_create(struct
> > > > > > ttm_buffer_object *ttm_bo,
> > > > > > (xe->info.graphics_verx100 >= 1270 && bo-
> > > > > > >flags &
> > > > > > XE_BO_FLAG_PAGETABLE))
> > > > > > caching = ttm_write_combined;
> > > > > >
> > > > > > + if (bo->flags & XE_BO_FLAG_NEEDS_UC)
> > > > > > + caching = ttm_uncached;
> > > > > > +
> > > > > > err = ttm_tt_init(&tt->ttm, &bo->ttm, page_flags,
> > > > > > caching,
> > > > > > extra_pages);
> > > > > > if (err) {
> > > > > > kfree(tt);
> > > > >
> > > > > To me the preferred method is to teach bo->cpu_caching about
> > > > > the
> > > > > uncached mode and then include it in the switch statement
> > > > > above.
> > > > >
> > > >
> > > > but bo->cpu_caching is currently documented as:
> > > >
> > > > /**
> > > > * @cpu_caching: CPU caching mode. Currently only used for
> > > > userspace
> > > > * objects.
> > > > */
> > > >
> > > > and value 0 is implicitly reserved as kind of default, so
> > > > 'teaching'
> > > > would likely mean either extending uapi with something like:
> > > >
> > > > #define DRM_XE_GEM_CPU_CACHING_WB 1
> > > > #define DRM_XE_GEM_CPU_CACHING_WC 2
> > > > + #define DRM_XE_GEM_CPU_CACHING_UC 3
> > > >
> > > > which will introduce lot of undesired right now code changes,
> > > > or we
> > > > will
> > > > introduce internal only flag:
> > > >
> > > > + #define XE_CPU_CACHING_UC ((u16)~0)
> > > >
> > > > but that doesn't look like a clean solution.
> > > >
> > > >
> > > > OTOH, just above this new diff chunk, there is already a code
> > > > that
> > > > updates caching mode outside the "switch statement above":
> > > >
> > > > if ((!bo->cpu_caching && bo->flags &
> > > > XE_BO_FLAG_SCANOUT)
> > > > > >
> > > > (xe->info.graphics_verx100 >= 1270 &&
> > > > bo->flags & XE_BO_FLAG_PAGETABLE))
> > > > caching = ttm_write_combined;
> > > >
> > > > so maybe as a short term solution we can keep this patch as
> > > > it's
> > > > doing
> > > > similar last resort stuff and return to 'preferred way' later:
> > > >
> > > > if (!bo->cpu_caching && bo->flags &
> > > > XE_BO_FLAG_NEEDS_UC)
> > > > caching = ttm_uncached;
> > >
> > > Yeah, cpu_caching is a "uapi only" thing at the moment (and even
> > > then
> > > is
> > > only set in some situations). Given the current design and
> > > assumptions
> > > of the code, maybe it would be more clear to add an assertion
> > > like
> > > this
> > > to help document why this is special?
> > >
> > > if (bo->flags & XE_BO_FLAG_NEEDS_UC) {
> > > /*
> > > * Valid only for internally-created buffers
> > > only,
> > > for
> > > * which cpu_caching is never initialized.
> > > */
> > > xe_assert(xe, bo->cpu_caching == 0);
> > > caching = ttm_uncached;
> > > }
> > >
> > > If we decide we want a more general redesign of cpu_caching
> > > behavior,
> > > that would probably be a separate change from the direct
> > > functional
> > > fix
> > > here.
> >
> > I do think the change should have actually been done before the
> > scanout
> > caching hack. We shouldn't be building special cases like this, but
> > rather fix what's missing.
>
> I think things happened the other way around. The scanout caching
> adjustment pre-dates the existence of bo->cpu_caching in the driver.
> When bo->cpu_caching got added in commit 622f709ca629 ("drm/xe/uapi:
> Add
> support for CPU caching mode"), it intentionally left kernel objects
> set
> to 0 by design.
I don't really see a discussion around the kernel objects in the commit
message, it mentions "currently" in the caching mode doc, but that
sounds more like a documentation of current status than a guideline.
Ofc, it might be in the review discussion but I haven't looked too
closely TBH.
Hmm. This commit raises some questions. Do you remember the reason for
leaving out the kernel bos? Was it because kernel bos relies on
implicit caching mode selection whereas the user-space bo caching mode
selection is now explicit? Otherwise it's pretty standard in the driver
to map the DRM_XE user-space flags / enums to XE_ internal ones.
(As a complete side note it looks like the system pages for VRAM-only
user-bo eviction are now forced to be write-combined by the uAPI?)
>
> We can certainly change the design now if everyone agrees that it
> makes
> the code cleaner,
This is probably a bigger change than I originally though and requires
some additional consideration of the above.
> but I think that the general refactoring and
> repurposing of bo->cpu_caching is orthogonal to the functional fix
> that
> Michal is providing here.
Generally I think that if something is missing to be able to cleanly
add a fix, then we should definitely try our best to add that something
_first_. In this case it turns out, though, that it requires some
additional afterthought.
So for the patch
Acked-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>
> +Cc Pallavi, Matt Auld as the authors of the original design in case
> they have any thoughts on extending the usage of bo->cpu_caching.
>
>
> Matt
>
> >
> > Can't we make bo->cpu_caching valid also for kernel bos with a new
> > enum
> > and do the translation in the ioctl?
> >
> > /Thomas
> >
> >
> > >
> > >
> > > Matt
> > >
> > > >
> > > > Michal
> > >
> >
>
More information about the Intel-xe
mailing list