[PATCH] drm/xe/bo: optimise CCS case for WB pages
Matthew Brost
matthew.brost at intel.com
Tue May 27 22:32:51 UTC 2025
On Mon, May 19, 2025 at 10:58:42AM +0100, Matthew Auld wrote:
> On 16/05/2025 20:12, Matt Roper wrote:
> > On Fri, May 16, 2025 at 04:38:11PM +0100, Matthew Auld wrote:
> > > Dealing with CCS state is significant on LNL+, where we end up clearing
> > > the compression state on every page alloc using the blitter for user
> > > buffers, including also saving and restoring it when moving between
> > > domains, plus we need to alloc extra pages to hold the raw CCS state for
> > > the save step.
> > >
> > > However all compression PAT modes, on platforms like LNL, also require
> > > coh_none, meaning that only WC memory can use compression in the first
> >
> > On PTL/Xe3 there's a new PAT entry 16 that has CCS compression + 1-way
> > coherency (according to bspec page 71582). It looks like we don't have
> > that in the driver yet today, but we probably need to add it since
> > userspace is expected to be able to use it.
>
> Right, for that we are also missing userptr handling, if we can't just
> reject it (it will also currently throw a build error IIRC), and also need
> to figure out what to do with external imported dma-buf, I assume we just
> reject at bind time? Using compression with external dma-buf I assume is not
> going to work.
>
> Jose had the idea of maybe adding a bo_create ioctl flag to opt out of using
> compression, which we could use as a hint for when to apply this optimisation
> more generally. Also would benefit SRIOV VF case where we can skip needing
> to deal with potential CCS state on PTL.
>
This seems worthwhile to do.
Matt
> >
> >
> > Matt
> >
> > > place. With this we can be sneaky and completely ignore CCS for WB
> > > buffers, which is likely the common case anyway. This would then skip
> > > all blitter moves/clears between sys <-> tt and then also means we can
> > > drop the extra CCS pages.
> > >
> > > This should be safe since there is no way to interact with the
> > > compression state (potentially uncleared) without using a compression-enabled
> > > PAT index (which is rejected at bind), including if trying to be malicious
> > > and copy the raw CCS state from userspace, which should give back all
> > > zeroes if the src surface (indirect) is lacking a compressed PAT index.
> > >
> > > Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> > > Cc: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> > > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_bo.c | 8 ++++++++
> > > drivers/gpu/drm/xe/xe_pat.c | 3 ++-
> > > 2 files changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > index d99d91fe8aa9..3fafdcb8d95b 100644
> > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > @@ -2982,6 +2982,14 @@ bool xe_bo_needs_ccs_pages(struct xe_bo *bo)
> > >  	if (IS_DGFX(xe) && (bo->flags & XE_BO_FLAG_SYSTEM))
> > >  		return false;
> > > +	/*
> > > +	 * Compression implies coh_none, therefore we know for sure that WB
> > > +	 * memory can't currently use compression, which is likely one of the
> > > +	 * common cases.
> > > +	 */
> > > +	if (bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB)
> > > +		return false;
> > > +
> > >  	return true;
> > >  }
> > > diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
> > > index 30fdbdb9341e..38a6a49c1b2a 100644
> > > --- a/drivers/gpu/drm/xe/xe_pat.c
> > > +++ b/drivers/gpu/drm/xe/xe_pat.c
> > > @@ -103,7 +103,8 @@ static const struct xe_pat_table_entry xelpg_pat_table[] = {
> > >   *
> > >   * Note: There is an implicit assumption in the driver that compression and
> > >   * coh_1way+ are mutually exclusive. If this is ever not true then userptr
> > > - * and imported dma-buf from external device will have uncleared ccs state.
> > > + * and imported dma-buf from external device will have uncleared ccs state. See
> > > + * also xe_bo_needs_ccs_pages().
> > >   */
> > >  #define XE2_PAT(no_promote, comp_en, l3clos, l3_policy, l4_policy, __coh_mode) \
> > >  	{ \
> > > --
> > > 2.49.0
> > >
> >
>
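
For anyone following along, the decision the quoted xe_bo.c hunk makes can be
modelled in isolation roughly as below. This is only a sketch: the sketch_*
types and names are hypothetical stand-ins and not the real xe structures, and
it covers only the two checks visible in the diff context, not whatever
xe_bo_needs_ccs_pages() tests earlier in the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real xe driver types. */
enum sketch_cpu_caching {
	SKETCH_CPU_CACHING_WB,	/* write-back: cannot be compressed, per the patch */
	SKETCH_CPU_CACHING_WC,	/* write-combined: may use a compression PAT index */
};

struct sketch_bo {
	bool is_dgfx;		/* models IS_DGFX(xe) */
	bool system_placement;	/* models bo->flags & XE_BO_FLAG_SYSTEM */
	enum sketch_cpu_caching cpu_caching;	/* models bo->cpu_caching */
};

/* Returns true when extra pages for raw CCS state would still be needed. */
static bool sketch_bo_needs_ccs_pages(const struct sketch_bo *bo)
{
	/* Existing check shown as context in the diff. */
	if (bo->is_dgfx && bo->system_placement)
		return false;

	/*
	 * Check added by the patch: compression implies coh_none, so a WB
	 * buffer can never be bound with a compression-enabled PAT index.
	 */
	if (bo->cpu_caching == SKETCH_CPU_CACHING_WB)
		return false;

	return true;
}
```

With this model, a WB buffer on an integrated part skips the CCS pages while a
WC buffer on the same part still gets them. If PAT entry 16 (compression +
1-way coherency) is added for PTL/Xe3, the WB shortcut would presumably have to
be revisited, as discussed above.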