[PATCH] drm/xe/bo: optimise CCS case for WB pages
Matt Roper
matthew.d.roper at intel.com
Fri May 16 19:12:03 UTC 2025
On Fri, May 16, 2025 at 04:38:11PM +0100, Matthew Auld wrote:
> Dealing with CCS state is significant on LNL+, where we end up clearing
> the compression state on every page alloc using the blitter for user
> buffers, including also saving and restoring it when moving between
> domains, plus we need to alloc extra pages to hold the raw CCS state for
> the save step.
>
> However all compression PAT modes, on platforms like LNL, also require
> coh_none, meaning that only WC memory can use compression in the first
On PTL/Xe3 there's a new PAT entry 16 that has CCS compression + 1-way
coherency (according to bspec page 71582). It looks like we don't have
that in the driver yet, but we probably need to add it since
userspace is expected to be able to use it.
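To make the concern concrete, here is a rough standalone sketch of how the
WB shortcut could be gated on whether the platform's PAT table contains any
entry that combines compression with coherency. This is not driver code:
every type, function and table below is hypothetical and only models the
decision, and the coherency encoding is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real xe driver types. */
enum cpu_caching { CACHING_WC, CACHING_WB };

struct pat_entry {
	bool comp_en;	/* compression enabled for this PAT index */
	int coh_mode;	/* illustrative encoding: 0 = none, 1 = 1-way, 2 = 2-way */
};

struct platform {
	const struct pat_entry *pat;
	size_t n_pat;
};

/* True if any PAT entry combines compression with a coherent mode. */
static bool has_coherent_compression(const struct platform *p)
{
	for (size_t i = 0; i < p->n_pat; i++)
		if (p->pat[i].comp_en && p->pat[i].coh_mode != 0)
			return true;
	return false;
}

/*
 * Model of the proposed shortcut: WB buffers can skip CCS handling only
 * while compression still implies coh_none on this platform.
 */
static bool needs_ccs_pages(const struct platform *p, enum cpu_caching caching)
{
	if (caching == CACHING_WB && !has_coherent_compression(p))
		return false;
	return true;
}

/* Made-up table where compression always implies coh_none. */
static const struct pat_entry table_a[] = {
	{ false, 2 }, { true, 0 },
};
/* Made-up table with one compressed + 1-way coherent entry. */
static const struct pat_entry table_b[] = {
	{ false, 2 }, { true, 0 }, { true, 1 },
};

static const struct platform example_without_coh_comp = {
	table_a, sizeof(table_a) / sizeof(table_a[0]),
};
static const struct platform example_with_coh_comp = {
	table_b, sizeof(table_b) / sizeof(table_b[0]),
};
```

With the first table a WB buffer skips the CCS pages; with the second it does
not, since a WB buffer could then be bound with a compressed PAT index.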
Matt
> place. With this we can be sneaky and completely ignore CCS for WB
> buffers, which is likely the common case anyway. This would then skip
> all blitter moves/clears between sys <-> tt and then also means we can
> drop the extra CCS pages.
>
> This should be safe since there is no way to interact with the
> compression state (potentially uncleared) without using a compression
> enabled PAT index (which is rejected at bind), including if trying to be
> malicious and copy the raw CCS state from userspace, which should give
> back all zeroes if the src surface (indirect) lacks a compressed PAT index.
>
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Satyanarayana K V P <satyanarayana.k.v.p at intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> ---
> drivers/gpu/drm/xe/xe_bo.c | 8 ++++++++
> drivers/gpu/drm/xe/xe_pat.c | 3 ++-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index d99d91fe8aa9..3fafdcb8d95b 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -2982,6 +2982,14 @@ bool xe_bo_needs_ccs_pages(struct xe_bo *bo)
>  	if (IS_DGFX(xe) && (bo->flags & XE_BO_FLAG_SYSTEM))
>  		return false;
>  
> +	/*
> +	 * Compression implies coh_none, therefore we know for sure that WB
> +	 * memory can't currently use compression, which is likely one of the
> +	 * common cases.
> +	 */
> +	if (bo->cpu_caching == DRM_XE_GEM_CPU_CACHING_WB)
> +		return false;
> +
>  	return true;
>  }
>
> diff --git a/drivers/gpu/drm/xe/xe_pat.c b/drivers/gpu/drm/xe/xe_pat.c
> index 30fdbdb9341e..38a6a49c1b2a 100644
> --- a/drivers/gpu/drm/xe/xe_pat.c
> +++ b/drivers/gpu/drm/xe/xe_pat.c
> @@ -103,7 +103,8 @@ static const struct xe_pat_table_entry xelpg_pat_table[] = {
>   *
>   * Note: There is an implicit assumption in the driver that compression and
>   * coh_1way+ are mutually exclusive. If this is ever not true then userptr
> - * and imported dma-buf from external device will have uncleared ccs state.
> + * and imported dma-buf from external device will have uncleared ccs state. See
> + * also xe_bo_needs_ccs_pages().
>   */
>  #define XE2_PAT(no_promote, comp_en, l3clos, l3_policy, l4_policy, __coh_mode) \
>  	{ \
> --
> 2.49.0
>
--
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation