[PATCH 1/2] drm/xe: Skip CCS clear for WB type BOs
Thomas Hellström
thomas.hellstrom at linux.intel.com
Wed Aug 28 08:23:09 UTC 2024
Hi,
On Tue, 2024-08-27 at 17:49 +0200, Nirmoy Das wrote:
> HW treats any access to 1-way or 2-way coherent memory as compression
> disabled memory. So for such BOs there is no need to do CCS clearing.
>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> ---
> drivers/gpu/drm/xe/xe_bo.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index cbe7bf098970..24701272e3af 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -283,6 +283,7 @@ struct xe_ttm_tt {
> struct device *dev;
> struct sg_table sgt;
> struct sg_table *sg;
> + bool skip_ccs_clear:1;
> };
>
> static int xe_tt_map_sg(struct ttm_tt *tt)
> @@ -404,6 +405,8 @@ static struct ttm_tt *xe_ttm_tt_create(struct
> ttm_buffer_object *ttm_bo,
> if (ttm_bo->type == ttm_bo_type_device && xe-
> >mem.gpu_page_clear_sys)
> page_flags |= TTM_TT_FLAG_CLEARED_ON_FREE;
>
> + /* compression is not allowed for cached BO so ccs clear can
> be skipped. */
> + tt->skip_ccs_clear = caching == ttm_cached;
In theory, BOs that are promoted to fb (not created with the SCANOUT
flag) can AFAICT keep their caching at ttm_cached, yet still be sent
to the display engine, which would then read uninitialized CCS.
Also I think LNL will be the only HW having the "feature" that clean
cache-lines are written back so in the future we might allow 0-coherent
with ttm_cached.
So IMO we need to improve the detection of "skip_ccs_clear" here.
Otherwise, I'm all for the optimization.
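As a rough illustration only of the two concerns above (all names below
are hypothetical stand-ins, not existing xe or TTM code, and how the
driver would learn that a BO may later be promoted to fb is left open),
a stricter check might look something like:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TTM caching modes; not the real enum. */
enum sketch_caching {
	SKETCH_UNCACHED,
	SKETCH_WRITE_COMBINED,
	SKETCH_CACHED,
};

/* Hypothetical summary of what the check would need to know. */
struct sketch_bo_info {
	enum sketch_caching caching;
	/* Assumption: the driver can tell the BO may reach the display engine. */
	bool may_scanout;
	/* Assumption: platform treats cached (coherent) access as uncompressed. */
	bool hw_cached_is_uncompressed;
};

/*
 * Skip the CCS clear only when all three conditions hold:
 * the BO is cached, it can never be scanned out, and the platform
 * guarantees that cached access implies compression is disabled.
 */
static bool sketch_can_skip_ccs_clear(const struct sketch_bo_info *info)
{
	if (info->caching != SKETCH_CACHED)
		return false;
	if (info->may_scanout)
		return false;
	return info->hw_cached_is_uncompressed;
}
```

With something along these lines, a cached BO that might be scanned out
still gets its CCS cleared, and the skip is tied to a per-platform
property rather than to ttm_cached alone.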
/Thomas
> err = ttm_tt_init(&tt->ttm, &bo->ttm, page_flags, caching,
> extra_pages);
> if (err) {
> kfree(tt);
> @@ -664,13 +667,16 @@ static int xe_bo_move(struct ttm_buffer_object
> *ttm_bo, bool evict,
> struct ttm_resource *old_mem = ttm_bo->resource;
> u32 old_mem_type = old_mem ? old_mem->mem_type :
> XE_PL_SYSTEM;
> struct ttm_tt *ttm = ttm_bo->ttm;
> + struct xe_ttm_tt *xe_tt = container_of(ttm_bo->ttm, struct
> xe_ttm_tt,
> + ttm);
> struct xe_migrate *migrate = NULL;
> struct dma_fence *fence;
> bool move_lacks_source;
> bool tt_has_data;
> bool needs_clear;
> bool handle_system_ccs = (!IS_DGFX(xe) &&
> xe_bo_needs_ccs_pages(bo) &&
> - ttm && ttm_tt_is_populated(ttm)) ?
> true : false;
> + ttm && ttm_tt_is_populated(ttm) &&
> + !xe_tt->skip_ccs_clear) ? true :
> false;
> bool clear_system_pages;
> int ret = 0;
>