[PATCH] drm/xe/bmg: improve cache flushing behaviour
Matthew Auld
matthew.auld at intel.com
Tue Sep 3 11:08:28 UTC 2024
On 03/09/2024 11:53, Nirmoy Das wrote:
>
> On 9/2/2024 5:37 PM, Matthew Auld wrote:
>> The BSpec seems to suggest that EN_L3_RW_CCS_CACHE_FLUSH must be toggled
>> on for manual global invalidation to take effect
>
> I couldn't find this reference, in which bspec is this mentioned ?
BSpec: 71718
For discrete under global invalidation it says: "Device Cache flushed if
SCRATCH1LPFC bit[0] is set". Which is why I originally added this,
however this then turns flushing on for all kinds of stuff like
pipecontrol which is not what we want. But from playing around with this
a bunch on BMG that doesn't look to be true. Also the original WA made
no mention of needing to mess with SCRATCH1LPFC.
I'm kind of hoping this helps that compute benchmark with not nuking
entire device cache between submissions.
>
>
> Regards,
>
> Nirmoy
>
>> and actually flush
>> device cache, however this also turns on flushing for things like
>> pipecontrol, which occurs between submissions for compute/render. This
>> sounds like massive overkill for our needs, where we already have the
>> manual flushing on the display side with the global invalidation. Some
>> observations on BMG:
>>
>> 1. Disabling l2 caching for host writes and stubbing out the driver
>> global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has
>> no impact on wb-transient-vs-display IGT, which makes sense since the
>> pipecontrol is now flushing the device cache after the render copy.
>> Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also
>> expected since device cache is now dirty and display engine can't see
>> the writes.
>>
>> 2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global
>> invalidation also has no impact on wb-transient-vs-display. This
>> suggests that the global invalidation still works as expected and is
>> flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on.
>>
>> With that drop EN_L3_RW_CCS_CACHE_FLUSH.
>>
>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>> Cc: Matt Roper <matthew.d.roper at intel.com>
>> Cc: Nirmoy Das <nirmoy.das at intel.com>
>> ---
>> drivers/gpu/drm/xe/regs/xe_gt_regs.h | 3 ---
>> drivers/gpu/drm/xe/xe_gt.c | 1 -
>> 2 files changed, 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> index 0d1a4a9f4e11..88a01970cc5c 100644
>> --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
>> @@ -387,9 +387,6 @@
>> #define XE2_GLOBAL_INVAL XE_REG(0xb404)
>> -#define SCRATCH1LPFC XE_REG(0xb474)
>> -#define EN_L3_RW_CCS_CACHE_FLUSH REG_BIT(0)
>> -
>> #define XE2LPM_L3SQCREG5 XE_REG_MCR(0xb658)
>> #define XE2_TDF_CTRL XE_REG(0xb418)
>> diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
>> index f82b3e8ac5c8..313cc4242281 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.c
>> +++ b/drivers/gpu/drm/xe/xe_gt.c
>> @@ -110,7 +110,6 @@ static void xe_gt_enable_host_l2_vram(struct xe_gt
>> *gt)
>> return;
>> if (!xe_gt_is_media_type(gt)) {
>> - xe_mmio_write32(gt, SCRATCH1LPFC, EN_L3_RW_CCS_CACHE_FLUSH);
>> reg = xe_gt_mcr_unicast_read_any(gt, XE2_GAMREQSTRM_CTRL);
>> reg |= CG_DIS_CNTLBUS;
>> xe_gt_mcr_multicast_write(gt, XE2_GAMREQSTRM_CTRL, reg);
More information about the Intel-xe
mailing list