[PATCH] drm/xe: Invalidate L3 read-only cachelines for geometry streams too

Dong, Zhanjun zhanjun.dong at intel.com
Thu Mar 27 23:49:22 UTC 2025


Hi Kenneth,

I'm trying to resend your patch from my address to trigger the CI run.
Meanwhile, I noticed that your patch is missing a "Signed-off-by" tag. Could
you resend it with this tag? If CI still does not run, I will resend your
patch myself and try again.

According to:
https://docs.kernel.org/process/5.Posting.html#before-creating-patches

Code without a proper signoff cannot be merged into the mainline.

Regards,
Zhanjun Dong

On 2025-03-27 6:39 p.m., Dong, Zhanjun wrote:
> 
> On 2025-03-20 6:11 a.m., Kenneth Graunke wrote:
>> Historically, the Vertex Fetcher unit has not been an L3 client.  That
>> meant that, when a buffer containing vertex data was written to, it was
>> necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any
>> VF L2 cachelines associated with that buffer, so the new value would be
>> properly read from memory.
>>
>> Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER
>> have included an "L3 Bypass Disable" bit which userspace drivers can set
>> to request that the vertex fetcher unit snoop L3.  However, unlike most
>> true L3 clients, the "VF Cache Invalidate" bit continues to only
>> invalidate the VF L2 cache - and not any associated L3 lines.
>>
>> To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation
>> Bit", which according to the docs, "controls the invalidation of the
>> Geometry streams cached in L3 cache at the top of the pipe."  In other
>> words, the vertex and index buffer data that gets cached in L3 when
>> "L3 Bypass Disable" is set.
>>
>> Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and
>> whenever it issues a VF Cache Invalidate, it also issues an L3 Read Only
>> Cache Invalidate so that both L2 and L3 vertex data is invalidated.
>>
>> xe is issuing VF cache invalidates too (which handles cases like CPU
>> writes to a buffer between GPU batches).  Because userspace may enable
>> L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well.
>>
>> Fixes significant flickering in Firefox on Meteorlake, which was writing
>> to vertex buffers via the CPU between batches; the missing L3 Read Only
>> invalidates were causing the vertex fetcher to read stale data from L3.
>>
>> References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460
>> Cc: stable at vger.kernel.org # v6.13+
>> ---
>>   drivers/gpu/drm/xe/instructions/xe_gpu_commands.h |  1 +
>>   drivers/gpu/drm/xe/xe_ring_ops.c                  | 13 +++++++++----
>>   2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> index a255946b6f77e..8cfcd3360896c 100644
>> --- a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> +++ b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> @@ -41,6 +41,7 @@
>>   #define GFX_OP_PIPE_CONTROL(len)    ((0x3<<29)|(0x3<<27)|(0x2<<24)|((len)-2))
>> +#define      PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE    BIT(10)    /* gen12 */
>>   #define      PIPE_CONTROL0_HDC_PIPELINE_FLUSH        BIT(9)    /* gen12 */
>>   #define   PIPE_CONTROL_COMMAND_CACHE_INVALIDATE        (1<<29)
>> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>> index 0c230ee53bba5..9d8901a33205a 100644
>> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
>> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>> @@ -141,7 +141,8 @@ emit_pipe_control(u32 *dw, int i, u32 bit_group_0, u32 bit_group_1, u32 offset,
>>   static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>>                   int i)
>>   {
>> -    u32 flags = PIPE_CONTROL_CS_STALL |
>> +    u32 flags0 = 0;
>> +    u32 flags1 = PIPE_CONTROL_CS_STALL |
>>           PIPE_CONTROL_COMMAND_CACHE_INVALIDATE |
>>           PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE |
>>           PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
>> @@ -152,11 +153,15 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>>           PIPE_CONTROL_STORE_DATA_INDEX;
>>       if (invalidate_tlb)
>> -        flags |= PIPE_CONTROL_TLB_INVALIDATE;
>> +        flags1 |= PIPE_CONTROL_TLB_INVALIDATE;
>> -    flags &= ~mask_flags;
>> +    flags1 &= ~mask_flags;
>> -    return emit_pipe_control(dw, i, 0, flags, LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
>> +    if (flags1 & PIPE_CONTROL_VF_CACHE_INVALIDATE)
>> +        flags0 |= PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE;
>> +
>> +    return emit_pipe_control(dw, i, flags0, flags1,
>> +                 LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
> The new PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE bit is defined as
> documented in the spec.
> The new flags0/flags1 handling looks good to me.
> 
> For some reason this patch did not trigger an automatic CI run:
> 
> Address 'kenneth at whitecape.org' is not on the allowlist!
> Exception occurred during validation, bailing out!
> 
> Let me check what we can do. A CI run result is required before moving
> forward.
> 
>>   }
>>   static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
> 
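
For anyone following along, the flags0/flags1 handling in the quoted patch
can be sketched as a standalone C function. This is only an illustration,
not the driver code: BIT(10) for the L3 read-only invalidate is taken from
the patch, while the other bit positions, the struct, and the function name
build_invalidate_flags() are assumptions made up for this sketch.

```c
#include <stdint.h>

/*
 * BIT(10) for the L3 read-only invalidate comes from the quoted patch.
 * The other bit positions are assumptions for this sketch only.
 */
#define PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE (UINT32_C(1) << 10)
#define PIPE_CONTROL_VF_CACHE_INVALIDATE            (UINT32_C(1) << 4)
#define PIPE_CONTROL_TLB_INVALIDATE                 (UINT32_C(1) << 18)
#define PIPE_CONTROL_CS_STALL                       (UINT32_C(1) << 20)

/* Hypothetical container for the two PIPE_CONTROL flag groups. */
struct pipe_control_flags {
	uint32_t group0;
	uint32_t group1;
};

/*
 * Build both flag groups the way the patch does: start from the usual
 * invalidation bits, apply the caller's mask, and only then decide
 * whether the L3 read-only invalidate must accompany the VF invalidate.
 */
static struct pipe_control_flags
build_invalidate_flags(uint32_t mask_flags, int invalidate_tlb)
{
	struct pipe_control_flags f = {
		.group0 = 0,
		.group1 = PIPE_CONTROL_CS_STALL |
			  PIPE_CONTROL_VF_CACHE_INVALIDATE,
	};

	if (invalidate_tlb)
		f.group1 |= PIPE_CONTROL_TLB_INVALIDATE;

	/* Callers may mask out bits that do not apply to their engine. */
	f.group1 &= ~mask_flags;

	/* The L3 invalidate follows the VF invalidate, after masking. */
	if (f.group1 & PIPE_CONTROL_VF_CACHE_INVALIDATE)
		f.group0 |= PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE;

	return f;
}
```

Note the ordering: because the check runs after the mask is applied, a
caller that masks out the VF cache invalidate does not get the L3 read-only
invalidate either.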