[PATCH] drm/xe: Invalidate L3 read-only cachelines for geometry streams too
Dong, Zhanjun
zhanjun.dong at intel.com
Thu Mar 27 23:49:22 UTC 2025
Hi Kenneth,
I'm trying to resend your patch from my address to trigger the CI run.
Meanwhile, I noticed that your patch is missing the "Signed-off-by" tag. Could you
resend it with this tag? If CI still does not run, I will resend your patch myself and try again.
According to:
https://docs.kernel.org/process/5.Posting.html#before-creating-patches
Code without a proper signoff cannot be merged into the mainline.
Regards,
Zhanjun Dong
On 2025-03-27 6:39 p.m., Dong, Zhanjun wrote:
>
> On 2025-03-20 6:11 a.m., Kenneth Graunke wrote:
>> Historically, the Vertex Fetcher unit has not been an L3 client. That
>> meant that, when a buffer containing vertex data was written to, it was
>> necessary to issue a PIPE_CONTROL::VF Cache Invalidate to invalidate any
>> VF L2 cachelines associated with that buffer, so the new value would be
>> properly read from memory.
>>
>> Since Tigerlake and later, VERTEX_BUFFER_STATE and 3DSTATE_INDEX_BUFFER
>> have included an "L3 Bypass Enable" bit which userspace drivers can set
>> to request that the vertex fetcher unit snoop L3. However, unlike most
>> true L3 clients, the "VF Cache Invalidate" bit continues to only
>> invalidate the VF L2 cache - and not any associated L3 lines.
>>
>> To handle that, PIPE_CONTROL has a new "L3 Read Only Cache Invalidation
>> Bit", which according to the docs, "controls the invalidation of the
>> Geometry streams cached in L3 cache at the top of the pipe." In other
>> words, the vertex and index buffer data that gets cached in L3 when
>> "L3 Bypass Disable" is set.
>>
>> Mesa always sets L3 Bypass Disable so that the VF unit snoops L3, and
>> whenever it issues a VF Cache Invalidate, it also issues a L3 Read Only
>> Cache Invalidate so that both L2 and L3 vertex data is invalidated.
>>
>> xe is issuing VF cache invalidates too (which handles cases like CPU
>> writes to a buffer between GPU batches). Because userspace may enable
>> L3 snooping, it needs to issue an L3 Read Only Cache Invalidate as well.
>>
>> Fixes significant flickering in Firefox on Meteorlake, which was writing
>> to vertex buffers via the CPU between batches; the missing L3 Read Only
>> invalidates were causing the vertex fetcher to read stale data from L3.
>>
>> References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4460
>> Cc: stable at vger.kernel.org # v6.13+
>> ---
>> drivers/gpu/drm/xe/instructions/xe_gpu_commands.h | 1 +
>> drivers/gpu/drm/xe/xe_ring_ops.c | 13 +++++++++----
>> 2 files changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> index a255946b6f77e..8cfcd3360896c 100644
>> --- a/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> +++ b/drivers/gpu/drm/xe/instructions/xe_gpu_commands.h
>> @@ -41,6 +41,7 @@
>> #define GFX_OP_PIPE_CONTROL(len) ((0x3<<29)|(0x3<<27)|(0x2<<24)|((len)-2))
>> +#define PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE BIT(10) /* gen12 */
>> #define PIPE_CONTROL0_HDC_PIPELINE_FLUSH BIT(9) /* gen12 */
>> #define PIPE_CONTROL_COMMAND_CACHE_INVALIDATE (1<<29)
>> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>> index 0c230ee53bba5..9d8901a33205a 100644
>> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
>> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>> @@ -141,7 +141,8 @@ emit_pipe_control(u32 *dw, int i, u32 bit_group_0, u32 bit_group_1, u32 offset,
>> static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>> int i)
>> {
>> - u32 flags = PIPE_CONTROL_CS_STALL |
>> + u32 flags0 = 0;
>> + u32 flags1 = PIPE_CONTROL_CS_STALL |
>> PIPE_CONTROL_COMMAND_CACHE_INVALIDATE |
>> PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE |
>> PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
>> @@ -152,11 +153,15 @@ static int emit_pipe_invalidate(u32 mask_flags, bool invalidate_tlb, u32 *dw,
>> PIPE_CONTROL_STORE_DATA_INDEX;
>> if (invalidate_tlb)
>> - flags |= PIPE_CONTROL_TLB_INVALIDATE;
>> + flags1 |= PIPE_CONTROL_TLB_INVALIDATE;
>> - flags &= ~mask_flags;
>> + flags1 &= ~mask_flags;
>> - return emit_pipe_control(dw, i, 0, flags, LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
>> + if (flags1 & PIPE_CONTROL_VF_CACHE_INVALIDATE)
>> + flags0 |= PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE;
>> +
>> + return emit_pipe_control(dw, i, flags0, flags1,
>> + LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR, 0);
> The new PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE is defined as
> documented in the spec.
> The new flags0/flags1 handling looks good to me.
>
> For some reason this patch did not trigger an automatic CI run:
>
> Address 'kenneth at whitecape.org' is not on the allowlist!
> Exception occurred during validation, bailing out!
>
> Let me check what we can do. A CI run result is required before moving
> forward.
>
>> }
>> static int emit_store_imm_ppgtt_posted(u64 addr, u64 value,
>
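
For reference, the flag handling in the quoted emit_pipe_invalidate() change can be sketched as a standalone C function. This is only an illustration under stated assumptions: the helper name derive_flags0() is hypothetical, and the bit position used here for PIPE_CONTROL_VF_CACHE_INVALIDATE is an assumption, since the quoted hunks do not show its definition. Only the BIT(10) value for the new DW0 flag comes from the patch.

```c
#include <stdint.h>

/* From the quoted patch: new DW0 flag, BIT(10). */
#define PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE (UINT32_C(1) << 10)

/* ASSUMPTION: illustrative bit position, not shown in the quoted hunks. */
#define PIPE_CONTROL_VF_CACHE_INVALIDATE (UINT32_C(1) << 4)

/*
 * Hypothetical helper mirroring the patch logic: apply the caller's mask
 * to the DW1 flags first, then request the L3 read-only invalidate in DW0
 * only if the VF cache invalidate survived the mask.
 */
static uint32_t derive_flags0(uint32_t flags1, uint32_t mask_flags)
{
	uint32_t flags0 = 0;

	flags1 &= ~mask_flags;

	if (flags1 & PIPE_CONTROL_VF_CACHE_INVALIDATE)
		flags0 |= PIPE_CONTROL0_L3_READ_ONLY_CACHE_INVALIDATE;

	return flags0;
}
```

The point of checking flags1 after the mask is applied is that a caller who masks out the VF cache invalidate also does not get the L3 read-only invalidate, so the two invalidations stay paired.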