[Intel-xe] [PATCH 2/3] drm/xe: Add missing TLB invalidation to emit_pipe_invalidate()

Souza, Jose jose.souza at intel.com
Wed May 31 19:12:02 UTC 2023


On Wed, 2023-05-31 at 10:46 +0200, Thomas Hellström wrote:
> On 5/30/23 20:40, Souza, Jose wrote:
> > On Mon, 2023-05-29 at 17:07 +0200, Thomas Hellström wrote:
> > > On 5/29/23 16:56, Thomas Hellström wrote:
> > > > On 5/29/23 16:48, Souza, Jose wrote:
> > > > > On Mon, 2023-05-29 at 11:08 +0200, Thomas Hellström wrote:
> > > > > > On Fri, 2023-05-26 at 12:06 -0700, José Roberto de Souza wrote:
> > > > > > > i915 invalidates TLB before emit BB start, so doing the same in Xe.
> > > > > > Hi, José,
> > > > > > 
> > > > > > Do you see an issue because of missing TLB flushes? In that case that
> > > > > > needs to be added to the commit message. We do TLB flushes on unbind,
> > > > > > but not sure if we do them on rebind, so if that's the issue we need to
> > > > > > figure out whether we should do them also on rebind or, like in this
> > > > > > patch, on each exec.
> > > > > I have a group of tests that results flips randomly. It fails when
> > > > > the rendered buffer is compared to expected result.
> > > > > Anything that add a bit of delay after the exec fixes those tests so
> > > > > I was looking for any missing flush in Xe KMD and Mesa.
> > > > > 
> > > > > This one did not fixed it but as i915 was doing it I thought would be
> > > > > good to do in Xe too.
> > > > I think missing TLB invalidations are more likely to cause random
> > > > overwrites of freed memory. Let me do a quick check on these. But the
> > > > problem you're describing indeed sounds more like a missing render
> > > > cache flush.
> > > > 
> > > It indeed looks like we're doing proper TLB invalidation on both rebind
> > > and unbind, so this patch shouldn't really be needed.
> > > 
> > > (look for "invalidation_fence_init()")
> > With this patch + PIPE_CONTROL flush at the end of batch buffer in Mesa, fixed the groups of the tests that were flipping results.
> > Do a XE_GUC_ACTION_TLB_INVALIDATION is the same as a PIPE_CONTROL_TLB_INVALIDATE?
> 
> I would think so, yes, except that the GUC TLB invalidation is GT-wide 
> and I'm not sure whether PIPE_CONTROL_TLB_INVALIDATE is per hw engine or 
> per-GT.

All other PIPE_CONTROL bits are per engine so PIPE_CONTROL_TLB_INVALIDATE should also be.

> But given this really has an impact, it might be that we need to 
> invalidate TLB also after a bind where we previously pointed to the 
> scratch page.
> 
> If that's indeed the culprit we should look at issuing a 
> PIPE_CONTROL_TLB_INVALIDATE on the exec following a bind or rebind, and 
> leave the GuC TLB invalidations for unbinds, and then this patch makes 
> sense as a start.

this is beyond my Xe KMD understating, feel free to write it 😄

> 
> 
> > 
> > Do you see any PIPE_CONTROL flush at the end of batch buffers that i915 does but Xe don't?
> 
> The emit_fini_breadcrumb() called from __i915_request_submit() indeed 
> seems to emit the flushes needed, whereas the corresponding
> emit_pipe_imm_ggtt() in xe doesn't.

Huum!
What do you think? Xe should do it like i915 or UMDs should do it?

> 
> /Thomas
> 
> 
> > 
> > > /Thomas
> > > 
> > > > /Thomas
> > > > 
> > > > > > Thanks,
> > > > > > Thomas
> > > > > > 
> > > > > > 
> > > > > > > Signed-off-by: José Roberto de Souza <jose.souza at intel.com>
> > > > > > > ---
> > > > > > >    drivers/gpu/drm/xe/regs/xe_gpu_commands.h | 1 +
> > > > > > >    drivers/gpu/drm/xe/xe_ring_ops.c          | 6 ++++--
> > > > > > >    2 files changed, 5 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
> > > > > > > b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
> > > > > > > index 0f9c5b0b8a3ba..7c7320efea739 100644
> > > > > > > --- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
> > > > > > > +++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
> > > > > > > @@ -73,6 +73,7 @@
> > > > > > >    #define
> > > > > > > PIPE_CONTROL_STORE_DATA_INDEX                        (1<<21)
> > > > > > >    #define
> > > > > > > PIPE_CONTROL_CS_STALL                                (1<<20)
> > > > > > >    #define PIPE_CONTROL_GLOBAL_SNAPSHOT_RESET           (1<<19)
> > > > > > > +#define PIPE_CONTROL_TLB_INVALIDATE                  (1<<18)
> > > > > > >    #define
> > > > > > > PIPE_CONTROL_PSD_SYNC                                (1<<17)
> > > > > > >    #define
> > > > > > > PIPE_CONTROL_QW_WRITE                                (1<<14)
> > > > > > >    #define PIPE_CONTROL_DEPTH_STALL                     (1<<13)
> > > > > > > diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > index d2fa0b4c8bcc0..4f3ef39109b9b 100644
> > > > > > > --- a/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
> > > > > > > @@ -37,7 +37,8 @@
> > > > > > >                   PIPE_CONTROL_INDIRECT_STATE_DISABLE | \
> > > > > > >                   PIPE_CONTROL_FLUSH_ENABLE | \
> > > > > > >                   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
> > > > > > > -               PIPE_CONTROL_DC_FLUSH_ENABLE)
> > > > > > > +               PIPE_CONTROL_DC_FLUSH_ENABLE | \
> > > > > > > +               PIPE_CONTROL_TLB_INVALIDATE)
> > > > > > >      static u32 preparser_disable(bool state)
> > > > > > >    {
> > > > > > > @@ -117,7 +118,8 @@ static int emit_pipe_invalidate(u32 mask_flags,
> > > > > > > u32 *dw, int i)
> > > > > > >                   PIPE_CONTROL_CONST_CACHE_INVALIDATE |
> > > > > > >                   PIPE_CONTROL_STATE_CACHE_INVALIDATE |
> > > > > > >                   PIPE_CONTROL_QW_WRITE |
> > > > > > > -               PIPE_CONTROL_STORE_DATA_INDEX;
> > > > > > > +               PIPE_CONTROL_STORE_DATA_INDEX |
> > > > > > > +               PIPE_CONTROL_TLB_INVALIDATE;
> > > > > > >             flags &= ~mask_flags;



More information about the Intel-xe mailing list