[PATCH 1/7] drm/xe: Use ring ops TLB invalidation for rebinds

Thomas Hellström thomas.hellstrom at linux.intel.com
Thu Mar 21 21:21:07 UTC 2024


On Thu, 2024-03-21 at 22:14 +0100, Thomas Hellström wrote:
> Hi, Matthew,
> 
> Thanks for reviewing, please see inline.
> 
> On Thu, 2024-03-21 at 19:09 +0000, Matthew Brost wrote:
> > On Thu, Mar 21, 2024 at 12:37:11PM +0100, Thomas Hellström wrote:
> > > For each rebind we insert a GuC TLB invalidation and add a
> > > corresponding unordered TLB invalidation fence. This might
> > > add a huge number of TLB invalidation fences to wait for so
> > > rather than doing that, defer the TLB invalidation to the
> > > next ring ops for each affected exec queue. Since the TLB
> > > is invalidated on exec_queue switch, we need to invalidate
> > > once for each affected exec_queue.
> > > 
> > > Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after
> > > rebinds issued from execs")
> > > Cc: Matthew Brost <matthew.brost at intel.com>
> > > Cc: <stable at vger.kernel.org> # v6.8+
> > > Signed-off-by: Thomas Hellström
> > > <thomas.hellstrom at linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_exec_queue_types.h |  2 ++
> > >  drivers/gpu/drm/xe/xe_pt.c               |  5 +++--
> > >  drivers/gpu/drm/xe/xe_ring_ops.c         | 11 ++++-------
> > >  drivers/gpu/drm/xe/xe_sched_job.c        | 11 +++++++++++
> > >  drivers/gpu/drm/xe/xe_sched_job_types.h  |  2 ++
> > >  drivers/gpu/drm/xe/xe_vm_types.h         |  5 +++++
> > >  6 files changed, 27 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > index 62b3d9d1d7cd..891ad30e906f 100644
> > > --- a/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > +++ b/drivers/gpu/drm/xe/xe_exec_queue_types.h
> > > @@ -148,6 +148,8 @@ struct xe_exec_queue {
> > >  	const struct xe_ring_ops *ring_ops;
> > >  	/** @entity: DRM sched entity for this exec queue (1 to
> > > 1
> > > relationship) */
> > >  	struct drm_sched_entity *entity;
> > > +	/** @tlb_flush_seqno: The seqno of the last rebind tlb
> > > flush performed */
> > > +	u64 tlb_flush_seqno;
> > >  	/** @lrc: logical ring context for this exec queue */
> > >  	struct xe_lrc lrc[];
> > >  };
> > > diff --git a/drivers/gpu/drm/xe/xe_pt.c
> > > b/drivers/gpu/drm/xe/xe_pt.c
> > > index 8d3922d2206e..21bc0d13fccf 100644
> > > --- a/drivers/gpu/drm/xe/xe_pt.c
> > > +++ b/drivers/gpu/drm/xe/xe_pt.c
> > > @@ -1254,11 +1254,12 @@ __xe_pt_bind_vma(struct xe_tile *tile,
> > > struct xe_vma *vma, struct xe_exec_queue
> > >  	 * non-faulting LR, in particular on user-space batch
> > > buffer chaining,
> > >  	 * it needs to be done here.
> > >  	 */
> > > -	if ((rebind && !xe_vm_in_lr_mode(vm) && !vm-

While I remember it: when looking at your series I noticed that this
line was moved there incorrectly. It looks like you used
xe_vm_in_lr_mode() rather than !xe_vm_in_lr_mode().

Thomas
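
PS: To make the deferral scheme in the commit message concrete, here is
a self-contained toy model (not the xe code itself; only
q->tlb_flush_seqno appears in the quoted hunks, and the VM-side counter
and the job flag are stand-ins suggested by the diffstat). Each rebind
only bumps a seqno on the VM; when a job is armed on an exec queue
whose recorded seqno is stale, that job is flagged so the ring ops emit
a single TLB invalidation covering all rebinds since the queue last
ran.

/* Toy model of the per-exec-queue TLB flush deferral, not kernel code. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct toy_vm {
	uint64_t tlb_flush_seqno;	/* bumped for each rebind needing a flush */
};

struct toy_exec_queue {
	struct toy_vm *vm;
	uint64_t tlb_flush_seqno;	/* last VM seqno this queue flushed for */
};

struct toy_job {
	struct toy_exec_queue *q;
	bool ring_ops_flush_tlb;	/* ring ops emit one invalidation if set */
};

/* A rebind no longer queues an invalidation fence; it only bumps the seqno. */
static void toy_vm_note_rebind(struct toy_vm *vm)
{
	vm->tlb_flush_seqno++;
}

/* Arming a job: flush at most once per affected exec queue. */
static void toy_job_arm(struct toy_job *job)
{
	struct toy_exec_queue *q = job->q;

	if (q->vm->tlb_flush_seqno != q->tlb_flush_seqno) {
		q->tlb_flush_seqno = q->vm->tlb_flush_seqno;
		job->ring_ops_flush_tlb = true;
	}
}

int main(void)
{
	struct toy_vm vm = { 0 };
	struct toy_exec_queue q = { .vm = &vm };
	struct toy_job j1 = { .q = &q }, j2 = { .q = &q };

	toy_vm_note_rebind(&vm);	/* many rebinds may pile up ...    */
	toy_vm_note_rebind(&vm);	/* ... but cost one flush per queue */
	toy_job_arm(&j1);
	toy_job_arm(&j2);

	/* j1 carries the flush; j2 on the same queue is already up to date. */
	printf("j1 flush=%d, j2 flush=%d\n",
	       j1.ring_ops_flush_tlb, j2.ring_ops_flush_tlb);
	return 0;
}

Only the first job armed on the queue carries the flush; the second one
sees an up-to-date seqno, which is the "invalidate once for each
affected exec_queue" behaviour described above.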


