[PATCH v4 2/3] drm/xe/xe_guc_submit: Allow lr exec queues to be banned

Summers, Stuart stuart.summers at intel.com
Mon Apr 15 18:20:09 UTC 2024


On Fri, 2024-04-12 at 23:45 +0000, Matthew Brost wrote:
> On Fri, Apr 12, 2024 at 02:58:21PM -0600, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Summers, Stuart <stuart.summers at intel.com> 
> > Sent: Friday, April 12, 2024 1:49 PM
> > To: intel-xe at lists.freedesktop.org; Cavitt, Jonathan
> > <jonathan.cavitt at intel.com>
> > Cc: Brost, Matthew <matthew.brost at intel.com>; Harrison, John C
> > <john.c.harrison at intel.com>; Gupta, saurabhg
> > <saurabhg.gupta at intel.com>; De Marchi, Lucas
> > <lucas.demarchi at intel.com>
> > Subject: Re: [PATCH v4 2/3] drm/xe/xe_guc_submit: Allow lr exec
> > queues to be banned
> > > 
> > > On Fri, 2024-04-05 at 10:55 -0700, Jonathan Cavitt wrote:
> > > > LR queues currently don't get banned during a GT/GuC reset
> > > > because
> > > > they
> > > > lack a job.  Though they don't have a job to detect the reset
> > > > status
> > > > of,
> > > > it's still possible to tell when they should be banned by
> > > > looking at
> > > > the
> > > > LRC: if the LRC head and tail don't match, then the exec queue
> > > > should
> > > > be
> > > > banned and cleaned up.
> > > > 
> > > > This also requires swapping the usage of xe_sched_tdr_queue_imm
> > > > with
> > > > xe_guc_exec_queue_trigger_cleanup, as the former is specific to
> > > > non-
> > > > lr
> > > > exec queues.
> > > > 
> > > > Suggested-by: Matthew Brost <matthew.brost at intel.com>
> > > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
> > > > Reviewed-by: Matthew Brost <matthew.brost at intel.com>
> > > > ---
> > > > 
> > > > v2:
> > > > - Fix Subject line
> > > > - Modify change slightly to remove need for "ban" boolean
> > > > 
> > > > v3: Revert change involving "ban" boolean to version 1
> > > > 
> > > > v4: Add missing semicolon and remove whitespace
> > > > 
> > > >  drivers/gpu/drm/xe/xe_guc_submit.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > index 1a6abb10a960e..e72f2a6cad60a 100644
> > > > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > > > @@ -1424,15 +1424,23 @@ static void guc_exec_queue_stop(struct
> > > > xe_guc
> > > > *guc, struct xe_exec_queue *q)
> > > >          */
> > > >         if (!(q->flags & (EXEC_QUEUE_FLAG_KERNEL |
> > > > EXEC_QUEUE_FLAG_VM))) {
> > > >                 struct xe_sched_job *job =
> > > > xe_sched_first_pending_job(sched);
> > > > +               bool ban = false;
> > > >  
> > > >                 if (job) {
> > > >                         if ((xe_sched_job_started(job) &&
> > > >                             !xe_sched_job_completed(job)) ||
> > > >                             xe_sched_invalidate_job(job, 2)) {
> > > >                                 trace_xe_sched_job_ban(job);
> > > > -                               set_exec_queue_banned(q);
> > > > -                               xe_sched_tdr_queue_imm(&q->guc-
> > > > > sched);
> > > > +                               ban = true;
> > > >                         }
> > > > +               } else if (xe_exec_queue_is_lr(q) &&
> > > > +                          (xe_lrc_ring_head(q->lrc) != q->lrc-
> > > > > ring.tail)) {
> > > 
> > > Why do you read the head out of the lrc but the tail from our
> > > internal
> > > variable? Isn't there a small chance here that you could get
> > > something
> > > ready to send but not quite submitted to GuC and in which case
> > > these
> > > would not be equal but not necessarily need the ban? I guess the
> > > flip
> > > side is maybe that doesn't actually have any real functional
> > > impact...
> > 
> 
> LR jobs don't have in deps, i.e. they run immediately. I guess there
> is
> very small window between exec IOCTL moving the tail and actually
> hitting the hardware. 
> 
> > 
> > IIRC it's because no equivalent xe_lrc_ring_head function exists to
> > read the lrc ring tail.  It's likely that I just missed the
> > function to read
> > the tail, so if you know what the function is, I'll replace the
> > internal
> > call with the proper functional call for the next revision.
> > 
> > Alternatively, should I use the internal variable for both halves
> > of
> > the comparison?
> 
> No, if anything check both values in memory.
> 
> That being said, I think this is fine as is due the nature of LRC
> jobs but I guess it would not hurt to read tail from memory.
> 
> If Stuart insists, we change this.

So honestly, I don't think it's going to matter much functionally like
I had mentioned. That said, from a code inspection standpoint it does
look a little weird. IMO we should either read both out of memory or
add a quick comment like what Matt said - that the time of reading out
of the LRC and the drivers memory is small enough that for the LR
workload here we will never really functionally see any difference.

Thanks,
Stuart

> 
> Matt
> 
> > -Jonathan Cavitt
> > 
> > 
> > > 
> > > Thanks,
> > > Stuart
> > > 
> > > > +                       ban = true;
> > > > +               }
> > > > +
> > > > +               if (ban) {
> > > > +                       set_exec_queue_banned(q);
> > > > +                       xe_guc_exec_queue_trigger_cleanup(q);
> > > >                 }
> > > >         }
> > > >  }
> > > 
> > > 



More information about the Intel-xe mailing list