[PATCH 1/3] drm/xe: Use ordered wq for preempt fence waiting
Matthew Brost
matthew.brost at intel.com
Thu Mar 28 19:00:59 UTC 2024
On Thu, Mar 28, 2024 at 11:56:48AM -0700, Matt Roper wrote:
> On Thu, Mar 28, 2024 at 11:21:45AM -0700, Matthew Brost wrote:
> > Preempt fences can sleep waiting for an exec quuee suspend operation to
> > complete. Use a device private work queue to avoid hogging system
> > resources. Even though suspend operations can complete out-of-order, all
> > suspend operations within a VM need to complete before the preempt
> > rebind worker can start. With that, use a device private ordered wq for
> > preempt fence waiting.
> >
> > Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device.c | 7 ++++++-
> > drivers/gpu/drm/xe/xe_device_types.h | 3 +++
> > drivers/gpu/drm/xe/xe_preempt_fence.c | 2 +-
> > 3 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 01bd5ccf05ca..559bd72fde57 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -226,6 +226,9 @@ static void xe_device_destroy(struct drm_device *dev, void *dummy)
> > {
> > struct xe_device *xe = to_xe_device(dev);
> >
> > + if (xe->preempt_fence_wq)
> > + destroy_workqueue(xe->preempt_fence_wq);
> > +
> > if (xe->ordered_wq)
> > destroy_workqueue(xe->ordered_wq);
> >
> > @@ -291,9 +294,11 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> > INIT_LIST_HEAD(&xe->pinned.external_vram);
> > INIT_LIST_HEAD(&xe->pinned.evicted);
> >
> > + xe->preempt_fence_wq = alloc_ordered_workqueue("xe-preempt-fence-wq", 0);
> > xe->ordered_wq = alloc_ordered_workqueue("xe-ordered-wq", 0);
> > xe->unordered_wq = alloc_workqueue("xe-unordered-wq", 0, 0);
> > - if (!xe->ordered_wq || !xe->unordered_wq) {
> > + if (!xe->ordered_wq || !xe->unordered_wq ||
> > + !xe->preempt_fence_wq) {
> > drm_err(&xe->drm, "Failed to allocate xe workqueues\n");
> > err = -ENOMEM;
> > goto err;
>
> Not the fault of this patch, but in cases where some of the workqueues
> are allocated successfully, but at least one allocation fails, is the
> error handling in this function correct? It looks like it might be
> leaking the ones that were actually allocated? Same if
> xe_display_create() fails later in the function.
>
>
I did notice this as posting, put this one together late last not night
quickly. I suppose I can fix the work queue error handling in patch too.
Let's make sure we are that the actual fix is good first, then I'll
respin.
Matt B.
> Matt
>
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index 1df3dcc17d75..c710cec835a7 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -363,6 +363,9 @@ struct xe_device {
> > /** @ufence_wq: user fence wait queue */
> > wait_queue_head_t ufence_wq;
> >
> > + /** @preempt_fence_wq: used to serialize preempt fences */
> > + struct workqueue_struct *preempt_fence_wq;
> > +
> > /** @ordered_wq: used to serialize compute mode resume */
> > struct workqueue_struct *ordered_wq;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_preempt_fence.c b/drivers/gpu/drm/xe/xe_preempt_fence.c
> > index 7bce2a332603..7d50c6e89d8e 100644
> > --- a/drivers/gpu/drm/xe/xe_preempt_fence.c
> > +++ b/drivers/gpu/drm/xe/xe_preempt_fence.c
> > @@ -49,7 +49,7 @@ static bool preempt_fence_enable_signaling(struct dma_fence *fence)
> > struct xe_exec_queue *q = pfence->q;
> >
> > pfence->error = q->ops->suspend(q);
> > - queue_work(system_unbound_wq, &pfence->preempt_work);
> > + queue_work(q->vm->xe->preempt_fence_wq, &pfence->preempt_work);
> > return true;
> > }
> >
> > --
> > 2.34.1
> >
>
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation
More information about the Intel-xe
mailing list