[Intel-xe] [PATCH] drm/xe: Remove unnecessary memory barriers related to wait queue
Matthew Brost
matthew.brost at intel.com
Fri Jul 28 14:52:19 UTC 2023
On Fri, Jul 28, 2023 at 01:13:04PM +0100, Matthew Auld wrote:
> On Thu, 27 Jul 2023 at 19:34, Matthew Brost <matthew.brost at intel.com> wrote:
> >
> > These are not needed per the wait queue documentation; remove them.
> >
> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_ct.c | 3 ---
> > drivers/gpu/drm/xe/xe_guc_submit.c | 9 +--------
> > 2 files changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> > index cb75db30800c..3f58902b6be3 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> > @@ -308,7 +308,6 @@ int xe_guc_ct_enable(struct xe_guc_ct *ct)
> > spin_unlock_irq(&ct->fast_lock);
> > mutex_unlock(&ct->lock);
> >
> > - smp_mb();
> > wake_up_all(&ct->wq);
> > drm_dbg(&xe->drm, "GuC CT communication channel enabled\n");
> >
> > @@ -820,8 +819,6 @@ static int parse_g2h_response(struct xe_guc_ct *ct, u32 *msg, u32 len)
> > g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
> >
> > g2h_fence->done = true;
> > - smp_mb();
>
> Hmm, I'm pretty sure we still need barriers such that done = true is seen
> to happen after the stores for fail, error, hint, retry etc., and then of
> course a matching barrier on the reader side. The barriers in wake_up_all()
> vs wait_event() won't ensure that. So something like:
>
> parse_g2h_response():
> /*
> * Pairs with guc_ct_send_recv().
> *
> * The ordering of g2h_fence.done vs wake_up_all() and
> * wait_event_timeout() is already handled by that API. However, here we
> * also have the added dependency of ordering the g2h_fence.done after
> * the previous stores, like g2h_fence.retry, g2h_fence.fail etc.,
> * which we need to handle ourselves.
> */
> smp_wmb();
> WRITE_ONCE(g2h_fence->done, true);
> wake_up_all(&ct->g2h_fence_wq);
>
>
> guc_ct_send_recv():
> ret = wait_event_timeout(ct->g2h_fence_wq, READ_ONCE(g2h_fence.done), HZ);
> if (!ret) {
> }
>
> /* Pairs with parse_g2h_response(). */
> smp_rmb();
> if (READ_ONCE(g2h_fence.retry)) {
> }
>
> if (READ_ONCE(g2h_fence.fail)) {
> }
>
> But AFAICT there also seem to be scary lifetime issues in here. The
> g2h_fence is allocated on the stack and goes out of scope when
> returning from guc_ct_send_recv(), however if the wait_event timed out
> we don't know whether parse_g2h_response() will still run before we remove
> the fence from the xarray. I think we need to grab the ct->mutex to
> serialize the operations. Also, if we have to acquire ct->mutex anyway,
> it might be worth seeing if we can drop the scary
> shared-memory-access-outside-locks stuff here.
>
Yeah, I think all of this is correct now that I think about it.
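
Roughly what I have in mind for the send side, so we're on the same page
(just a sketch; I'm assuming the fence stays tracked in the ct->fence_lookup
xarray keyed by g2h_fence.seqno, and that ct->lock is what ends up
serializing against the G2H processing path):

	ret = wait_event_timeout(ct->g2h_fence_wq, READ_ONCE(g2h_fence.done), HZ);
	if (!ret) {
		/*
		 * Timed out: take ct->lock so the G2H side can't be touching
		 * the fence while we unlink it, and remove it from the lookup
		 * before the stack frame goes out of scope.
		 */
		mutex_lock(&ct->lock);
		xa_erase(&ct->fence_lookup, g2h_fence.seqno);
		mutex_unlock(&ct->lock);
		return -ETIME;
	}

	/* Pairs with the smp_wmb() before WRITE_ONCE(done) in parse_g2h_response(). */
	smp_rmb();

With that, the handler can no longer reference the fence after we return, and
the retry/fail reads below the smp_rmb() are ordered as you describe.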
> > -
> > wake_up_all(&ct->g2h_fence_wq);
> >
> > return 0;
> > diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> > index 6cb64a097297..d623cd9f6cc4 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> > @@ -705,7 +705,6 @@ static void disable_scheduling_deregister(struct xe_guc *guc,
> > int ret;
> >
> > set_min_preemption_timeout(guc, e);
> > - smp_rmb();
> > ret = wait_event_timeout(guc->ct.wq, !engine_pending_enable(e) ||
> > guc_read_stopped(guc), HZ * 5);
> > if (!ret) {
> > @@ -888,7 +887,6 @@ guc_engine_timedout_job(struct drm_sched_job *drm_job)
> > * error) messages which can cause the schedule disable to get
> > * lost. If this occurs, trigger a GT reset to recover.
> > */
> > - smp_rmb();
> > ret = wait_event_timeout(guc->ct.wq,
> > !engine_pending_disable(e) ||
> > guc_read_stopped(guc), HZ * 5);
> > @@ -1008,7 +1006,6 @@ static void suspend_fence_signal(struct xe_engine *e)
> > XE_BUG_ON(!e->guc->suspend_pending);
> >
> > e->guc->suspend_pending = false;
>
> Probably a good idea to use WRITE_ONCE/READ_ONCE. Even if it is not
> strictly needed from the compiler pov, it at the very least documents
> that we are accessing shared memory outside of proper locking.
>
Sure.
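For example, something along these lines (just a sketch; I'm assuming the
matching reader is the suspend_wait path in xe_guc_submit.c):

	/* suspend_fence_signal(): written outside any lock, so annotate it */
	WRITE_ONCE(e->guc->suspend_pending, false);
	wake_up(&e->guc->suspend_wait);

	/* waiter side */
	wait_event(e->guc->suspend_wait, !READ_ONCE(e->guc->suspend_pending));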
Matt
>
>
>
> > - smp_wmb();
> > wake_up(&e->guc->suspend_wait);
> > }
> >
> > @@ -1392,7 +1389,6 @@ int xe_guc_submit_reset_prepare(struct xe_guc *guc)
> > * and releasing any TDRs waiting on guc->submission_state.stopped.
> > */
> > ret = atomic_fetch_or(1, &guc->submission_state.stopped);
> > - smp_wmb();
> > wake_up_all(&guc->ct.wq);
> >
> > return ret;
> > @@ -1521,17 +1517,14 @@ int xe_guc_sched_done_handler(struct xe_guc *guc, u32 *msg, u32 len)
> > if (engine_pending_enable(e)) {
> > e->guc->resume_time = ktime_get();
> > clear_engine_pending_enable(e);
> > - smp_wmb();
> > wake_up_all(&guc->ct.wq);
> > } else {
> > clear_engine_pending_disable(e);
> > if (e->guc->suspend_pending) {
> > suspend_fence_signal(e);
> > } else {
> > - if (engine_banned(e)) {
> > - smp_wmb();
> > + if (engine_banned(e))
> > wake_up_all(&guc->ct.wq);
> > - }
> > deregister_engine(guc, e);
> > }
> > }
> > --
> > 2.34.1
> >