[PATCH] drm/amdgpu: add ih waiter on process until checkpoint

Fri Mar 5 21:34:42 UTC 2021

[AMD Official Use Only - Internal Distribution Only]

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> Sent: Friday, March 5, 2021 3:18 PM
> To: Kim, Jonathan <Jonathan.Kim at amd.com>; amd-
> gfx at lists.freedesktop.org
> Cc: Yang, Philip <Philip.Yang at amd.com>; Kuehling, Felix
> <Felix.Kuehling at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: add ih waiter on process until checkpoint
>
> [CAUTION: External Email]
>
> Am 05.03.21 um 21:11 schrieb Jonathan Kim:
> > Add IH function to allow caller to wait until ring entries are
> > processed until the checkpoint write pointer.
> >
> > This will be primarily used by HMM to drain pending page fault
> > interrupts before memory unmap to prevent HMM from handling stale
> interrupts.
> >
> > v2: Update title and description to clarify use.
> > Add rptr/wptr wrap counter checks to guarantee ring entries are
> > processed until the checkpoint.
>
> First of all as I said please use a wait_event, busy waiting is a clear NAK.

Who would do the wake though?  Are you suggesting wake be done in amdgpu_ih_process?  Or is waiting happening by the caller and this should go somewhere higher (like amdgpu_amdkfd for example)?

>
> >
> > Signed-off-by: Jonathan Kim <jonathan.kim at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 68
> +++++++++++++++++++++++++-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h |  2 +
> >   2 files changed, 69 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > index dc852af4f3b7..954518b4fb79 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > @@ -22,7 +22,7 @@
> >    */
> >
> >   #include <linux/dma-mapping.h>
> > -
> > +#include <linux/processor.h>
> >   #include "amdgpu.h"
> >   #include "amdgpu_ih.h"
> >
> > @@ -160,6 +160,72 @@ void amdgpu_ih_ring_write(struct
> amdgpu_ih_ring *ih, const uint32_t *iv,
> >       }
> >   }
> >
> > +/**
> > + * amdgpu_ih_wait_on_checkpoint_process - wait to process IVs up to
> > +checkpoint
> > + *
> > + * @adev: amdgpu_device pointer
> > + * @ih: ih ring to process
> > + *
> > + * Used to ensure ring has processed IVs up to the checkpoint write
> pointer.
> > + */
> > +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device
> *adev,
> > +                                     struct amdgpu_ih_ring *ih) {
> > +     u64 rptr_check, wptr_check, rptr_wrap = 0, wptr_wrap = 0;
> > +     u32 prev_rptr, prev_wptr;
> > +
> > +     if (!ih->enabled || adev->shutdown)
> > +             return -ENODEV;
> > +
> > +     prev_wptr = amdgpu_ih_get_wptr(adev, ih);
> > +     /* Order wptr with rptr. */
> > +     rmb();
> > +     prev_rptr = READ_ONCE(ih->rptr);
> > +     rptr_check = prev_rptr | (rptr_wrap << 32);
> > +     wptr_check = prev_wptr | (wptr_wrap << 32);
>
> Hui what? That check doesn't even make remotely sense to me.

Can you clarify what you meant by creating a new 64 bit compare?
Snip from your last response:

"This way you do something like this:
1. get the wrap around counter.
2. get wptr
3. get rptr
4. compare the wrap around counter with the old one, if it has changed start over with #1
5. Use wrap around counter and rptr/wptr to come up with 64bit values.
6. Compare wptr with rptr/wrap around counter until we are sure the IHs are processed."

From a discussion with Felix, I interpreted this as a way to guarantee rptr/wtpr ordering so that rptr monotonically follows wptr per check.
I'm assuming rptr/wptrs are 32 bits wide by the use of ptr_mask on read/write functions so a respective mask of rptr/wptr wrap count to the top 32 bits would mark how far apart the rptr and wptr are per check.

Thanks,

Jon

>
> Christian.
>
> > +
> > +     spin_begin();
> > +     while (true) {
> > +             bool rptr_wrapped = false, wptr_wrapped = false;
> > +             u32 rptr, wptr;
> > +
> > +             spin_cpu_relax();
> > +
> > +             /* Update wptr checkpoint/wrap count compare. */
> > +             wptr = amdgpu_ih_get_wptr(adev, ih);
> > +             if (wptr < prev_wptr) {
> > +                     wptr_wrap++;
> > +                     wptr_check = wptr | (wptr_wrap << 32);
> > +                     wptr_wrapped = true;
> > +             }
> > +             prev_wptr = wptr;
> > +
> > +             /* Order wptr with rptr. */
> > +             rmb();
> > +
> > +             /* Update rptr/wrap count compare. */
> > +             rptr = READ_ONCE(ih->rptr);
> > +             if (rptr < prev_rptr) {
> > +                     rptr_wrap++;
> > +                     rptr_wrapped = true;
> > +             }
> > +             rptr_check = rptr | (rptr_wrap << 32);
> > +             prev_rptr = rptr;
> > +
> > +             /* Wrapping occurred so restart. */
> > +             if (rptr_wrapped || wptr_wrapped)
> > +                     continue;
> > +
> > +             /* Exit on reaching or passing checkpoint. */
> > +             if (rptr_check >= wptr_check &&
> > +                                     rptr >= (wptr_check & ih->ptr_mask))
> > +                     break;
> > +     }
> > +     spin_end();
> > +
> > +     return 0;
> > +}
> > +
> >   /**
> >    * amdgpu_ih_process - interrupt handler
> >    *
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > index 6ed4a85fc7c3..6817f0a812d2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > @@ -87,6 +87,8 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev,
> struct amdgpu_ih_ring *ih,
> >   void amdgpu_ih_ring_fini(struct amdgpu_device *adev, struct
> amdgpu_ih_ring *ih);
> >   void amdgpu_ih_ring_write(struct amdgpu_ih_ring *ih, const uint32_t *iv,
> >                         unsigned int num_dw);
> > +int amdgpu_ih_wait_on_checkpoint_process(struct amdgpu_device
> *adev,
> > +                                     struct amdgpu_ih_ring *ih);
> >   int amdgpu_ih_process(struct amdgpu_device *adev, struct
> amdgpu_ih_ring *ih);
> >   void amdgpu_ih_decode_iv_helper(struct amdgpu_device *adev,
> >                               struct amdgpu_ih_ring *ih,