[PATCH 1/3] drm/amdkfd: update parameter for event_interrupt_poison_consumption

Zhou1, Tao Tao.Zhou1 at amd.com
Tue Mar 15 07:10:56 UTC 2022


[AMD Official Use Only]



> -----Original Message-----
> From: Kuehling, Felix <Felix.Kuehling at amd.com>
> Sent: Tuesday, March 15, 2022 2:25 AM
> To: Zhou1, Tao <Tao.Zhou1 at amd.com>; amd-gfx at lists.freedesktop.org; Zhang,
> Hawking <Hawking.Zhang at amd.com>; Yang, Stanley
> <Stanley.Yang at amd.com>; Chai, Thomas <YiPeng.Chai at amd.com>
> Subject: Re: [PATCH 1/3] drm/amdkfd: update parameter for
> event_interrupt_poison_consumption
> 
> Am 2022-03-14 um 03:03 schrieb Tao Zhou:
> > Other parameters can be gotten from ih_ring_entry, so only inputting
> > ih_ring_entry is enough.
> 
> I'm not sure what's the reason for this change. You remove one parameter, but
> end up duplicating the SOC15_..._FROM_IH_RING_ENTRY translations. It
> doesn't look like a net improvement to me.

[Tao] source_id, pasid and client_id will all need to be passed in, and I wanted to reduce the number of parameters. I'll drop the change.

> 
> Looking at this function a bit more, this code looks problematic:
> 
>          if (atomic_read(&p->poison)) {
>                  kfd_unref_process(p);
>                  return;
>          }
> 
>          atomic_set(&p->poison, 1);
>          kfd_unref_process(p);
> 
> Doing the read and set as two separate operations is not atomic. You should use
> atomic_cmpxchg here to make sure the poison-consumption is handled only
> once:
> 
> 	old_poison = atomic_cmpxchg(&p->poison, 0, 1);
> 	kfd_unref_process(p);
> 	if (old_poison)
> 		return;
> 	/* handle poison consumption */
> 
> Alternatively you could use atomic_inc_return and do the poison handling only if
> that returns exactly 1.

[Tao] thanks, accepted.
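
For reference, the difference between the two patterns can be sketched in userspace C11 as below. This is only an illustration, not the driver code: it uses <stdatomic.h> as a stand-in for the kernel's atomic_t helpers, and the function names claim_racy, claim_cmpxchg and claim_inc are hypothetical. Each function returns true for the caller that should go on to handle the poison consumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Racy check-then-set, mirroring the atomic_read() + atomic_set() pair.
 * Two concurrent callers can both load 0 before either stores 1, so both
 * may believe they are the first to see the poison.
 */
static bool claim_racy(atomic_int *flag)
{
	if (atomic_load(flag))
		return false;
	atomic_store(flag, 1);
	return true;
}

/*
 * Compare-and-exchange, the userspace analog of atomic_cmpxchg(flag, 0, 1).
 * The swap from 0 to 1 succeeds for exactly one caller.
 */
static bool claim_cmpxchg(atomic_int *flag)
{
	int expected = 0;

	return atomic_compare_exchange_strong(flag, &expected, 1);
}

/*
 * Increment variant, the analog of atomic_inc_return(flag) == 1.
 * atomic_fetch_add() returns the old value, so the first caller sees 0.
 */
static bool claim_inc(atomic_int *flag)
{
	return atomic_fetch_add(flag, 1) + 1 == 1;
}
```

In a single thread all three behave the same (the first call wins, later calls lose); only the last two keep that guarantee when several interrupt work items run concurrently. With the increment variant the flag keeps counting past 1, so it would no longer be a pure 0/1 value.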

> 
> Regards,
>    Felix
> 
> 
> >
> > Signed-off-by: Tao Zhou <tao.zhou1 at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > index 7eedbcd14828..f7def0bf0730 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
> > @@ -91,11 +91,16 @@ enum SQ_INTERRUPT_ERROR_TYPE {
> >   #define KFD_SQ_INT_DATA__ERR_TYPE__SHIFT 20
> >
> >   static void event_interrupt_poison_consumption(struct kfd_dev *dev,
> > -				uint16_t pasid, uint16_t source_id)
> > +				const uint32_t *ih_ring_entry)
> >   {
> > +	uint16_t source_id, pasid;
> >   	int ret = -EINVAL;
> > -	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
> > +	struct kfd_process *p;
> >
> > +	source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
> > +	pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
> > +
> > +	p = kfd_lookup_process_by_pasid(pasid);
> >   	if (!p)
> >   		return;
> >
> > @@ -270,7 +275,7 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev,
> >   					sq_intr_err);
> >   				if (sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_ILLEGAL_INST &&
> >   					sq_intr_err != SQ_INTERRUPT_ERROR_TYPE_MEMVIOL) {
> > -					event_interrupt_poison_consumption(dev, pasid, source_id);
> > +					event_interrupt_poison_consumption(dev, ih_ring_entry);
> >   					return;
> >   				}
> >   				break;
> > @@ -291,7 +296,7 @@ static void event_interrupt_wq_v9(struct kfd_dev *dev,
> >   		if (source_id == SOC15_INTSRC_SDMA_TRAP) {
> >   			kfd_signal_event_interrupt(pasid, context_id0 & 0xfffffff, 28);
> >   		} else if (source_id == SOC15_INTSRC_SDMA_ECC) {
> > -			event_interrupt_poison_consumption(dev, pasid, source_id);
> > +			event_interrupt_poison_consumption(dev, ih_ring_entry);
> >   			return;
> >   		}
> >   	} else if (client_id == SOC15_IH_CLIENTID_VMC ||

