[PATCH 02/11] drm/amdgpu: send IVs to the KFD only after processing them v2

Kuehling, Felix Felix.Kuehling at amd.com
Mon Dec 3 16:31:44 UTC 2018


On 2018-12-01 9:11 a.m., Christian König wrote:
>> Won't this break VM fault handling in KFD?
> No, we still send all VM faults to KFD after processing them. Only
> filtered retries are not sent to the KFD any more.

OK, I missed that src->funcs->process returning 0 means "not handled" and
returning >0 means "handled". Currently I don't see any interrupt
processing callbacks returning >0. I think that gets added in patch 4.
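
To spell out the convention as I now read it, here is a minimal standalone sketch, not the driver code. The process_fn type, forward_to_kfd() and the three stub callbacks are hypothetical names made up for illustration; only the r < 0 / r == 0 / r > 0 handling mirrors the hunk in this patch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for src->funcs->process():
 *   < 0  error while processing
 *   == 0 not handled (not an error)
 *   > 0  handled
 */
typedef int (*process_fn)(unsigned src_id);

/* Returns true when the IV would still be forwarded to the KFD. */
static bool forward_to_kfd(process_fn process, unsigned src_id)
{
	bool handled = false;
	int r = process(src_id);

	if (r < 0)
		fprintf(stderr, "error processing interrupt (%d)\n", r);
	else if (r)
		handled = true;

	/* Same decision as the "if (!handled)" check at the end of the patch. */
	return !handled;
}

/* Illustrative callbacks, one per return-value class. */
static int process_not_handled(unsigned src_id) { (void)src_id; return 0; }
static int process_handled(unsigned src_id)     { (void)src_id; return 1; }
static int process_error(unsigned src_id)       { (void)src_id; return -22; }
```

Under this reading, both the "not handled" and the error case leave handled false, so only a callback returning >0 keeps the IV away from the KFD.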


>
>> As far as I can tell, the only code path that leaves IRQs unhandled
>> and passes them to KFD prints an error message in the kernel log. We
>> can't have the kernel log flooded with error messages every time
>> there are IRQs for KFD. We can get extremely high frequency
>> interrupts for HSA signals.
> Since the KFD didn't filter the faults this would have been a
> problem before as well.

I missed that r == 0 means not handled without being an error.


>
> So I'm pretty sure that we already have registered handlers for all
> interrupts the KFD is interested in as well.

No. As far as I can tell, you're missing these two:

GFX_9_0__SRCID__CP_BAD_OPCODE_ERROR (183)
GFX_9_0__SRCID__SQ_INTERRUPT_ID (239)

239 is used for signaling events from shaders and can be very frequent.
Triggering an error message on those interrupts would be bad.

Regards,
  Felix


>
> Regards,
> Christian.
>
> Am 30.11.18 um 17:31 schrieb Kuehling, Felix:
>> Won't this break VM fault handling in KFD? I don't see a way with the
>> current code that you can leave some VM faults for KFD to process. If
>> we could consider VM faults with VMIDs 8-15 as not handled in amdgpu
>> and leave them for KFD to process, then this could work.
>>
>> As far as I can tell, the only code path that leaves IRQs unhandled
>> and passes them to KFD prints an error message in the kernel log. We
>> can't have the kernel log flooded with error messages every time
>> there are IRQs for KFD. We can get extremely high frequency
>> interrupts for HSA signals.
>>
>> Regards,
>>    Felix
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>> Alex Deucher
>> Sent: Friday, November 30, 2018 10:03 AM
>> To: Christian König <ckoenig.leichtzumerken at gmail.com>
>> Cc: amd-gfx list <amd-gfx at lists.freedesktop.org>
>> Subject: Re: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after
>> processing them v2
>>
>> On Fri, Nov 30, 2018 at 7:36 AM Christian König
>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>> This allows us to filter out VM faults in the GMC code.
>>>
>>> v2: don't filter out all faults
>>>
>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> Acked-by: Alex Deucher <alexander.deucher at amd.com>
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 29 +++++++++++++++----------
>>>   1 file changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> index 6b6524f04ce0..6db4c58ddc13 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> @@ -149,9 +149,6 @@ static void amdgpu_irq_callback(struct amdgpu_device *adev,
>>>          if (!amdgpu_ih_prescreen_iv(adev))
>>>                  return;
>>>
>>> -       /* Before dispatching irq to IP blocks, send it to amdkfd */
>>> -       amdgpu_amdkfd_interrupt(adev, (const void *) &ih->ring[ring_index]);
>>> -
>>>          entry.iv_entry = (const uint32_t *)&ih->ring[ring_index];
>>>          amdgpu_ih_decode_iv(adev, &entry);
>>>
>>> @@ -371,29 +368,31 @@ void amdgpu_irq_dispatch(struct amdgpu_device *adev,
>>>          unsigned client_id = entry->client_id;
>>>          unsigned src_id = entry->src_id;
>>>          struct amdgpu_irq_src *src;
>>> +       bool handled = false;
>>>          int r;
>>>
>>>          trace_amdgpu_iv(entry);
>>>
>>>          if (client_id >= AMDGPU_IRQ_CLIENTID_MAX) {
>>> -               DRM_DEBUG("Invalid client_id in IV: %d\n", client_id);
>>> +               DRM_ERROR("Invalid client_id in IV: %d\n", client_id);
>>>                  return;
>>>          }
>>>
>>>          if (src_id >= AMDGPU_MAX_IRQ_SRC_ID) {
>>> -               DRM_DEBUG("Invalid src_id in IV: %d\n", src_id);
>>> +               DRM_ERROR("Invalid src_id in IV: %d\n", src_id);
>>>                  return;
>>>          }
>>>
>>>          if (adev->irq.virq[src_id]) {
>>>                  generic_handle_irq(irq_find_mapping(adev->irq.domain, src_id));
>>> -       } else {
>>> -               if (!adev->irq.client[client_id].sources) {
>>> -                       DRM_DEBUG("Unregistered interrupt client_id: %d src_id: %d\n",
>>> -                                 client_id, src_id);
>>> -                       return;
>>> -               }
>>> +               return;
>>> +       }
>>>
>>> +       if (!adev->irq.client[client_id].sources) {
>>> +               DRM_DEBUG("Unregistered interrupt client_id: %d src_id: %d\n",
>>> +                         client_id, src_id);
>>> +               return;
>>> +       } else {
>>>                  src = adev->irq.client[client_id].sources[src_id];
>>>                  if (!src) {
>>>                          DRM_DEBUG("Unhandled interrupt src_id: %d\n", src_id);
>>> @@ -401,9 +400,15 @@ void amdgpu_irq_dispatch(struct amdgpu_device *adev,
>>>                  }
>>>
>>>                  r = src->funcs->process(adev, src, entry);
>>> -               if (r)
>>> +               if (r < 0)
>>>                          DRM_ERROR("error processing interrupt (%d)\n", r);
>>> +               else if (r)
>>> +                       handled = true;
>>>          }
>>> +
>>> +       /* Send it to amdkfd as well if it isn't already handled */
>>> +       if (!handled)
>>> +               amdgpu_amdkfd_interrupt(adev, entry->iv_entry);
>>>   }
>>>
>>>   /**
>>> -- 
>>> 2.17.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>