[PATCH 02/11] drm/amdgpu: send IVs to the KFD only after processing them v2
Kuehling, Felix
Felix.Kuehling at amd.com
Mon Dec 3 16:31:44 UTC 2018
On 2018-12-01 9:11 a.m., Christian König wrote:
>> Won't this break VM fault handling in KFD?
> No, we still send all VM faults to KFD after processing them. Only
> filtered retries are not send to the KFD any more.
OK, I missed that src->funcs->process returning 0 means "not handled",
>0 means "handled". Currently I don't see any interrupt processing
callbacks returning >0. I think that gets added in patch 4.
>
>> As far as I can tell, the only code path that leave IRQs unhandled
>> and passes them to KFD prints an error message in the kernel log. We
>> can't have the kernel log flooded with error messages every time
>> there are IRQs for KFD. We can get extremely high frequency
>> interrupts for HSA signals.
> Since the KFD didn't filtered the faults this would have a been a
> problem before as well.
I missed that r == 0 means not handled without being an error.
>
> So I'm pretty sure that we already have registered handlers for all
> interrupts the KFD is interested in as well.
No. As far as I can tell, you're missing these two:
GFX_9_0__SRCID__CP_BAD_OPCODE_ERROR (183)
GFX_9_0__SRCID__SQ_INTERRUPT_ID (239)
239 is used for signaling events from shaders and can be very frequent.
Triggering an error message on those interrupts would be bad.
Regards,
Felix
>
> Regards,
> Christian.
>
> Am 30.11.18 um 17:31 schrieb Kuehling, Felix:
>> Won't this break VM fault handling in KFD? I don't see a way with the
>> current code that you can leave some VM faults for KFD to process. If
>> we could consider VM faults with VMIDs 8-15 as not handled in amdgpu
>> and leave them for KFD to process, then this could work.
>>
>> As far as I can tell, the only code path that leave IRQs unhandled
>> and passes them to KFD prints an error message in the kernel log. We
>> can't have the kernel log flooded with error messages every time
>> there are IRQs for KFD. We can get extremely high frequency
>> interrupts for HSA signals.
>>
>> Regards,
>> Felix
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>> Alex Deucher
>> Sent: Friday, November 30, 2018 10:03 AM
>> To: Christian König <ckoenig.leichtzumerken at gmail.com>
>> Cc: amd-gfx list <amd-gfx at lists.freedesktop.org>
>> Subject: Re: [PATCH 02/11] drm/amdgpu: send IVs to the KFD only after
>> processing them v2
>>
>> On Fri, Nov 30, 2018 at 7:36 AM Christian König
>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>> This allows us to filter out VM faults in the GMC code.
>>>
>>> v2: don't filter out all faults
>>>
>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> Acked-by: Alex Deucher <alexander.deucher at amd.com>
>>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 29
>>> +++++++++++++++----------
>>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> index 6b6524f04ce0..6db4c58ddc13 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>>> @@ -149,9 +149,6 @@ static void amdgpu_irq_callback(struct
>>> amdgpu_device *adev,
>>> if (!amdgpu_ih_prescreen_iv(adev))
>>> return;
>>>
>>> - /* Before dispatching irq to IP blocks, send it to amdkfd */
>>> - amdgpu_amdkfd_interrupt(adev, (const void *)
>>> &ih->ring[ring_index]);
>>> -
>>> entry.iv_entry = (const uint32_t *)&ih->ring[ring_index];
>>> amdgpu_ih_decode_iv(adev, &entry);
>>>
>>> @@ -371,29 +368,31 @@ void amdgpu_irq_dispatch(struct amdgpu_device
>>> *adev,
>>> unsigned client_id = entry->client_id;
>>> unsigned src_id = entry->src_id;
>>> struct amdgpu_irq_src *src;
>>> + bool handled = false;
>>> int r;
>>>
>>> trace_amdgpu_iv(entry);
>>>
>>> if (client_id >= AMDGPU_IRQ_CLIENTID_MAX) {
>>> - DRM_DEBUG("Invalid client_id in IV: %d\n", client_id);
>>> + DRM_ERROR("Invalid client_id in IV: %d\n", client_id);
>>> return;
>>> }
>>>
>>> if (src_id >= AMDGPU_MAX_IRQ_SRC_ID) {
>>> - DRM_DEBUG("Invalid src_id in IV: %d\n", src_id);
>>> + DRM_ERROR("Invalid src_id in IV: %d\n", src_id);
>>> return;
>>> }
>>>
>>> if (adev->irq.virq[src_id]) {
>>>
>>> generic_handle_irq(irq_find_mapping(adev->irq.domain, src_id));
>>> - } else {
>>> - if (!adev->irq.client[client_id].sources) {
>>> - DRM_DEBUG("Unregistered interrupt client_id:
>>> %d src_id: %d\n",
>>> - client_id, src_id);
>>> - return;
>>> - }
>>> + return;
>>> + }
>>>
>>> + if (!adev->irq.client[client_id].sources) {
>>> + DRM_DEBUG("Unregistered interrupt client_id: %d
>>> src_id: %d\n",
>>> + client_id, src_id);
>>> + return;
>>> + } else {
>>> src = adev->irq.client[client_id].sources[src_id];
>>> if (!src) {
>>> DRM_DEBUG("Unhandled interrupt src_id: %d\n",
>>> src_id); @@ -401,9 +400,15 @@ void amdgpu_irq_dispatch(struct
>>> amdgpu_device *adev,
>>> }
>>>
>>> r = src->funcs->process(adev, src, entry);
>>> - if (r)
>>> + if (r < 0)
>>> DRM_ERROR("error processing interrupt (%d)\n",
>>> r);
>>> + else if (r)
>>> + handled = true;
>>> }
>>> +
>>> + /* Send it to amdkfd as well if it isn't already handled */
>>> + if (!handled)
>>> + amdgpu_amdkfd_interrupt(adev, entry->iv_entry);
>>> }
>>>
>>> /**
>>> --
>>> 2.17.1
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
More information about the amd-gfx
mailing list