[PATCwH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

Fri Dec 20 00:27:52 UTC 2019

I can prepare a patch, but I can't test it thoroughly. It would need to 
be tested on bare-metal and SRIOV.

Regards,
   Felix

On 2019-12-19 18:25, shaoyunl wrote:
> How we prevent the  user queue from submitting on the  following FLR  
> if we didn't unmap the  user queues . It's possible that CP still not 
> hang when other part HW get hang and  need a reset .
>
> Om, but probably it's ok since after FLR , all the hqd will be reset 
> to unmapped by default by HW  and existing user queue need to be 
> re-created anyway ...
>
> I think i'm ok with your proposal . Can KFD team prepare the change ?
>
>
> Regards
>
> shaoyun.liu
>
>
>
> On 2019-12-19 5:44 p.m., Felix Kuehling wrote:
>> I'm thinking, if we know we're preparing for a GPU reset, maybe we 
>> shouldn't even try to suspend processes and stop the HIQ. 
>> kfd_suspend_all_processes, stop_cpsch and other functions up that 
>> call chain up to kgd2kfd_suspend could have a parameter (bool 
>> pre_reset) that would update the driver state but not touch the 
>> hardware. That avoids unnecessary timeouts on things that aren't 
>> expected to complete anyway.
>>
>> Regards,
>>   Felix
>>
>>
>> On 2019-12-19 11:59 a.m., shaoyunl wrote:
>>>
>>> After check the code , in KFD side , should be simple just add the 
>>> check in stop_cpsch code . For kiq, there is no return for WREG32 , 
>>> so no easy way to check the return value . Maybe we can add 
>>> kiq_status in struct amdgpu_kiq  to indicate the kiq is hang or not 
>>> ,  in hdq_destroy function check this kiq_status after acquire_queue 
>>> , finish the destroy function is kiq is hang for SRIOV only .
>>>
>>> Any comments ?
>>>
>>>
>>> shaoyun.liu
>>>
>>>
>>> On 2019-12-19 9:51 a.m., Liu, Shaoyun wrote:
>>>
>>>> I see, thanks for the detail information.
>>>> Normally when CP is hang, the hiq access to unmap the queue will 
>>>> failed before driver call to the hqd_destroy. I think driver should 
>>>> add the code to check the return value  and directly finish the 
>>>> pre_reset in this case . If the hiq does not hang but kiq hang. We 
>>>> can use the same logic in hqd_destroy function, return in first 
>>>> access failure instead go further.  With this change we probably 
>>>> can move the pre_reset function back to normal .
>>>> Felix, do you have any concerns or comments for the change.
>>>>
>>>> Regards
>>>> Shaoyun.liu
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> *From:* Liu, Monk <Monk.Liu at amd.com>
>>>> *Sent:* December 19, 2019 1:13:24 AM
>>>> *To:* Liu, Shaoyun <Shaoyun.Liu at amd.com>; 
>>>> amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
>>>> *Subject:* RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in 
>>>> TDR of SRIOV
>>>>
>>>> >>> I would like to check why we need a special sequences for sriov 
>>>> on this pre_reset. If possible, make it the same as bare metal mode 
>>>> sounds better solution.
>>>>
>>>> Because before VF FLR calling function would lead to register 
>>>> access through KIQ,  which will not complete because KIQ/GFX 
>>>> already hang by that time
>>>>
>>>> >>> I don't remember any register access by amdkfd_pre_reset 
>>>> call,   please let me know if this assumption is wrong .
>>>>
>>>> Please check “void pm_uninit(struct packet_manager *pm)” which is 
>>>> invoked inside of amdkfd_pre_reset() :
>>>>
>>>> It will call uninitialized() in kfd_kernel_queue.c file
>>>>
>>>> And then go to the path of “kq->mqd_mgr->destroy_mqd(…)”
>>>>
>>>> And finally it calls “static int kgd_hqd_destroy(…)” in 
>>>> amdgpu_amdkfd_gfx_v10.c
>>>>
>>>> 539 {
>>>>
>>>> 540     struct amdgpu_device *adev = get_amdgpu_device(kgd);
>>>>
>>>> 541     enum hqd_dequeue_request_type type;
>>>>
>>>> 542     unsigned long end_jiffies;
>>>>
>>>> 543     uint32_t temp;
>>>>
>>>> 544     struct v10_compute_mqd *m = get_mqd(mqd);
>>>>
>>>> 545
>>>>
>>>> 546 #if 0
>>>>
>>>> 547     unsigned long flags;
>>>>
>>>> 548     int retry;
>>>>
>>>> 549 #endif
>>>>
>>>> 550
>>>>
>>>> 551     acquire_queue(kgd, pipe_id, queue_id); //this introduce 
>>>> register access via KIQ
>>>>
>>>> 552
>>>>
>>>> 553     if (m->cp_hqd_vmid == 0)
>>>>
>>>> 554         WREG32_FIELD15(GC, 0, RLC_CP_SCHEDULERS, scheduler1, 
>>>> 0); //this introduce register access via KIQ
>>>>
>>>> 555
>>>>
>>>> 556     switch (reset_type) {
>>>>
>>>> 557     case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
>>>>
>>>> 558         type = DRAIN_PIPE;
>>>>
>>>> 559         break;
>>>>
>>>> 560     case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
>>>>
>>>> 561         type = RESET_WAVES;
>>>>
>>>> 562         break;
>>>>
>>>> 563     default:
>>>>
>>>> 564         type = DRAIN_PIPE;
>>>>
>>>> 565         break;
>>>>
>>>> 566     }
>>>>
>>>> 624     WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_DEQUEUE_REQUEST), 
>>>> type); //this introduce register access via KIQ
>>>>
>>>> 625
>>>>
>>>> 626     end_jiffies = (utimeout * HZ / 1000) + jiffies;
>>>>
>>>> 627     while (true) {
>>>>
>>>> 628         temp = RREG32(SOC15_REG_OFFSET(GC, 0, 
>>>> mmCP_HQD_ACTIVE)); //this introduce register access via KIQ
>>>>
>>>> 629         if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
>>>>
>>>> 630             break;
>>>>
>>>> 631         if (time_after(jiffies, end_jiffies)) {
>>>>
>>>> 632             pr_err("cp queue preemption time out.\n");
>>>>
>>>> 633             release_queue(kgd);
>>>>
>>>> 634             return -ETIME;
>>>>
>>>> 635         }
>>>>
>>>> 636         usleep_range(500, 1000);
>>>>
>>>> 637     }
>>>>
>>>> 638
>>>>
>>>> 639     release_queue(kgd);
>>>>
>>>> 640     return 0;
>>>>
>>>> If we use the sequence from bare-metal, all above highlighted 
>>>> register access will not work because KIQ/GFX already died by that 
>>>> time which means the amdkfd_pre_reset() is actually  not working as 
>>>> expected.
>>>>
>>>> _____________________________________
>>>>
>>>> Monk Liu|GPU Virtualization Team |AMD
>>>>
>>>> sig-cloud-gpu
>>>>
>>>> *From:* Liu, Shaoyun <Shaoyun.Liu at amd.com>
>>>> *Sent:* Thursday, December 19, 2019 12:30 PM
>>>> *To:* Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
>>>> *Subject:* Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in 
>>>> TDR of SRIOV
>>>>
>>>> I don't remember any register access by amdkfd_pre_reset call,   
>>>> please let me know if this assumption is wrong .
>>>> This function will use hiq to access CP, in case CP already hang, 
>>>> we might not able to get the response from hw and will got a 
>>>> timeout. I think kfd internal should handle this. Felix already 
>>>> have some comments on that.
>>>> I would like to check why we need a special sequences for sriov on 
>>>> this pre_reset. If possible, make it the same as bare metal mode 
>>>> sounds better solution.
>>>>
>>>> Regards
>>>> Shaoyun.liu
>>>>
>>>> ------------------------------------------------------------------------ 
>>>>
>>>>
>>>> *From:*Liu, Monk <Monk.Liu at amd.com <mailto:Monk.Liu at amd.com>>
>>>> *Sent:* December 18, 2019 10:52:47 PM
>>>> *To:* Liu, Shaoyun <Shaoyun.Liu at amd.com 
>>>> <mailto:Shaoyun.Liu at amd.com>>; amd-gfx at lists.freedesktop.org 
>>>> <mailto:amd-gfx at lists.freedesktop.org> 
>>>> <amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>>
>>>> *Subject:* RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in 
>>>> TDR of SRIOV
>>>>
>>>> Oh, by the way
>>>>
>>>> >>> Do we know the root cause why this function would ruin MEC ?
>>>>
>>>> Only we call this function right after VF FLR will ruin MEC and 
>>>> lead to following KIQ ring test fail , and on bare-metal it is 
>>>> called before gpu rest , so that's why on bare-metal we don't have 
>>>> this issue
>>>>
>>>> But the reason we cannot call it before VF FLR on SRIOV case was 
>>>> already stated in this thread
>>>>
>>>> Thanks
>>>> _____________________________________
>>>> Monk Liu|GPU Virtualization Team |AMD
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Liu, Monk
>>>> Sent: Thursday, December 19, 2019 11:49 AM
>>>> To: shaoyunl <shaoyun.liu at amd.com <mailto:shaoyun.liu at amd.com>>; 
>>>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>>>> Subject: RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR 
>>>> of SRIOV
>>>>
>>>> Hi Shaoyun
>>>>
>>>> >>> Do we know the root cause why this function would ruin MEC ? 
>>>> From the logic, I think this function should be called before FLR 
>>>> since we need to disable the user queue submission first.
>>>> Right now I don't know which detail step lead to KIQ ring test 
>>>> fail, I totally agree with you that this func should be called 
>>>> before VF FLR, but we cannot do it and the reason is described in 
>>>> The comment:
>>>>
>>>> > if we do pre_reset() before VF FLR, it would go KIQ way to do 
>>>> register
>>>> > access and stuck there, because KIQ probably won't work by that time
>>>> > (e.g. you already made GFX hang)
>>>>
>>>>
>>>> >>> I remembered the function should use hiq to communicate with HW 
>>>> , shouldn't use kiq to access HW registerm,  has this been changed ?
>>>> Tis function use WREG32/RREG32 to do register access, like all 
>>>> other functions in KMD,  and WREG32/RREG32 will let KIQ to do the 
>>>> register access If we are under dynamic SRIOV mode (means we are 
>>>> SRIOV VF and isn't under full exclusive mode)
>>>>
>>>> You see that if you call this func before EVENT_5 (event 5 triggers 
>>>> VF FLR) then it will run under dynamic mode and KIQ will handle the 
>>>> register access, which is not an option Since ME/MEC probably 
>>>> already hang ( if we are testing quark on gfx/compute rings)
>>>>
>>>> Do you have a good suggestion ?
>>>>
>>>> thanks
>>>> _____________________________________
>>>> Monk Liu|GPU Virtualization Team |AMD
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org 
>>>> <mailto:amd-gfx-bounces at lists.freedesktop.org>> On Behalf Of shaoyunl
>>>> Sent: Tuesday, December 17, 2019 11:38 PM
>>>> To: amd-gfx at lists.freedesktop.org 
>>>> <mailto:amd-gfx at lists.freedesktop.org>
>>>> Subject: Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR 
>>>> of SRIOV
>>>>
>>>> I think amdkfd side depends on this call to stop the user queue, 
>>>> without this call, the user queue can submit to HW during the reset 
>>>> which could cause hang again ...
>>>> Do we know the root cause why this function would ruin MEC ? From 
>>>> the logic, I think this function should be called before FLR since 
>>>> we need to disable the user queue submission first.
>>>> I remembered the function should use hiq to communicate with HW , 
>>>> shouldn't use kiq to access HW registerm,  has this been changed ?
>>>>
>>>>
>>>> Regards
>>>> shaoyun.liu
>>>>
>>>>
>>>> On 2019-12-17 5:19 a.m., Monk Liu wrote:
>>>> > issues:
>>>> > MEC is ruined by the amdkfd_pre_reset after VF FLR done
>>>> >
>>>> > fix:
>>>> > amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF
>>>> > FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but
>>>> > there is a limitation to block this sequence:
>>>> > if we do pre_reset() before VF FLR, it would go KIQ way to do 
>>>> register
>>>> > access and stuck there, because KIQ probably won't work by that time
>>>> > (e.g. you already made GFX hang)
>>>> >
>>>> > so the best way right now is to simply remove it.
>>>> >
>>>> > Signed-off-by: Monk Liu <Monk.Liu at amd.com <mailto:Monk.Liu at amd.com>>
>>>> > ---
>>>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 --
>>>> >   1 file changed, 2 deletions(-)
>>>> >
>>>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> > index 605cef6..ae962b9 100644
>>>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> > @@ -3672,8 +3672,6 @@ static int amdgpu_device_reset_sriov(struct 
>>>> amdgpu_device *adev,
>>>> >        if (r)
>>>> >                return r;
>>>> >
>>>> > -     amdgpu_amdkfd_pre_reset(adev);
>>>> > -
>>>> >        /* Resume IP prior to SMC */
>>>> >        r = amdgpu_device_ip_reinit_early_sriov(adev);
>>>> >        if (r)
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx