[PATCwH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV

Thu Dec 19 22:44:25 UTC 2019

I'm thinking, if we know we're preparing for a GPU reset, maybe we 
shouldn't even try to suspend processes and stop the HIQ. 
kfd_suspend_all_processes, stop_cpsch and other functions up that call 
chain up to kgd2kfd_suspend could have a parameter (bool pre_reset) that 
would update the driver state but not touch the hardware. That avoids 
unnecessary timeouts on things that aren't expected to complete anyway.

Regards,
   Felix

On 2019-12-19 11:59 a.m., shaoyunl wrote:
>
> After check the code , in KFD side , should be simple just add the 
> check in stop_cpsch code . For kiq, there is no return for WREG32 , so 
> no easy way to check the return value . Maybe we can add kiq_status in 
> struct amdgpu_kiq  to indicate the kiq is hang or not ,  in 
> hdq_destroy function check this  kiq_status after acquire_queue , 
> finish the destroy function is kiq is hang for SRIOV only .
>
> Any comments ?
>
>
> shaoyun.liu
>
>
> On 2019-12-19 9:51 a.m., Liu, Shaoyun wrote:
>
>> I see, thanks for the detail information.
>> Normally when CP is hang, the hiq access to unmap the queue will 
>> failed before driver call to the hqd_destroy. I think driver should 
>> add the code to check the return value  and directly finish the 
>> pre_reset in this case . If the hiq does not hang but kiq hang. We 
>> can use the same logic in hqd_destroy function, return in first 
>> access failure instead go further.  With this change we probably can 
>> move the pre_reset function back to normal .
>> Felix, do you have any concerns or comments for the change.
>>
>> Regards
>> Shaoyun.liu
>> ------------------------------------------------------------------------
>> *From:* Liu, Monk <Monk.Liu at amd.com>
>> *Sent:* December 19, 2019 1:13:24 AM
>> *To:* Liu, Shaoyun <Shaoyun.Liu at amd.com>; 
>> amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>
>> *Subject:* RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR 
>> of SRIOV
>>
>> >>> I would like to check why we need a special sequences for sriov 
>> on this pre_reset. If possible, make it the same as bare metal mode 
>> sounds better solution.
>>
>> Because before VF FLR calling function would lead to register access 
>> through KIQ,  which will not complete because KIQ/GFX already hang by 
>> that time
>>
>> >>> I don't remember any register access by amdkfd_pre_reset call,   
>> please let me know if this assumption is wrong .
>>
>> Please check “void pm_uninit(struct packet_manager *pm)” which is 
>> invoked inside of amdkfd_pre_reset() :
>>
>> It will call uninitialized() in kfd_kernel_queue.c file
>>
>> And then go to the path of “kq->mqd_mgr->destroy_mqd(…)”
>>
>> And finally it calls “static int kgd_hqd_destroy(…)” in 
>> amdgpu_amdkfd_gfx_v10.c
>>
>> 539 {
>>
>> 540     struct amdgpu_device *adev = get_amdgpu_device(kgd);
>>
>> 541     enum hqd_dequeue_request_type type;
>>
>> 542     unsigned long end_jiffies;
>>
>> 543     uint32_t temp;
>>
>> 544     struct v10_compute_mqd *m = get_mqd(mqd);
>>
>> 545
>>
>> 546 #if 0
>>
>> 547     unsigned long flags;
>>
>> 548     int retry;
>>
>> 549 #endif
>>
>> 550
>>
>> 551     acquire_queue(kgd, pipe_id, queue_id); //this introduce 
>> register access via KIQ
>>
>> 552
>>
>> 553     if (m->cp_hqd_vmid == 0)
>>
>> 554         WREG32_FIELD15(GC, 0, RLC_CP_SCHEDULERS, scheduler1, 0); 
>> //this introduce register access via KIQ
>>
>> 555
>>
>> 556     switch (reset_type) {
>>
>> 557     case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
>>
>> 558         type = DRAIN_PIPE;
>>
>> 559         break;
>>
>> 560     case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
>>
>> 561         type = RESET_WAVES;
>>
>> 562         break;
>>
>> 563     default:
>>
>> 564         type = DRAIN_PIPE;
>>
>> 565         break;
>>
>> 566     }
>>
>> 624     WREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_DEQUEUE_REQUEST), 
>> type); //this introduce register access via KIQ
>>
>> 625
>>
>> 626     end_jiffies = (utimeout * HZ / 1000) + jiffies;
>>
>> 627     while (true) {
>>
>> 628         temp = RREG32(SOC15_REG_OFFSET(GC, 0, mmCP_HQD_ACTIVE)); 
>> //this introduce register access via KIQ
>>
>> 629         if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
>>
>> 630             break;
>>
>> 631         if (time_after(jiffies, end_jiffies)) {
>>
>> 632             pr_err("cp queue preemption time out.\n");
>>
>> 633             release_queue(kgd);
>>
>> 634             return -ETIME;
>>
>> 635         }
>>
>> 636         usleep_range(500, 1000);
>>
>> 637     }
>>
>> 638
>>
>> 639     release_queue(kgd);
>>
>> 640     return 0;
>>
>> If we use the sequence from bare-metal, all above highlighted 
>> register access will not work because KIQ/GFX already died by that 
>> time which means the amdkfd_pre_reset() is actually  not working as 
>> expected.
>>
>> _____________________________________
>>
>> Monk Liu|GPU Virtualization Team |AMD
>>
>> sig-cloud-gpu
>>
>> *From:* Liu, Shaoyun <Shaoyun.Liu at amd.com>
>> *Sent:* Thursday, December 19, 2019 12:30 PM
>> *To:* Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
>> *Subject:* Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR 
>> of SRIOV
>>
>> I don't remember any register access by amdkfd_pre_reset call,   
>> please let me know if this assumption is wrong .
>> This function will use hiq to access CP, in case CP already hang, we 
>> might not able to get the response from hw and will got a timeout. I 
>> think kfd internal should handle this. Felix already have some 
>> comments on that.
>> I would like to check why we need a special sequences for sriov on 
>> this pre_reset. If possible, make it the same as bare metal mode 
>> sounds better solution.
>>
>> Regards
>> Shaoyun.liu
>>
>> ------------------------------------------------------------------------
>>
>> *From:*Liu, Monk <Monk.Liu at amd.com <mailto:Monk.Liu at amd.com>>
>> *Sent:* December 18, 2019 10:52:47 PM
>> *To:* Liu, Shaoyun <Shaoyun.Liu at amd.com 
>> <mailto:Shaoyun.Liu at amd.com>>; amd-gfx at lists.freedesktop.org 
>> <mailto:amd-gfx at lists.freedesktop.org> <amd-gfx at lists.freedesktop.org 
>> <mailto:amd-gfx at lists.freedesktop.org>>
>> *Subject:* RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR 
>> of SRIOV
>>
>> Oh, by the way
>>
>> >>> Do we know the root cause why this function would ruin MEC ?
>>
>> Only we call this function right after VF FLR will ruin MEC and lead 
>> to following KIQ ring test fail , and on bare-metal it is called 
>> before gpu rest , so that's why on bare-metal we don't have this issue
>>
>> But the reason we cannot call it before VF FLR on SRIOV case was 
>> already stated in this thread
>>
>> Thanks
>> _____________________________________
>> Monk Liu|GPU Virtualization Team |AMD
>>
>>
>> -----Original Message-----
>> From: Liu, Monk
>> Sent: Thursday, December 19, 2019 11:49 AM
>> To: shaoyunl <shaoyun.liu at amd.com <mailto:shaoyun.liu at amd.com>>; 
>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>> Subject: RE: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of 
>> SRIOV
>>
>> Hi Shaoyun
>>
>> >>> Do we know the root cause why this function would ruin MEC ? From 
>> the logic, I think this function should be called before FLR since we 
>> need to disable the user queue submission first.
>> Right now I don't know which detail step lead to KIQ ring test fail, 
>> I totally agree with you that this func should be called before VF 
>> FLR, but we cannot do it and the reason is described in The comment:
>>
>> > if we do pre_reset() before VF FLR, it would go KIQ way to do register
>> > access and stuck there, because KIQ probably won't work by that time
>> > (e.g. you already made GFX hang)
>>
>>
>> >>> I remembered the function should use hiq to communicate with HW , 
>> shouldn't use kiq to access HW registerm,  has this been changed ?
>> Tis function use WREG32/RREG32 to do register access, like all other 
>> functions in KMD,  and WREG32/RREG32 will let KIQ to do the register 
>> access If we are under dynamic SRIOV  mode (means we are SRIOV VF and 
>> isn't under full exclusive mode)
>>
>> You see that if you call this func before EVENT_5 (event 5 triggers 
>> VF FLR) then it will run under dynamic mode and KIQ will handle the 
>> register access, which is not an option Since ME/MEC probably already 
>> hang ( if we are testing quark on gfx/compute rings)
>>
>> Do you have a good suggestion ?
>>
>> thanks
>> _____________________________________
>> Monk Liu|GPU Virtualization Team |AMD
>>
>>
>> -----Original Message-----
>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org 
>> <mailto:amd-gfx-bounces at lists.freedesktop.org>> On Behalf Of shaoyunl
>> Sent: Tuesday, December 17, 2019 11:38 PM
>> To: amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>> Subject: Re: [PATCH 2/2] drm/amdgpu: fix KIQ ring test fail in TDR of 
>> SRIOV
>>
>> I think amdkfd side depends on this call to stop the user queue, 
>> without this call, the user queue can submit to HW during the reset 
>> which could cause hang again ...
>> Do we know the root cause why this function would ruin MEC ? From the 
>> logic, I think this function should be called before FLR since we 
>> need to disable the user queue submission first.
>> I remembered the function should use hiq to communicate with HW , 
>> shouldn't use kiq to access HW registerm,  has this been changed ?
>>
>>
>> Regards
>> shaoyun.liu
>>
>>
>> On 2019-12-17 5:19 a.m., Monk Liu wrote:
>> > issues:
>> > MEC is ruined by the amdkfd_pre_reset after VF FLR done
>> >
>> > fix:
>> > amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF
>> > FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but
>> > there is a limitation to block this sequence:
>> > if we do pre_reset() before VF FLR, it would go KIQ way to do register
>> > access and stuck there, because KIQ probably won't work by that time
>> > (e.g. you already made GFX hang)
>> >
>> > so the best way right now is to simply remove it.
>> >
>> > Signed-off-by: Monk Liu <Monk.Liu at amd.com <mailto:Monk.Liu at amd.com>>
>> > ---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 --
>> >   1 file changed, 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > index 605cef6..ae962b9 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> > @@ -3672,8 +3672,6 @@ static int amdgpu_device_reset_sriov(struct 
>> amdgpu_device *adev,
>> >        if (r)
>> >                return r;
>> >
>> > -     amdgpu_amdkfd_pre_reset(adev);
>> > -
>> >        /* Resume IP prior to SMC */
>> >        r = amdgpu_device_ip_reinit_early_sriov(adev);
>> >        if (r)
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org <mailto:amd-gfx at lists.freedesktop.org>
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx