<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <meta name="Generator" content="Microsoft Exchange Server"> <style></style> </head> <body> <meta content="text/html; charset=UTF-8"> <style type="text/css" style="">  </style> <div dir="ltr"> <div id="x_divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Arial,Helvetica,sans-serif"> No, we already have it working, and suspend makes no sense for SRIOV: when the guest side needs a GPU reset, the hypervisor performs a VF FLR, so calling amdgpu_suspend() after the VF FLR is entirely redundant. This series of 20 patches has been verified on KVM/XEN platforms; it can recover from a guest hang. </div> <hr tabindex="-1" style="display:inline-block; width:98%"> <div id="x_divRplyFwdMsg" dir="ltr">From: Christian König <deathsimple@vodafone.de> Sent: February 7, 2017 19:12:04 To: Liu, Monk; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin <div> </div> </div> </div> <div class="PlainText">> Because gpu_srio_reset (will send patch for this routine later) doesn't call amdgpu_suspend() That's most likely not a good idea. Suspend and resume should always be paired, otherwise you run into exactly those problems and the GART is most likely only the tip of the iceberg here. For example you also mess up the reference counting for the buffer containing the UVD and VCE firmware (ok that won't affect SRIOV for now). Maybe you just want to call hw_init() instead of a resume here? Regards, Christian. On 06.02.2017 at 16:55, Liu, Monk wrote: > I recall why I made this patch > > When testing the SRIOV GPU reset feature, it always waits and never returns without this patch. Looking into it further: > > Because gpu_srio_reset (will send patch for this routine later) doesn't call amdgpu_suspend(), the GART table BO is never unpinned, which leads to an infinite wait loop in the driver if we pin it again in resume. 
> > For the bare-metal case, gpu_reset calls amdgpu_suspend, so the GART BO is unpinned. > > BTW: > GPU_SRIOV_RESET is invoked after the hypervisor calls VF_FLR on this VF device, so the suspend routines of the IP blocks are not needed at all. > > What about: >>> + if (adev->gart.table_addr && amdgpu_sriov_vf(adev)) { >>> + /* it's a resume call, gart already pin */ >>> + return 0; >>> + } > > BR Monk > > > -----Original Message----- > From: Christian König [<a href="mailto:deathsimple@vodafone.de">mailto:deathsimple@vodafone.de</a>] > Sent: Monday, February 06, 2017 10:31 PM > To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin > > Huh? We shouldn't need to call this function from a GPU reset; do we really do so? > > But even if we call it from GPU reset we certainly should have called the matching unpin function before. > > Otherwise we certainly won't be able to resume from the next suspend after a GPU reset. > > Regards, > Christian. > > On 06.02.2017 at 15:25, Liu, Monk wrote: >> Emmmm, looks like I missed the S3 part >> >> But if this is called from a GPU reset, we also shouldn't continue running this >> function, otherwise the GPU reset will fail (SRIOV reset test) >> >> BR Monk >> >> -----Original Message----- >> From: Christian König [<a href="mailto:deathsimple@vodafone.de">mailto:deathsimple@vodafone.de</a>] >> Sent: Monday, February 06, 2017 4:14 PM >> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org >> Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin >> >> A big NAK on this! amdgpu_gart_table_vram_unpin() must be called during suspend. >> >> Otherwise the GART table can be corrupted and we run into a whole bunch of problems. >> >> We could add a "BUG_ON(adev->gart.table_addr != NULL);" here to double check that, but just ignoring that something went horribly wrong is clearly the wrong approach. >> >> Regards, >> Christian. 
>> >> On 04.02.2017 at 11:34, Monk Liu wrote: >>> if this call is from resume, shouldn't enter pin logic at all >>> >>> Change-Id: I40a5cdc2a716c4c20d2812fd74ece4ea284b6765 >>> Signed-off-by: Monk Liu <Monk.Liu@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c >>> index 964d2a9..5e907f7 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c >>> @@ -151,6 +151,11 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev) >>> uint64_t gpu_addr; >>> int r; >>> >>> + if (adev->gart.table_addr) { >>> + /* it's a resume call, gart already pin */ >>> + return 0; >>> + } >>> + >>> r = amdgpu_bo_reserve(adev->gart.robj, false); >>> if (unlikely(r != 0)) >>> return r; >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> <a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a> </div> </body> </html>