<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from text --><style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<meta content="text/html; charset=UTF-8">
<style type="text/css" style="">
<!--
p
{margin-top:0;
margin-bottom:0}
-->
</style>
<div dir="ltr">
<div id="x_divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:#000000; font-family:Calibri,Arial,Helvetica,sans-serif">
<p>no, we already make it work, and suspend is totally nonsense for SRIOV,</p>
<p>because when guest side need a GPU reset, hypervisor will do VF flr, so after vf flr amdgpu_suspen() is redundant at all.</p>
<p><br>
</p>
<p>those 20 patches serial is verified on KVM/XEN platform, it can recover back from guest hang.
<br>
</p>
<p><br>
</p>
<p><br>
</p>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>发件人:</b> Christian König <deathsimple@vodafone.de><br>
<b>发送时间:</b> 2017年2月7日 19:12:04<br>
<b>收件人:</b> Liu, Monk; amd-gfx@lists.freedesktop.org<br>
<b>主题:</b> Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:10pt;">
<div class="PlainText">> Because gpu_srio_reset (will send patch for this routine later) doesn't call amdgpu_suspend()<br>
That's most likely not a good idea.<br>
<br>
Suspend and resume should always be paired, otherwise you run into <br>
exactly those problems and the GART is most likely only the tip of the <br>
iceberg here.<br>
<br>
For example you also mess up the reference counting for buffer <br>
containing the UVD and VCE firmware (ok that won't affect SRIOV for now).<br>
<br>
Maybe you just want to call hw_init() instead of a resume here?<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 06.02.2017 um 16:55 schrieb Liu, Monk:<br>
> I recall why I made this patch<br>
><br>
> When testing SRIOV gpu reset feature, I it will always waiting and not return if without this patch, with more look into it:<br>
><br>
> Because gpu_srio_reset (will send patch for this routine later) doesn't call amdgpu_suspend(), so the gart table BO won't get unpin, which lead to driver infinite wait loop if we pin it again in resume.<br>
> <br>
> For bare-metal case, gpu_reset will call amdgpu_suspend so the gart bo will unpin.<br>
><br>
> BTW:<br>
> GPU_SRIOV_RESET is invoked after HYPERVISOR call VF_FLR on this vf device, so all IP blocks's suspend routine is not needed at all.<br>
><br>
> What about:<br>
>>> + if (adev->gart.table_addr && amdgpu_sriov_vf(adev)) {<br>
>>> + /* it's a resume call, gart already pin */<br>
>>> + return 0;<br>
>>> + }<br>
><br>
> BR Monk<br>
><br>
><br>
> -----Original Message-----<br>
> From: Christian König [<a href="mailto:deathsimple@vodafone.de">mailto:deathsimple@vodafone.de</a>]<br>
> Sent: Monday, February 06, 2017 10:31 PM<br>
> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org<br>
> Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin<br>
><br>
> Hui? We shouldn't need to call this function from a GPU reset, do we really do so?<br>
><br>
> But even if we call it from GPU reset we certainly should have called the matching unpin function before.<br>
><br>
> Otherwise we certainly won't be able to resume from the next suspend after a GPU reset.<br>
><br>
> Regards,<br>
> Christian.<br>
><br>
> Am 06.02.2017 um 15:25 schrieb Liu, Monk:<br>
>> Emmmm looks like I missed the part of S3 function<br>
>><br>
>> But if this is from a GPU reset , we also shouldn't continue run this<br>
>> function otherwise GPU reset will fail (SRIOV reset test)<br>
>><br>
>> BR Monk<br>
>><br>
>> -----Original Message-----<br>
>> From: Christian König [<a href="mailto:deathsimple@vodafone.de">mailto:deathsimple@vodafone.de</a>]<br>
>> Sent: Monday, February 06, 2017 4:14 PM<br>
>> To: Liu, Monk <Monk.Liu@amd.com>; amd-gfx@lists.freedesktop.org<br>
>> Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin<br>
>><br>
>> A bug NAK on this! amdgpu_gart_table_vram_unpin() must be called during suspend.<br>
>><br>
>> Otherwise the GART table can be corrupted and we run into a whole bunch of problems.<br>
>><br>
>> We could add a "BUG_ON(adev->gart.table_addr != NULL);" here to double check that, but just ignoring that something went horrible wrong is clearly the wrong approach.<br>
>><br>
>> Regards,<br>
>> Christian.<br>
>><br>
>> Am 04.02.2017 um 11:34 schrieb Monk Liu:<br>
>>> if this call is from resume, shouldn't enter pin logic at all<br>
>>><br>
>>> Change-Id: I40a5cdc2a716c4c20d2812fd74ece4ea284b6765<br>
>>> Signed-off-by: Monk Liu <Monk.Liu@amd.com><br>
>>> ---<br>
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 +++++<br>
>>> 1 file changed, 5 insertions(+)<br>
>>><br>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c<br>
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c<br>
>>> index 964d2a9..5e907f7 100644<br>
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c<br>
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c<br>
>>> @@ -151,6 +151,11 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev)<br>
>>> uint64_t gpu_addr;<br>
>>> int r;<br>
>>> <br>
>>> + if (adev->gart.table_addr) {<br>
>>> + /* it's a resume call, gart already pin */<br>
>>> + return 0;<br>
>>> + }<br>
>>> +<br>
>>> r = amdgpu_bo_reserve(adev->gart.robj, false);<br>
>>> if (unlikely(r != 0))<br>
>>> return r;<br>
>> _______________________________________________<br>
>> amd-gfx mailing list<br>
>> amd-gfx@lists.freedesktop.org<br>
>> <a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
><br>
> _______________________________________________<br>
> amd-gfx mailing list<br>
> amd-gfx@lists.freedesktop.org<br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
<br>
<br>
</div>
</span></font>
</body>
</html>