<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Aptos;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:12.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Arial",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div>
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Arial",sans-serif">I think resuming with a different SR-IOV vGPU depends on whether the hypervisor supports live migration. Different hypervisors implement this differently, but basically the hypervisor
calls into the host GPU driver at different stages, and the host side performs the hardware-related migration, including the FW state.
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Arial",sans-serif"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Arial",sans-serif">Regards<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Arial",sans-serif">Shaoyun.liu
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Arial",sans-serif"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri",sans-serif"> amd-gfx <amd-gfx-bounces@lists.freedesktop.org>
<b>On Behalf Of </b>Christian König<br>
<b>Sent:</b> Tuesday, January 14, 2025 7:44 AM<br>
<b>To:</b> Gerry Liu <gerry@linux.alibaba.com><br>
<b>Cc:</b> Deucher, Alexander <Alexander.Deucher@amd.com>; Pan, Xinhui <Xinhui.Pan@amd.com>; airlied@gmail.com; simona@ffwll.ch; Khatri, Sunil <Sunil.Khatri@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>; Limonciello, Mario
<Mario.Limonciello@amd.com>; Chen, Xiaogang <Xiaogang.Chen@amd.com>; Russell, Kent <Kent.Russell@amd.com>; shuox.liu@linux.alibaba.com; amd-gfx@lists.freedesktop.org<br>
<b>Subject:</b> Re: [RFC v1 0/2] Enable resume with different AMD SRIOV vGPUs<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi Gerry,<br>
<br>
On 14.01.25 at 12:03, Gerry Liu wrote: <o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>On January 14, 2025, at 18:46, Christian König <a href="mailto:christian.koenig@amd.com"><christian.koenig@amd.com></a> wrote:<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Hi Jiang,<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Some firmware, especially the multimedia firmware, keeps pointers to buffers in its suspend/resume state.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>In other words, the firmware needs to be in the exact same location before and after resume. That's why we don't unpin the firmware BOs, but rather save their content and restore it. See function amdgpu_vcn_save_vcpu_bo() for reference.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>In addition, the serial numbers, IDs, etc. are used for things like TMZ. So anything which uses HW encryption won't work anymore.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Then, even two identical boards can have different harvest and memory channel configurations. It could be that we are able to abstract that with SR-IOV, but I wouldn't rely on that.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>To summarize, that looks like a completely futile effort which most likely won't work reliably in a production environment.<o:p></o:p></pre>
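<pre><o:p> </o:p></pre>
<pre>The save-and-restore idea mentioned above (keep the pinned buffer where it is, copy its content aside on suspend, copy it back on resume) can be sketched as a toy C model. This is only an illustration under stated assumptions: struct toy_fw_bo, toy_fw_save() and toy_fw_restore() are hypothetical names, plain memory stands in for VRAM, and none of this is the actual amdgpu_vcn_save_vcpu_bo() code.</pre>
<pre>
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pinned firmware buffer; not an amdgpu type. */
struct toy_fw_bo {
    unsigned char *cpu_addr; /* CPU mapping of the pinned buffer */
    size_t size;             /* buffer size in bytes */
    unsigned char *saved;    /* shadow copy kept in system memory */
};

/* Suspend: leave the buffer in place and copy its content aside. */
static int toy_fw_save(struct toy_fw_bo *bo)
{
    bo->saved = malloc(bo->size);
    if (!bo->saved)
        return -1;
    memcpy(bo->saved, bo->cpu_addr, bo->size);
    return 0;
}

/* Resume: copy the content back into the same, unmoved buffer. */
static void toy_fw_restore(struct toy_fw_bo *bo)
{
    if (!bo->saved)
        return;
    memcpy(bo->cpu_addr, bo->saved, bo->size);
    free(bo->saved);
    bo->saved = NULL;
}
```
</pre>
<pre>Note that the sketch only preserves content. Any pointer the firmware keeps inside that content still refers to the old location, which is why the buffer must not move between suspend and resume.</pre>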
</blockquote>
<pre>Hi Christian,<o:p></o:p></pre>
<pre> Thanks for the information. Previously I assumed that we could reset the ASIC and reload all firmware on resume, but I missed the VCN IP block, which saves and restores firmware VRAM content during suspend/resume. Are there any other IP blocks which save and restore firmware RAM content?<o:p></o:p></pre>
</blockquote>
<p class="MsoNormal"><br>
Not that I know of offhand, but neither the hypervisor nor the driver stack was designed with something like this in mind. So it could be that there are other dependencies I don't know about.<br>
<br>
I do remember that this idea of resuming on different HW than the one suspended on came up a while ago and was rejected by multiple parties as too complicated and error-prone.<br>
<br>
So we never looked more deeply into the possibility of doing that.<br>
<br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre><o:p> </o:p></pre>
<pre> Our usage scenario targets GPGPU workloads (amdkfd) with an AMD GPU in single SR-IOV vGPU mode. Is it possible to resume on a different vGPU device in such a case?<o:p></o:p></pre>
</blockquote>
<p class="MsoNormal"><br>
If I'm not completely mistaken you can use checkpoint/restore for that. It's still under development, but as far as I can see it should solve your problem quite nicely.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre><o:p> </o:p></pre>
<pre>Regards,<o:p></o:p></pre>
<pre>Gerry <o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre><o:p> </o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre><o:p> </o:p></pre>
<pre>Regards,<o:p></o:p></pre>
<pre>Christian.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>On 14.01.25 at 10:54, Jiang Liu wrote:<o:p></o:p></pre>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre>For virtual machines with AMD SR-IOV vGPUs, the following workflow may be<o:p></o:p></pre>
<pre>used to support virtual machine hibernation (suspend):<o:p></o:p></pre>
<pre>1) suspend a virtual machine with AMD vGPU A.<o:p></o:p></pre>
<pre>2) the hypervisor dumps guest RAM content to a disk image.<o:p></o:p></pre>
<pre>3) the hypervisor loads the guest system image from disk.<o:p></o:p></pre>
<pre>4) resume the guest OS with a different AMD vGPU B.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Step 4 above is special because we are resuming with a different<o:p></o:p></pre>
<pre>AMD vGPU device and the amdgpu driver may observe changed device<o:p></o:p></pre>
<pre>properties. To support the above workflow, we need to fix up the changed<o:p></o:p></pre>
<pre>device properties cached by the amdgpu driver.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Based on the amdgpu driver source code (we haven't read the<o:p></o:p></pre>
<pre>corresponding hardware specs yet), we have identified the following changed<o:p></o:p></pre>
<pre>device properties:<o:p></o:p></pre>
<pre>1) PCI MMIO address. This can be fixed by the hypervisor.<o:p></o:p></pre>
<pre>2) serial_number, unique_id, xgmi_device_id, fru_id in sysfs. These<o:p></o:p></pre>
<pre>   seem to be informational only.<o:p></o:p></pre>
<pre>3) xgmi_physical_id if xgmi is enabled, which affects VRAM MC address.<o:p></o:p></pre>
<pre>4) mc_fb_offset, which affects VRAM physical address.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>We will focus on the VRAM address related changes here, because GPU<o:p></o:p></pre>
<pre>functionality is sensitive to them. The original data sources include<o:p></o:p></pre>
<pre>.get_mc_fb_offset(), .get_fb_location() and xgmi hardware registers.<o:p></o:p></pre>
<pre>The main data cached by the amdgpu driver are adev->gmc.vram_start and<o:p></o:p></pre>
<pre>adev->vm_manager.vram_base_offset. The major consumers of the cached<o:p></o:p></pre>
<pre>information are ip_block.hw_init() and the GMU page table builder.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>After code analysis, we found that most consumers of adev->gmc.vram_start<o:p></o:p></pre>
<pre>and adev->vm_manager.vram_base_offset read the values of these two<o:p></o:p></pre>
<pre>variables on demand instead of caching them. So if we fix these<o:p></o:p></pre>
<pre>two cached fields on resume, everything should work as expected.<o:p></o:p></pre>
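<pre><o:p> </o:p></pre>
<pre>As a rough illustration of why refreshing the cached base is enough for on-demand consumers, here is a toy C model. Everything in it is hypothetical (struct toy_dev, toy_query_vram_start(), toy_gpu_addr(), toy_resume()); it only mirrors the shape of the argument and is not the code from the patches.</pre>
<pre>
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of the cached state; not the real amdgpu_device. */
struct toy_dev {
    uint64_t vram_start; /* cached VRAM base, in the spirit of adev->gmc.vram_start */
};

/* Stand-in for asking the (possibly different) vGPU for its VRAM base. */
static uint64_t toy_query_vram_start(uint64_t hw_base)
{
    return hw_base;
}

/* On-demand consumer: derives the address from the cached base on every call. */
static uint64_t toy_gpu_addr(const struct toy_dev *dev, uint64_t offset)
{
    return dev->vram_start + offset;
}

/* Resume hook: refresh the cached base before any consumer runs again. */
static void toy_resume(struct toy_dev *dev, uint64_t hw_base)
{
    dev->vram_start = toy_query_vram_start(hw_base);
}
```
</pre>
<pre>A consumer that stored the result of toy_gpu_addr() before suspend would keep the old absolute address after toy_resume(), while a consumer that calls it again picks up the new base.</pre>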
<pre><o:p> </o:p></pre>
<pre>But there's an exception, and a very important one: callers<o:p></o:p></pre>
<pre>of amdgpu_bo_create_kernel()/amdgpu_bo_create_reserved() may cache<o:p></o:p></pre>
<pre>VRAM addresses. On further analysis, the callers of these interfaces<o:p></o:p></pre>
<pre>follow three different patterns:<o:p></o:p></pre>
<pre>1) This pattern is safe.<o:p></o:p></pre>
<pre> - call amdgpu_bo_create_reserved() in ip_block.hw_init()<o:p></o:p></pre>
<pre> - call amdgpu_bo_free_kernel() in ip_block.suspend()<o:p></o:p></pre>
<pre> - call amdgpu_bo_create_reserved() in ip_block.resume()<o:p></o:p></pre>
<pre>2) This pattern works with the current implementation of amdgpu_bo_create_reserved(),<o:p></o:p></pre>
<pre>   but bo.pin_count becomes incorrect.<o:p></o:p></pre>
<pre> - call amdgpu_bo_create_reserved() in ip_block.hw_init()<o:p></o:p></pre>
<pre> - call amdgpu_bo_create_reserved() in ip_block.resume()<o:p></o:p></pre>
<pre>3) This pattern needs to be enhanced.<o:p></o:p></pre>
<pre> - call amdgpu_bo_create_reserved() in ip_block.sw_init()<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>So my question is: which pattern should we use here? Personally I prefer<o:p></o:p></pre>
<pre>pattern 2 with an enhancement to fix the bo.pin_count.<o:p></o:p></pre>
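<pre><o:p> </o:p></pre>
<pre>The pin_count concern with pattern 2 can be sketched with a toy C model. It assumes a create helper that allocates only when the pointer is NULL and pins on every call; toy_create_reserved(), toy_free_kernel(), struct toy_bo and struct toy_dev are hypothetical names, not the amdgpu API.</pre>
<pre>
```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniature types; not the real amdgpu structures. */
struct toy_bo {
    int pin_count;   /* number of outstanding pins */
    uint64_t offset; /* offset of the buffer inside VRAM */
};

struct toy_dev {
    uint64_t vram_start; /* cached VRAM base */
};

/*
 * Assumed behavior: allocate only if *bo is NULL, then pin and report
 * the GPU address computed from the current VRAM base.
 */
static int toy_create_reserved(struct toy_dev *dev, struct toy_bo **bo,
                               uint64_t *gpu_addr)
{
    if (!*bo) {
        *bo = calloc(1, sizeof(**bo));
        if (!*bo)
            return -1;
    }
    (*bo)->pin_count++; /* pinned again on every call */
    *gpu_addr = dev->vram_start + (*bo)->offset;
    return 0;
}

/* Counterpart used by pattern 1 in suspend(): drop the buffer entirely. */
static void toy_free_kernel(struct toy_bo **bo)
{
    free(*bo);
    *bo = NULL;
}
```
</pre>
<pre>In this model, calling toy_create_reserved() once in hw_init() and again in resume() returns a fresh address but leaves pin_count at 2, so a single unpin would no longer release the buffer.</pre>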
<pre><o:p> </o:p></pre>
<pre>Currently there are still bugs in SR-IOV suspend/resume, so we can't test<o:p></o:p></pre>
<pre>our hypothesis. And we are not sure whether there are other<o:p></o:p></pre>
<pre>blockers to enabling resume with different AMD SR-IOV vGPUs.<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Help is needed to identify more task items to enable resume with<o:p></o:p></pre>
<pre>different AMD SR-IOV vGPUs:)<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Jiang Liu (2):<o:p></o:p></pre>
<pre> drm/amdgpu: update cached vram base addresses on resume<o:p></o:p></pre>
<pre> drm/amdgpu: introduce helper amdgpu_bo_get_pinned_gpu_addr()<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++++++++++++++<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++++--<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 9 +++++++++<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 +<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 9 +++++++++<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 7 +++++++<o:p></o:p></pre>
<pre> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 ++++++<o:p></o:p></pre>
<pre> 7 files changed, 51 insertions(+), 2 deletions(-)<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
</blockquote>
</blockquote>
<pre><o:p> </o:p></pre>
</blockquote>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>