<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Aptos",sans-serif;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="en-CN" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div>
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">Hi Christian,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">Thank you for the feedback.
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">Regarding “return ret < 0 ? ret : 0;”: it is equivalent to “return ret;”, because the loop only exits once ret <= 0, so ret is never positive at the return.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt"><o:p> </o:p></span></p>
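<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">To illustrate the point about the return value, here is a minimal user-space sketch, not the actual patch. It assumes a hypothetical stand-in, fake_swapout(), that replays a scripted list of return values in place of ttm_global_swapout(); struct script and both helper names are made up for this example. Both variants use the do/while shape suggested in the review and differ only in the return statement.<o:p></o:p></span></p>
<pre>
```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for ttm_global_swapout(): replays scripted return
 * values. Positive = pages swapped out, 0 = nothing left, negative = error.
 */
struct script {
	const int *vals;
	size_t len;
	size_t pos;
};

static int fake_swapout(struct script *s)
{
	assert(s->pos < s->len);
	return s->vals[s->pos++];
}

/* Variant A: return ret directly after the loop. */
static int swapout_all(struct script *s)
{
	int ret;

	do {
		ret = fake_swapout(s);
	} while (ret > 0);

	/* The loop exits only when ret <= 0, so ret is 0 or a negative error. */
	return ret;
}

/* Variant B: clamp explicitly, as suggested in the review. */
static int swapout_all_clamped(struct script *s)
{
	int ret;

	do {
		ret = fake_swapout(s);
	} while (ret > 0);

	return ret < 0 ? ret : 0;
}
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">For any scripted sequence that ends in 0 or a negative value, both variants return the same result; the clamped form only makes the intent explicit.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt"><o:p>&nbsp;</o:p></span></p>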
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">For all other comments, I will revise the patch accordingly in v2.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt"><o:p> </o:p></span></p>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">Regards<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:12.0pt">Sam<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:12.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:12.0pt"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">Koenig, Christian <Christian.Koenig@amd.com><br>
<b>Date: </b>Monday, June 30, 2025 at 19:54<br>
<b>To: </b>Zhang, GuoQing (Sam) <GuoQing.Zhang@amd.com>, rafael@kernel.org <rafael@kernel.org>, len.brown@intel.com <len.brown@intel.com>, pavel@kernel.org <pavel@kernel.org>, Deucher, Alexander <Alexander.Deucher@amd.com>, Limonciello, Mario <Mario.Limonciello@amd.com>,
Lazar, Lijo <Lijo.Lazar@amd.com><br>
<b>Cc: </b>Zhao, Victor <Victor.Zhao@amd.com>, Chang, HaiJun <HaiJun.Chang@amd.com>, Ma, Qing (Mark) <Qing.Ma@amd.com>, amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>, dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>, linux-pm@vger.kernel.org
<linux-pm@vger.kernel.org>, linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org><br>
<b>Subject: </b>Re: [PATCH 1/3] drm/amdgpu: move GTT to SHM after eviction for hibernation<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:11.0pt">On 30.06.25 12:41, Samuel Zhang wrote:<br>
> When hibernating with data center dGPUs, a huge number of VRAM BOs are<br>
> evicted to GTT and take up too much system memory. This causes hibernation<br>
> to fail due to insufficient memory for creating the hibernation image.<br>
> <br>
> Move GTT BOs to shmem in the KMD; the kernel hibernation code then moves<br>
> shmem to the swap disk to make room for the hibernation image.<br>
<br>
This should probably be two patches, one for TTM and then an amdgpu patch to forward the event.<br>
<br>
> <br>
> Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com><br>
> ---<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 ++++++++++++-<br>
> drivers/gpu/drm/ttm/ttm_resource.c | 18 ++++++++++++++++++<br>
> include/drm/ttm/ttm_resource.h | 1 +<br>
> 3 files changed, 31 insertions(+), 1 deletion(-)<br>
> <br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c<br>
> index 4d57269c9ca8..5aede907a591 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c<br>
> @@ -2889,6 +2889,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,<br>
> int amdgpu_ttm_evict_resources(struct amdgpu_device *adev, int mem_type)<br>
> {<br>
> struct ttm_resource_manager *man;<br>
> + int r;<br>
> <br>
> switch (mem_type) {<br>
> case TTM_PL_VRAM:<br>
> @@ -2903,7 +2904,17 @@ int amdgpu_ttm_evict_resources(struct amdgpu_device *adev, int mem_type)<br>
> return -EINVAL;<br>
> }<br>
> <br>
> - return ttm_resource_manager_evict_all(&adev->mman.bdev, man);<br>
> + r = ttm_resource_manager_evict_all(&adev->mman.bdev, man);<br>
> + if (r) {<br>
> + DRM_ERROR("Failed to evict memory type %d\n", mem_type);<br>
> + return r;<br>
> + }<br>
> + if (adev->in_s4 && mem_type == TTM_PL_VRAM) {<br>
> + r = ttm_resource_manager_swapout();<br>
> + if (r)<br>
> + DRM_ERROR("Failed to swap out, %d\n", r);<br>
> + }<br>
> + return r;<br>
> }<br>
> <br>
> #if defined(CONFIG_DEBUG_FS)<br>
> diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c<br>
> index fd41b56e2c66..07b1f5a5afc2 100644<br>
> --- a/drivers/gpu/drm/ttm/ttm_resource.c<br>
> +++ b/drivers/gpu/drm/ttm/ttm_resource.c<br>
> @@ -534,6 +534,24 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,<br>
> }<br>
> EXPORT_SYMBOL(ttm_resource_manager_init);<br>
> <br>
> +int ttm_resource_manager_swapout(void)<br>
<br>
This needs documentation, better placement and a better name.<br>
<br>
First of all put it into ttm_device.c instead of the resource manager.<br>
<br>
Then call it something like ttm_device_prepare_hibernation or similar.<br>
<br>
<br>
> +{<br>
> + struct ttm_operation_ctx ctx = {<br>
> + .interruptible = false,<br>
> + .no_wait_gpu = false,<br>
> + .force_alloc = true<br>
> + };<br>
> + int ret;<br>
> +<br>
> + while (true) {<br>
<br>
Make that:<br>
<br>
do {<br>
ret = ...<br>
} while (ret > 0);<br>
<br>
> + ret = ttm_global_swapout(&ctx, GFP_KERNEL);<br>
> + if (ret <= 0)<br>
> + break;<br>
> + }<br>
> + return ret;<br>
<br>
It's rather pointless to return the number of swapped out pages.<br>
<br>
Make that "return ret < 0 ? ret : 0;"<br>
<br>
Regards,<br>
Christian.<br>
<br>
> +}<br>
> +EXPORT_SYMBOL(ttm_resource_manager_swapout);<br>
> +<br>
> /*<br>
> * ttm_resource_manager_evict_all<br>
> *<br>
> diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h<br>
> index b873be9597e2..46181758068e 100644<br>
> --- a/include/drm/ttm/ttm_resource.h<br>
> +++ b/include/drm/ttm/ttm_resource.h<br>
> @@ -463,6 +463,7 @@ void ttm_resource_manager_init(struct ttm_resource_manager *man,<br>
> <br>
> int ttm_resource_manager_evict_all(struct ttm_device *bdev,<br>
> struct ttm_resource_manager *man);<br>
> +int ttm_resource_manager_swapout(void);<br>
> <br>
> uint64_t ttm_resource_manager_usage(struct ttm_resource_manager *man);<br>
> void ttm_resource_manager_debug(struct ttm_resource_manager *man,<o:p></o:p></span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>