<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black;}
span.EmailStyle18
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:windowtext">Hi Jack,<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:10.0pt"><span style="color:windowtext">Could you please give a try about this? Both for passthrough and sriov.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">Best wishes<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">Emily Deng<o:p></o:p></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> Koenig, Christian <Christian.Koenig@amd.com>
<br>
<b>Sent:</b> Wednesday, September 18, 2019 6:47 PM<br>
<b>To:</b> Deng, Emily <Emily.Deng@amd.com><br>
<b>Cc:</b> Zhang, Jack (Jian) <Jack.Zhang1@amd.com>; amd-gfx@lists.freedesktop.org; Teng, Rui <Rui.Teng@amd.com>; Cui, Flora <Flora.Cui@amd.com><br>
<b>Subject:</b> Re: [PATCH] drm/amdgpu/sriov: omit fbcon error under sriov or passthrough<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Hi Jack & Emily,<br>
<br>
asking around a bit helped, we should be able to take a look at the module state to distinct the two use cases like this:<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
index 6b96a5738e57..0af134eb03e2 100644<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
@@ -1074,7 +1074,10 @@ amdgpu_pci_remove(struct pci_dev *pdev)<br>
{<br>
struct drm_device *dev = pci_get_drvdata(pdev);<br>
<br>
- DRM_ERROR("Device removal is currently not supported outside of fbcon\n");<br>
+#ifdef MODULE<br>
+ if (THIS_MODULE->state != MODULE_STATE_GOING)<br>
+#endif<br>
+ DRM_ERROR("Device removal is currently not supported outside of fbcon\n");<br>
drm_dev_unplug(dev);<br>
drm_dev_put(dev);<br>
pci_disable_device(pdev);<br>
<br>
It's a bit of a hack, but I think that this should work.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 18.09.19 um 12:29 schrieb Christian König:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">Hi Emily,<br>
<br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal">Do you think this is because the wrong use case?<o:p></o:p></p>
</blockquote>
<p class="MsoNormal">Well Jack's use case is correct, but the PCIe hot plug removal use case is incorrect.<br>
<br>
Changing it to a warning is most likely not a good idea either because it is indeed an error to try to use PCIe hot plug removal.<br>
<br>
What we need to distinct is why the function is called, if it's because of pci_unregister_driver(&amdgpu_kms_pci_driver) in amdgpu_exit() then the use case is valid and we should not print the error.<br>
<br>
But if it's because somebody does something like "echo 1 > /sys/bus/pci/devices/0000\:0b\:00.1/remove" then that is invalid and we should print it.<br>
<br>
We could do some hack and look at the stack trace, but that is probably not reliable either.<br>
<br>
Maybe we can look at the module reference count or something like that.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 18.09.19 um 12:04 schrieb Deng, Emily:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal">Hi Christian,<o:p></o:p></p>
<p class="MsoNormal" style="text-indent:10.0pt">Do you think this is because the wrong use case? Even we add the error log, the issue still happens. Could we change this error to warning? As for the right method to remove the driver, it shouldn’t occur issues.<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">Best wishes<o:p></o:p></p>
<p class="MsoNormal">Emily Deng<o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Koenig, Christian <a href="mailto:Christian.Koenig@amd.com">
<Christian.Koenig@amd.com></a> <br>
<b>Sent:</b> Wednesday, September 18, 2019 5:45 PM<br>
<b>To:</b> Deng, Emily <a href="mailto:Emily.Deng@amd.com"><Emily.Deng@amd.com></a><br>
<b>Cc:</b> Zhang, Jack (Jian) <a href="mailto:Jack.Zhang1@amd.com"><Jack.Zhang1@amd.com></a>;
<a href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Teng, Rui
<a href="mailto:Rui.Teng@amd.com"><Rui.Teng@amd.com></a>; Cui, Flora <a href="mailto:Flora.Cui@amd.com">
<Flora.Cui@amd.com></a><br>
<b>Subject:</b> RE: [PATCH] drm/amdgpu/sriov: omit fbcon error under sriov or passthrough<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal">Yes, exactly.<o:p></o:p></p>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">The problem is that even when output is qxl or the Intel driver our driver is still loaded and forcefully removing it renders the desktop unusable.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Christian.<o:p></o:p></p>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
<div>
<p class="MsoNormal">Am 18.09.2019 11:24 schrieb "Deng, Emily" <<a href="mailto:Emily.Deng@amd.com">Emily.Deng@amd.com</a>>:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p>Hi Christian,<o:p></o:p></p>
<p style="text-indent:10.0pt">Do you mean, for passthrough mode, the desktop is rendered by our asic, but enduser is trying to remove the driver by hot plug?<o:p></o:p></p>
<p> <o:p></o:p></p>
<p>Best wishes<o:p></o:p></p>
<p>Emily Deng<o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p><b>From:</b> Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com">Christian.Koenig@amd.com</a>>
<br>
<b>Sent:</b> Wednesday, September 18, 2019 4:44 PM<br>
<b>To:</b> Deng, Emily <<a href="mailto:Emily.Deng@amd.com">Emily.Deng@amd.com</a>><br>
<b>Cc:</b> Zhang, Jack (Jian) <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>>;
<a href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Teng, Rui <<a href="mailto:Rui.Teng@amd.com">Rui.Teng@amd.com</a>>; Cui, Flora <<a href="mailto:Flora.Cui@amd.com">Flora.Cui@amd.com</a>><br>
<b>Subject:</b> RE: [PATCH] drm/amdgpu/sriov: omit fbcon error under sriov or passthrough<o:p></o:p></p>
</div>
</div>
<p> <o:p></o:p></p>
<div>
<div>
<p>Hi Emily,<o:p></o:p></p>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>Yeah, exactly that's the problem: In some cases the driver can be unloaded while it is still in use!<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>See we added this error message because endusers tried to use PCIe hot plug to unload the driver to use the hardware for paththrough.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>But this will completely nuke your desktop, even when amdgpu is only a secondary device like in the qxl case.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>Jack is using the correct way of doing it, e.g. using "modprobe -r" or rmmod. Both commands check the use count before unloading the driver instances.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>I don't see a way to distingt the two cases and what Jack is doing is unfortunate not the common one.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>Regards,<o:p></o:p></p>
</div>
<div>
<p>Christian.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
</div>
<div>
<p> <o:p></o:p></p>
<div>
<p>Am 18.09.2019 10:08 schrieb "Deng, Emily" <<a href="mailto:Emily.Deng@amd.com">Emily.Deng@amd.com</a>>:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p>Hi Christian,<o:p></o:p></p>
<p> Before unloading driver, user must sure there is not any userspace to use of amdgpu, if not, it will report driver is in use. And the unloading driver is a feature about amdgpu driver which will be easier to replace driver without rebooting VM. Do you
think it has any issue about driver unloading path?<o:p></o:p></p>
<p> <o:p></o:p></p>
<p>Best wishes<o:p></o:p></p>
<p>Emily Deng<o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p><b>From:</b> Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com">Christian.Koenig@amd.com</a>>
<br>
<b>Sent:</b> Wednesday, September 18, 2019 3:54 PM<br>
<b>To:</b> Zhang, Jack (Jian) <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>><br>
<b>Cc:</b> <a href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>; Teng, Rui <<a href="mailto:Rui.Teng@amd.com">Rui.Teng@amd.com</a>>; Deng, Emily <<a href="mailto:Emily.Deng@amd.com">Emily.Deng@amd.com</a>>; Cui, Flora <<a href="mailto:Flora.Cui@amd.com">Flora.Cui@amd.com</a>><br>
<b>Subject:</b> RE: [PATCH] drm/amdgpu/sriov: omit fbcon error under sriov or passthrough<o:p></o:p></p>
</div>
</div>
<p> <o:p></o:p></p>
<div>
<div>
<p>Hi Jack,<o:p></o:p></p>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>Well that believe is unfortunately completely wrong.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>The point is that ANY use of amdgpu by userspace will prevent correct driver unload, that qxl is used for the fbcon doesn't change anything here.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>So the patch is a clear NAK. Driver unload is not supposed to work even under SRIOV.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p>Regards,<o:p></o:p></p>
</div>
<div>
<p>Christian.<o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
<div>
<p> <o:p></o:p></p>
</div>
</div>
<div>
<p> <o:p></o:p></p>
<div>
<p>Am 18.09.2019 09:32 schrieb "Zhang, Jack (Jian)" <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>>:<o:p></o:p></p>
</div>
</div>
</div>
<div>
<p style="margin-bottom:12.0pt">Hi, Christian and folks,<br>
<br>
In virtual machines(such virt-manager), there's always a virtual graphics device existed like "qxl" as the default gfx device.<br>
So under such condition, we believe it's reasonable to unload amdgpu driver as it is not treated as the default fbcon device.<br>
<br>
Would you please help to review this patch?<br>
<br>
Best wishes,<br>
Jack<br>
<br>
-----Original Message-----<br>
From: Jack Zhang <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>> <br>
Sent: Wednesday, September 18, 2019 3:25 PM<br>
To: <a href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a><br>
Cc: Zhang, Jack (Jian) <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>><br>
Subject: [PATCH] drm/amdgpu/sriov: omit fbcon error under sriov or passthrough<br>
<br>
In virtual machine, there would be a qxl or cirrus graphics device as the default master fbcon device.<br>
<br>
So for PF(passthrough mode) or SRIOV VF, it is reasonable to unload amdgpu driver. Amdgpu doesn't have to be the only fbcon device under this condition.<br>
<br>
Signed-off-by: Jack Zhang <<a href="mailto:Jack.Zhang1@amd.com">Jack.Zhang1@amd.com</a>><br>
---<br>
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 +++--<br>
1 file changed, 3 insertions(+), 2 deletions(-)<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
index 420888e..ada2b25 100644<br>
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c<br>
@@ -1103,8 +1103,9 @@ static void<br>
amdgpu_pci_remove(struct pci_dev *pdev) {<br>
struct drm_device *dev = pci_get_drvdata(pdev);<br>
-<br>
- DRM_ERROR("Device removal is currently not supported outside of fbcon\n");<br>
+ struct amdgpu_device *adev = dev->dev_private;<br>
+ if (!amdgpu_sriov_vf(adev) && !amdgpu_passthrough(adev))<br>
+ DRM_ERROR(" currently not supported outside of <br>
+fbcon\n");<br>
drm_dev_unplug(dev);<br>
drm_dev_put(dev);<br>
pci_disable_device(pdev);<br>
--<br>
2.7.4<o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<p> <o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
<p class="MsoNormal"><o:p> </o:p></p>
</blockquote>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>