[PATCH] drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled
Chai, Thomas
YiPeng.Chai at amd.com
Wed Aug 17 02:54:31 UTC 2022
[AMD Official Use Only - General]
Hi Alex:
When removing an amdgpu device, it may be difficult to change the order of psp_hw_fini calls.
1. The drm_dev_unplug call is at the beginning in the amdgpu_pci_remove function, which makes the gpu device inaccessible for userspace operations. If the call to psp_hw_fini was moved before drm_dev_unplug, userspace could access the gpu device but the psp might be removing. It has unknown issues.
2. psp_hw_fini is called by the .hw_fini iterator in amdgpu_device_ip_fini_early, referring to the code starting from amdgpu_pci_remove to .hw_fini is called,
there are many preparatory operations before calling .hw_fini, which makes it very difficult to change the order of psp_hw_fini or all block .hw_fini.
So can we do a workaround in psp_cmd_submit_buf when removing amdgpu device?
-----Original Message-----
From: Alex Deucher <alexdeucher at gmail.com>
Sent: Monday, August 15, 2022 10:22 PM
To: Chai, Thomas <YiPeng.Chai at amd.com>
Cc: amd-gfx at lists.freedesktop.org; Zhang, Hawking <Hawking.Zhang at amd.com>; Chen, Guchun <Guchun.Chen at amd.com>; Chai, Thomas <YiPeng.Chai at amd.com>
Subject: Re: [PATCH] drm/amdgpu: TA unload messages are not actually sent to psp when amdgpu is uninstalled
On Mon, Aug 15, 2022 at 3:06 AM YiPeng Chai <YiPeng.Chai at amd.com> wrote:
>
> The psp_cmd_submit_buf function is called by psp_hw_fini to send TA
> unload messages to psp to terminate ras, asd and tmr.
> But when amdgpu is uninstalled, drm_dev_unplug is called earlier than
> psp_hw_fini in amdgpu_pci_remove, the calling order as follows:
> static void amdgpu_pci_remove(struct pci_dev *pdev) {
> drm_dev_unplug
> ......
> amdgpu_driver_unload_kms->amdgpu_device_fini_hw->...
> ->.hw_fini->psp_hw_fini->...
> ->psp_ta_unload->psp_cmd_submit_buf
> ......
> }
> The program will return when calling drm_dev_enter in
> psp_cmd_submit_buf.
>
> So the call to drm_dev_enter in psp_cmd_submit_buf should be removed,
> so that the TA unload messages can be sent to the psp when amdgpu is
> uninstalled.
>
> Signed-off-by: YiPeng Chai <YiPeng.Chai at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index b067ce45d226..0578d8d094a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -585,9 +585,6 @@ psp_cmd_submit_buf(struct psp_context *psp,
> if (psp->adev->no_hw_access)
> return 0;
>
> - if (!drm_dev_enter(adev_to_drm(psp->adev), &idx))
> - return 0;
> -
This check is to prevent the hardware from being accessed if the card is removed. I think we need to fix the ordering elsewhere.
Alex
> memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
>
> memcpy(psp->cmd_buf_mem, cmd, sizeof(struct
> psp_gfx_cmd_resp)); @@ -651,7 +648,6 @@ psp_cmd_submit_buf(struct psp_context *psp,
> }
>
> exit:
> - drm_dev_exit(idx);
> return ret;
> }
>
> --
> 2.25.1
>
More information about the amd-gfx
mailing list