[PATCH] drm/amdgpu fix incorrect sysfs remove behavior for xgmi
Christian König
ckoenig.leichtzumerken at gmail.com
Mon May 18 07:12:33 UTC 2020
On 18.05.20 at 06:44, Jack Zhang wrote:
> Under an xgmi setup, some sysfs nodes fail to be created the second time
> the kmd driver is loaded, because the sysfs nodes were not removed
> properly during the previous unload.
>
> Changes of this patch:
> 1. remove sysfs for dev_attr_xgmi_error
> 2. remove sysfs_link adev->dev->kobj with target name.
> And it only needs to be removed once for a xgmi setup
> 3. remove sysfs_link hive->kobj with target name
>
> In amdgpu_xgmi_remove_device:
> 1. amdgpu_xgmi_sysfs_rem_dev_info needs to be run per device
> 2. amdgpu_xgmi_sysfs_destroy needs to be run on the last node of
> device.
>
> Signed-off-by: Jack Zhang <Jack.Zhang1 at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 22 +++++++++++++++-------
> 1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index e9e59bc..bfe2468 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -325,9 +325,17 @@ static int amdgpu_xgmi_sysfs_add_dev_info(struct amdgpu_device *adev,
> static void amdgpu_xgmi_sysfs_rem_dev_info(struct amdgpu_device *adev,
> struct amdgpu_hive_info *hive)
> {
> + char node[10] = { 0 };
Please don't initialize things like this; use memset() instead.
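To illustrate what I mean, here is a minimal userspace sketch, not the actual driver code. The helper name format_node_name() is hypothetical and exists only for this example. It also uses snprintf() instead of sprintf(), which is my own addition: "node%d" with a large enough index would not fit into a 10-byte buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of amdgpu): builds the "node<N>" link
 * name into a caller-provided buffer.
 */
static int format_node_name(char *buf, size_t len, int index)
{
	/* Clear the buffer explicitly instead of relying on "= { 0 }". */
	memset(buf, 0, len);

	/*
	 * snprintf() never writes more than len bytes (including the
	 * terminating NUL) and returns the length the full string would
	 * have had, so truncation can be detected by the caller.
	 */
	return snprintf(buf, len, "node%d", index);
}
```

A caller would declare "char node[10];" without an initializer and check that the return value is smaller than sizeof(node) before passing the name on.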
Regards,
Christian.
> device_remove_file(adev->dev, &dev_attr_xgmi_device_id);
> - sysfs_remove_link(&adev->dev->kobj, adev->ddev->unique);
> - sysfs_remove_link(hive->kobj, adev->ddev->unique);
> + device_remove_file(adev->dev, &dev_attr_xgmi_error);
> +
> + if (adev != hive->adev) {
> + sysfs_remove_link(&adev->dev->kobj,"xgmi_hive_info");
> + }
> +
> + sprintf(node, "node%d", hive->number_devices);
> + sysfs_remove_link(hive->kobj, node);
> +
> }
>
>
> @@ -583,14 +591,14 @@ int amdgpu_xgmi_remove_device(struct amdgpu_device *adev)
> if (!hive)
> return -EINVAL;
>
> - if (!(hive->number_devices--)) {
> + task_barrier_rem_task(&hive->tb);
> + amdgpu_xgmi_sysfs_rem_dev_info(adev, hive);
> + mutex_unlock(&hive->hive_lock);
> +
> + if(!(--hive->number_devices)){
> amdgpu_xgmi_sysfs_destroy(adev, hive);
> mutex_destroy(&hive->hive_lock);
> mutex_destroy(&hive->reset_lock);
> - } else {
> - task_barrier_rem_task(&hive->tb);
> - amdgpu_xgmi_sysfs_rem_dev_info(adev, hive);
> - mutex_unlock(&hive->hive_lock);
> }
>
> return psp_xgmi_terminate(&adev->psp);
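For reference, the switch from the post-decrement to the pre-decrement in the hunk above changes when the teardown branch is taken. The following is only a userspace sketch of the two expressions; the function names are hypothetical and a plain int stands in for hive->number_devices.

```c
#include <stdbool.h>

/*
 * Old check: "!(count--)" tests the value before decrementing, so it
 * is true only when the counter was already 0 on entry.
 */
static bool last_device_post(int *count)
{
	return !((*count)--);
}

/*
 * New check: "!(--count)" decrements first, so it is true exactly
 * when this call drops the counter to 0, i.e. for the last device.
 */
static bool last_device_pre(int *count)
{
	return !(--(*count));
}
```

With a counter of 1, the old form returns false and leaves the counter at 0, while the new form returns true, which matches the "run on the last node" intent stated in the commit message.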