[PATCH v2 2/3] drm/amdgpu: Handle xgmi device removal.

Alex Deucher alexdeucher at gmail.com
Fri Nov 30 19:49:58 UTC 2018


On Fri, Nov 30, 2018 at 1:17 PM Andrey Grodzovsky
<andrey.grodzovsky at amd.com> wrote:
>
> XGMI hive has some resources allocated on device init that
> need to be deallocated when the device is unregistered.
>
> v2: Remove creation of dedicated wq for XGMI hive reset.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 20 ++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  1 +
>  3 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c75badf..bfd286c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1864,6 +1864,9 @@ static int amdgpu_device_ip_fini(struct amdgpu_device *adev)
>  {
>         int i, r;
>
> +       if (adev->gmc.xgmi.num_physical_nodes > 1)
> +               amdgpu_xgmi_remove_device(adev);
> +
>         amdgpu_amdkfd_device_fini(adev);
>
>         amdgpu_device_set_pg_state(adev, AMD_PG_STATE_UNGATE);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index fb37e69..38e1599 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -135,3 +135,23 @@ int amdgpu_xgmi_add_device(struct amdgpu_device *adev)
>         mutex_unlock(&xgmi_mutex);
>         return ret;
>  }
> +
> +void amdgpu_xgmi_remove_device(struct amdgpu_device *adev)
> +{
> +       struct amdgpu_hive_info *hive;
> +
> +       if ((adev->asic_type < CHIP_VEGA20) || (adev->flags & AMD_IS_APU))
> +               return;

It would be nice to have something better to check against here.  Keying
off the ASIC type and the APU flag seems fragile.  Can we check against
some xgmi-related structure instead?

Alex

> +
> +       mutex_lock(&xgmi_mutex);
> +
> +       hive = amdgpu_get_xgmi_hive(adev);
> +       if (!hive)
> +               goto exit;
> +
> +       if (!(--hive->number_devices))
> +               mutex_destroy(&hive->hive_lock);
> +
> +exit:
> +       mutex_unlock(&xgmi_mutex);
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> index 6335bfd..6151eb9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> @@ -35,5 +35,6 @@ struct amdgpu_hive_info {
>  struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev);
>  int amdgpu_xgmi_update_topology(struct amdgpu_hive_info *hive, struct amdgpu_device *adev);
>  int amdgpu_xgmi_add_device(struct amdgpu_device *adev);
> +void amdgpu_xgmi_remove_device(struct amdgpu_device *adev);
>
>  #endif
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
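
To make the suggestion above concrete, here is a minimal standalone sketch of
gating the removal path on XGMI state instead of on the ASIC type.  Everything
in it is hypothetical: hive_sketch, device_sketch, the xgmi_supported flag and
xgmi_remove_device_sketch() are illustrative stand-ins, not existing amdgpu
types or fields, and the xgmi_mutex locking from the patch is left out to keep
it short.  It also assumes number_devices counts the devices currently in the
hive, so the count is tested after the decrement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the real amdgpu types. */
struct hive_sketch {
	int number_devices;   /* devices currently registered in the hive */
	bool lock_destroyed;  /* stands in for mutex_destroy(&hive->hive_lock) */
};

struct device_sketch {
	bool xgmi_supported;      /* assumed flag, set at init when XGMI is detected */
	struct hive_sketch *hive; /* NULL when the device never joined a hive */
};

/*
 * Remove a device from its hive. Returns true when this call released the
 * hive's resources, i.e. the device was the last member.
 */
static bool xgmi_remove_device_sketch(struct device_sketch *dev)
{
	struct hive_sketch *hive;

	/* Gate on XGMI state rather than on ASIC type or APU flags. */
	if (!dev->xgmi_supported)
		return false;

	hive = dev->hive;
	if (!hive)
		return false;

	assert(hive->number_devices > 0);
	dev->hive = NULL;

	/* Pre-decrement: test the count after this device has left. */
	if (--hive->number_devices == 0) {
		hive->lock_destroyed = true;
		return true;
	}
	return false;
}
```

With a flag like this, a new ASIC that gains XGMI support would only need to
set the flag at init; the removal path itself would not have to change.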

