[PATCH] drm/amd/amdgpu: Add mutex to protect IB resources
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Nov 6 09:35:04 UTC 2019
Am 06.11.19 um 07:56 schrieb Jesse Zhang:
> Unloading driver has call trace when unloading happens
> 2s after loading driver.
>
> Since ring test are delayed after initializing driver,
> it is possible that driver has been unloaded before or
> while doing ring test.
>
> Add mutex to prevent ring test and driver finalization
> occurs at the same time and check before doing ring test
> if required resources still exist.
NAK, a mutex is the wrong approach.
You just need to add an flush_delayed_work(&adev->delayed_init_work) to
amdgpu_device_fini().
Regards,
Christian.
>
> Change-Id: I27b52c2c630ad3853c6384e7b6906f0fae590ead
> Signed-off-by: Jesse Zhang <zhexi.zhang at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 9 +++++++++
> 3 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 0469cc5..b825ad0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -923,6 +923,7 @@ struct amdgpu_device {
> struct amdgpu_ring *rings[AMDGPU_MAX_RINGS];
> bool ib_pool_ready;
> struct amdgpu_sa_manager ring_tmp_bo;
> + struct mutex ib_lock;
>
> /* interrupts */
> struct amdgpu_irq irq;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index b2f38b0..8adbb25 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2806,6 +2806,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> mutex_init(&adev->lock_reset);
> mutex_init(&adev->virt.dpm_mutex);
> mutex_init(&adev->psp.mutex);
> + mutex_init(&adev->ib_lock);
>
> r = amdgpu_device_check_arguments(adev);
> if (r)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index d121bbd..ef5339e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -317,10 +317,12 @@ int amdgpu_ib_pool_init(struct amdgpu_device *adev)
> */
> void amdgpu_ib_pool_fini(struct amdgpu_device *adev)
> {
> + mutex_lock(&adev->ib_lock);
> if (adev->ib_pool_ready) {
> amdgpu_sa_bo_manager_fini(adev, &adev->ring_tmp_bo);
> adev->ib_pool_ready = false;
> }
> + mutex_unlock(&adev->ib_lock);
> }
>
> /**
> @@ -364,6 +366,11 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
> struct amdgpu_ring *ring = adev->rings[i];
> long tmo;
>
> + if (adev->shutdown) {
> + DRM_INFO("Device finalized, skip unfinished ring test\n");
> + return 0;
> + }
> +
> /* KIQ rings don't have an IB test because we never submit IBs
> * to them and they have no interrupt support.
> */
> @@ -381,7 +388,9 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
> else
> tmo = tmo_gfx;
>
> + mutex_lock(&adev->ib_lock);
> r = amdgpu_ring_test_ib(ring, tmo);
> + mutex_unlock(&adev->ib_lock);
> if (!r) {
> DRM_DEV_DEBUG(adev->dev, "ib test on %s succeeded\n",
> ring->name);
More information about the amd-gfx
mailing list