<div dir="ltr">I've just tested it and it seems to fix my issue<div><br></div><div>Feel free to add my </div><div><br></div><div>Tested-by: Mike Lothian <<a href="mailto:mike@fireburn.co.uk">mike@fireburn.co.uk</a>></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 2 Aug 2021 at 14:35, Alex Deucher <<a href="mailto:alexdeucher@gmail.com">alexdeucher@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Aug 2, 2021 at 4:23 AM Chen, Guchun <<a href="mailto:Guchun.Chen@amd.com" target="_blank">Guchun.Chen@amd.com</a>> wrote:<br>
><br>
> [Public]<br>
><br>
> Thank you, Christian.<br>
><br>
> Regarding fence_drv.initialized, it looks a bit redundant; anyway, let me look into this more.<br>
<br>
Does this patch fix this bug?<br>
<a href="https://gitlab.freedesktop.org/drm/amd/-/issues/1668" rel="noreferrer" target="_blank">https://gitlab.freedesktop.org/drm/amd/-/issues/1668</a><br>
<br>
If so, please add:<br>
Bug: <a href="https://gitlab.freedesktop.org/drm/amd/-/issues/1668" rel="noreferrer" target="_blank">https://gitlab.freedesktop.org/drm/amd/-/issues/1668</a><br>
to the commit message.<br>
<br>
Alex<br>
<br>
><br>
> Regards,<br>
> Guchun<br>
><br>
> -----Original Message-----<br>
> From: Christian König <<a href="mailto:ckoenig.leichtzumerken@gmail.com" target="_blank">ckoenig.leichtzumerken@gmail.com</a>><br>
> Sent: Monday, August 2, 2021 2:56 PM<br>
> To: Chen, Guchun <<a href="mailto:Guchun.Chen@amd.com" target="_blank">Guchun.Chen@amd.com</a>>; <a href="mailto:amd-gfx@lists.freedesktop.org" target="_blank">amd-gfx@lists.freedesktop.org</a>; Gao, Likun <<a href="mailto:Likun.Gao@amd.com" target="_blank">Likun.Gao@amd.com</a>>; Koenig, Christian <<a href="mailto:Christian.Koenig@amd.com" target="_blank">Christian.Koenig@amd.com</a>>; Zhang, Hawking <<a href="mailto:Hawking.Zhang@amd.com" target="_blank">Hawking.Zhang@amd.com</a>>; Deucher, Alexander <<a href="mailto:Alexander.Deucher@amd.com" target="_blank">Alexander.Deucher@amd.com</a>><br>
> Subject: Re: [PATCH] drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)<br>
><br>
> Am 02.08.21 um 07:16 schrieb Guchun Chen:<br>
> > In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop<br>
> > scheduler in s3 test, otherwise, fence related failure will arrive<br>
> > after resume. To fix this and for a better clean up, move<br>
> > drm_sched_fini from fence_hw_fini to fence_sw_fini, as it's part of<br>
> > driver shutdown, and should never be called in hw_fini.<br>
> ><br>
> > v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init, to<br>
> > keep sw_init and sw_fini paired.<br>
> ><br>
> > Fixes: cd87a6dcf6af ("drm/amdgpu: adjust fence driver enable sequence")<br>
> > Suggested-by: Christian König <<a href="mailto:christian.koenig@amd.com" target="_blank">christian.koenig@amd.com</a>><br>
> > Signed-off-by: Guchun Chen <<a href="mailto:guchun.chen@amd.com" target="_blank">guchun.chen@amd.com</a>><br>
><br>
> It's a bit ambiguous now what fence_drv.initialized means, but I think we can live with that for now.<br>
><br>
> Patch is Reviewed-by: Christian König <<a href="mailto:christian.koenig@amd.com" target="_blank">christian.koenig@amd.com</a>>.<br>
><br>
> Regards,<br>
> Christian.<br>
><br>
> > ---<br>
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++---<br>
> > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 12 +++++++-----<br>
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 ++--<br>
> > 3 files changed, 11 insertions(+), 10 deletions(-)<br>
> ><br>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
> > index b1d2dc39e8be..9e53ff851496 100644<br>
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c<br>
> > @@ -3646,9 +3646,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,<br>
> ><br>
> > fence_driver_init:<br>
> > /* Fence driver */<br>
> > - r = amdgpu_fence_driver_init(adev);<br>
> > + r = amdgpu_fence_driver_sw_init(adev);<br>
> > if (r) {<br>
> > - dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");<br>
> > + dev_err(adev->dev, "amdgpu_fence_driver_sw_init failed\n");<br>
> > amdgpu_vf_error_put(adev, AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 0, 0);<br>
> > goto failed;<br>
> > }<br>
> > @@ -3988,7 +3988,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)<br>
> > }<br>
> > amdgpu_fence_driver_hw_init(adev);<br>
> ><br>
> > -<br>
> > r = amdgpu_device_ip_late_init(adev);<br>
> > if (r)<br>
> > return r;<br>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br>
> > index 49c5c7331c53..7495911516c2 100644<br>
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br>
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c<br>
> > @@ -498,7 +498,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,<br>
> > }<br>
> ><br>
> > /**<br>
> > - * amdgpu_fence_driver_init - init the fence driver<br>
> > + * amdgpu_fence_driver_sw_init - init the fence driver<br>
> > * for all possible rings.<br>
> > *<br>
> > * @adev: amdgpu device pointer<br>
> > @@ -509,13 +509,13 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,<br>
> > * amdgpu_fence_driver_start_ring().<br>
> > * Returns 0 for success.<br>
> > */<br>
> > -int amdgpu_fence_driver_init(struct amdgpu_device *adev)<br>
> > +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev)<br>
> > {<br>
> > return 0;<br>
> > }<br>
> ><br>
> > /**<br>
> > - * amdgpu_fence_driver_fini - tear down the fence driver<br>
> > + * amdgpu_fence_driver_hw_fini - tear down the fence driver<br>
> > * for all possible rings.<br>
> > *<br>
> > * @adev: amdgpu device pointer<br>
> > @@ -531,8 +531,7 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)<br>
> ><br>
> > if (!ring || !ring->fence_drv.initialized)<br>
> > continue;<br>
> > - if (!ring->no_scheduler)<br>
> > - drm_sched_fini(&ring->sched);<br>
> > +<br>
> > /* You can't wait for HW to signal if it's gone */<br>
> > if (!drm_dev_is_unplugged(&adev->ddev))<br>
> > r = amdgpu_fence_wait_empty(ring);<br>
> > @@ -560,6 +559,9 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)<br>
> > if (!ring || !ring->fence_drv.initialized)<br>
> > continue;<br>
> ><br>
> > + if (!ring->no_scheduler)<br>
> > + drm_sched_fini(&ring->sched);<br>
> > +<br>
> > for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)<br>
> > dma_fence_put(ring->fence_drv.fences[j]);<br>
> > kfree(ring->fence_drv.fences);<br>
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> > index 27adffa7658d..9c11ced4312c 100644<br>
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h<br>
> > @@ -106,7 +106,6 @@ struct amdgpu_fence_driver {<br>
> > struct dma_fence **fences;<br>
> > };<br>
> ><br>
> > -int amdgpu_fence_driver_init(struct amdgpu_device *adev);<br>
> > void amdgpu_fence_driver_force_completion(struct amdgpu_ring *ring);<br>
> ><br>
> > int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,<br>
> > @@ -115,9 +114,10 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,<br>
> > int amdgpu_fence_driver_start_ring(struct amdgpu_ring *ring,<br>
> > struct amdgpu_irq_src *irq_src,<br>
> > unsigned irq_type);<br>
> > +void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev);<br>
> > void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev);<br>
> > +int amdgpu_fence_driver_sw_init(struct amdgpu_device *adev);<br>
> > void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev);<br>
> > -void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev);<br>
> > int amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **fence,<br>
> > unsigned flags);<br>
> > int amdgpu_fence_emit_polling(struct amdgpu_ring *ring, uint32_t *s,<br>
</blockquote></div>
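<div><br></div>
<div>For anyone following the thread, the ordering issue described in the commit message can be sketched with a small userspace model in C. This is only an illustration under stated assumptions, not the driver code: every <code>toy_</code> name is hypothetical, the scheduler is reduced to a single flag, <code>hw_ready</code> is a made-up stand-in for whatever hardware state the real hw_init/hw_fini pair manages, and the model assumes the scheduler is created once during software setup and that nothing on the resume path recreates it.</div>
<pre>
```c
/* Toy userspace model of the sw/hw init and fini pairing.
 * Not kernel code: all toy_ names and fields are hypothetical. */

struct toy_ring {
    int initialized;    /* stands in for ring->fence_drv.initialized */
    int sched_running;  /* 1 while the modelled scheduler exists */
    int hw_ready;       /* 1 while the modelled hardware side is set up */
};

/* Software setup, done once at driver load (assumption: creates the scheduler). */
void toy_sw_init(struct toy_ring *ring)
{
    ring->initialized = 1;
    ring->sched_running = 1;
    ring->hw_ready = 0;
}

/* Hardware setup, done at load and again on every resume.
 * It does not touch the scheduler. */
void toy_hw_init(struct toy_ring *ring)
{
    if (ring->initialized)
        ring->hw_ready = 1;
}

/* Hardware teardown, done on every suspend.
 * stop_sched_here = 1 models the behaviour before the patch,
 * stop_sched_here = 0 models the behaviour after it. */
void toy_hw_fini(struct toy_ring *ring, int stop_sched_here)
{
    if (!ring->initialized)
        return;
    if (stop_sched_here)
        ring->sched_running = 0;  /* resume never brings it back */
    ring->hw_ready = 0;
}

/* Software teardown, done once at driver unload.
 * After the patch this is the only place the scheduler is stopped. */
void toy_sw_fini(struct toy_ring *ring)
{
    if (!ring->initialized)
        return;
    ring->sched_running = 0;
    ring->initialized = 0;
}

/* One suspend/resume cycle; returns 1 if the ring can still accept work. */
int toy_s3_cycle(struct toy_ring *ring, int stop_sched_in_hw_fini)
{
    toy_hw_fini(ring, stop_sched_in_hw_fini);  /* suspend */
    toy_hw_init(ring);                         /* resume */
    return ring->sched_running && ring->hw_ready;
}
```
</pre>
<div>Under these assumptions, a ring that loses its scheduler in the hw_fini step comes back from a suspend/resume cycle unable to take work, while one that keeps its scheduler until sw_fini stays usable. That matches the reasoning in the commit message for moving drm_sched_fini into amdgpu_fence_driver_sw_fini.</div>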