[PATCH] drm/amdgpu: add rcu_barrier after entity fini
Deng, Emily
Emily.Deng at amd.com
Fri May 18 03:20:11 UTC 2018
Hi Christian,
Yes, there is already one rcu_barrier, but call_rcu is invoked twice in a chain, so a single rcu_barrier may only wait for the first call_rcu.
After I added a second rcu_barrier, the kernel issue disappeared.
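A minimal userspace sketch of the chained-callback case, to make the ordering concrete. It is a simulation only: sim_call_rcu(), sim_rcu_barrier(), struct sim_fence and freed_after_barriers() are hypothetical names, not kernel APIs, and the model assumes a barrier waits only for callbacks that were already queued when it started.

```c
/* Userspace simulation only. sim_call_rcu() and sim_rcu_barrier() are
 * hypothetical stand-ins, not the kernel's call_rcu()/rcu_barrier(). */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*sim_cb_t)(void *arg);

struct sim_pending {
    sim_cb_t func;
    void *arg;
};

static struct sim_pending queue[MAX_PENDING];
static size_t head, tail; /* callbacks in [head, tail) are pending */

/* Defer a callback instead of running it immediately. */
static void sim_call_rcu(sim_cb_t func, void *arg)
{
    assert(tail < MAX_PENDING);
    queue[tail].func = func;
    queue[tail].arg = arg;
    tail++;
}

/* Assumption: run only the callbacks queued before the barrier started.
 * Callbacks queued by those callbacks stay pending. */
static void sim_rcu_barrier(void)
{
    size_t snapshot = tail;

    while (head < snapshot) {
        struct sim_pending p = queue[head++];
        p.func(p.arg);
    }
}

struct sim_fence {
    bool freed;
};

/* Second stage: stands in for amdgpu_fence_free -> kmem_cache_free. */
static void sim_fence_free(void *arg)
{
    ((struct sim_fence *)arg)->freed = true;
}

/* First stage: stands in for drm_sched_fence_free, which drops the
 * parent fence and thereby queues the second deferred callback. */
static void sim_sched_fence_free(void *arg)
{
    sim_call_rcu(sim_fence_free, arg);
}

/* Queue the first stage, run the given number of barriers, and report
 * whether the fence object has been freed. */
bool freed_after_barriers(int barriers)
{
    struct sim_fence fence = { .freed = false };

    head = tail = 0;
    sim_call_rcu(sim_sched_fence_free, &fence);
    for (int i = 0; i < barriers; i++)
        sim_rcu_barrier();
    return fence.freed;
}
```

In this model freed_after_barriers(1) returns false and freed_after_barriers(2) returns true, because the second callback is only queued while the first barrier is running.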
Best Wishes,
Emily Deng
> -----Original Message-----
> From: Christian König [mailto:ckoenig.leichtzumerken at gmail.com]
> Sent: Thursday, May 17, 2018 7:08 PM
> To: Deng, Emily <Emily.Deng at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: add rcu_barrier after entity fini
>
> Am 17.05.2018 um 12:03 schrieb Emily Deng:
> > Freeing a fence from amdgpu_fence_slab takes two chained call_rcu
> > invocations. To keep amdgpu_fence_slab_fini from calling
> > kmem_cache_destroy(amdgpu_fence_slab) before
> > kmem_cache_free(amdgpu_fence_slab, fence) has run, add an rcu_barrier
> > after drm_sched_entity_fini.
> >
> > The call trace leading to kmem_cache_free(amdgpu_fence_slab, fence) is
> > as follows:
> > 1. drm_sched_entity_fini ->
> > drm_sched_entity_cleanup ->
> > dma_fence_put(entity->last_scheduled) ->
> > drm_sched_fence_release_finished ->
> > drm_sched_fence_release_scheduled ->
> > call_rcu(&fence->finished.rcu, drm_sched_fence_free)
> >
> > 2. drm_sched_fence_free ->
> > dma_fence_put(fence->parent) ->
> > amdgpu_fence_release ->
> > call_rcu(&f->rcu, amdgpu_fence_free) ->
> > kmem_cache_free(amdgpu_fence_slab, fence);
> >
> > v2: put the barrier before kmem_cache_destroy
> >
> > Change-Id: I8dcadd3372f97e72461bf46b41cc26d90f09b8df
> > Signed-off-by: Emily Deng <Emily.Deng at amd.com>
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > index 39ec6b8..42be65b 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> > @@ -69,6 +69,7 @@ int amdgpu_fence_slab_init(void)
> > void amdgpu_fence_slab_fini(void)
> > {
> > rcu_barrier();
> > + rcu_barrier();
>
> Well, you should have noted that there is already an rcu_barrier here and
> adding another one shouldn't have any additional effect. So your explanation
> and the proposed solution don't make too much sense.
>
> I think the problem you are running into is rather that the fence is
> reference counted and might live longer than the module that created it.
>
> It's a complicated issue. One possible solution would be to release
> fence->parent earlier in the scheduler fence, but that doesn't sound like
> a general-purpose solution.
>
> Christian.
>
> > kmem_cache_destroy(amdgpu_fence_slab);
> > }
> > /*