[PATCH] dma-buf/sync_file: Always increment refcount when merging fences.

Gustavo Padovan gustavo.padovan at collabora.com
Wed Sep 14 16:38:59 UTC 2016


2016-09-14 Rafael Antognolli <rafael.antognolli at intel.com>:

> On Wed, Sep 14, 2016 at 11:04:01AM -0300, Gustavo Padovan wrote:
> > 2016-09-14 Chris Wilson <chris at chris-wilson.co.uk>:
> > 
> > > On Tue, Sep 13, 2016 at 04:24:27PM -0700, Rafael Antognolli wrote:
> > > > The refcount of a fence should be increased whenever it is added to a merged
> > > > fence, since it will later be decreased when the merged fence is destroyed.
> > > > Failing to do so will cause the original fence to be freed if the merged fence
> > > > gets freed, but other places still referencing won't know about it.
> > > > 
> > > > This patch fixes a kernel panic that can be triggered by creating a fence that
> > > > is expired (or increasing the timeline until it expires), then creating a
> > > > merged fence out of it, and deleting the merged fence. This will make the
> > > > original expired fence's refcount go to zero.
> > > > 
> > > > Signed-off-by: Rafael Antognolli <rafael.antognolli at intel.com>
> > > > ---
> > > > 
> > > > Sample code to trigger the mentioned kernel panic (might need to be executed a
> > > > couple times before it actually breaks everything):
> > > > 
> > > > static void test_sync_expired_merge(void)
> > > > {
> > > >        int iterations = 1 << 20;
> > > >        int timeline;
> > > >        int i;
> > > >        int fence_expired, fence_merged;
> > > > 
> > > >        timeline = sw_sync_timeline_create();
> > > > 
> > > >        sw_sync_timeline_inc(timeline, 100);
> > > >        fence_expired = sw_sync_fence_create(timeline, 1);
> > > >        fence_merged = sw_sync_merge(fence_expired, fence_expired);
> > > >        sw_sync_fence_destroy(fence_merged);
> > > > 
> > > >        for (i = 0; i < iterations; i++) {
> > > >                int fence = sw_sync_merge(fence_expired, fence_expired);
> > > > 
> > > >                igt_assert_f(sw_sync_wait(fence, -1) > 0,
> > > >                                     "Failure waiting on fence\n");
> > > >                sw_sync_fence_destroy(fence);
> > > >        }
> > > > 
> > > >        sw_sync_fence_destroy(fence_expired);
> > > > }
> > > > 
> > > >  drivers/dma-buf/sync_file.c | 7 ++-----
> > > >  1 file changed, 2 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > > index 486d29c..6ce6b8f 100644
> > > > --- a/drivers/dma-buf/sync_file.c
> > > > +++ b/drivers/dma-buf/sync_file.c
> > > > @@ -178,11 +178,8 @@ static struct fence **get_fences(struct sync_file *sync_file, int *num_fences)
> > > >  static void add_fence(struct fence **fences, int *i, struct fence *fence)
> > > >  {
> > > >  	fences[*i] = fence;
> > > > -
> > > > -	if (!fence_is_signaled(fence)) {
> > > > -		fence_get(fence);
> > > > -		(*i)++;
> > > > -	}
> > > > +	fence_get(fence);
> > > > +	(*i)++;
> > > >  }
> > > 
> > > I think you'll find it's the caller:
> > > 
> > > if (i == 0) {
> > > 	add_fence(fences, &i, a_fences[0]);
> > > 	i++;
> > > }
> > > 
> > > that does the unexpected.
> > > 
> > > This should be 
> > > 
> > > if (i == 0)
> > > 	fences[i++] = fence_get(a_fences[0]);
> > 
> > You are right. Otherwise we would miss the extra reference. i == 0 here
> > means that all fences are signalled.
> > 
> > > 
> > > 
> > > That ensures the sync_file inherits the signaled status without having
> > > to keep all fences.
> 
> OK, so if you are building a merged fence out of several signaled
> fences, then you only keep one of them instead? I haven't noticed that
> was the intention.
> 
> > > I think there still seems to be a memory leak when calling
> > > sync_file_set_fence() here with i == 1.
> > 
> > I think that is something we discussed already, we don't hold an extra
> > ref there for i == 1 because it would leak the fence.
> > 
> 
> I agree, we already increase the reference count in add_fence anyway (or
> here, assuming we do what Chris suggested. But I also believe that's the
> cause of confusion.
> 
> IMHO it would be better to call fence_get() inside fence_array_create(),
> so it would match the fence_put in fence_array_release(). And if we have
> only one fence, fence_get it inside sync_file_set_fence().

sync_file is not the only user of fence_array, so I would not push this
behaviour changes to others users. The current behaviour is well
documented inside sync_file_set_fence, not sure we need to change it.

Gustavo


More information about the dri-devel mailing list