[Intel-gfx] [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2

Daniel Vetter daniel at ffwll.ch
Wed Oct 9 14:09:12 UTC 2019


On Wed, Oct 09, 2019 at 03:10:09PM +0200, Christian König wrote:
> Am 08.10.19 um 11:25 schrieb Daniel Vetter:
> > On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> > > This way we can even pipeline imported BO evictions.
> > > 
> > > v2: Limit this to only cases when the parent object uses a separate
> > >      reservation object as well. This fixes another OOM problem.
> > > 
> > > Signed-off-by: Christian König <christian.koenig at amd.com>
> > Since I read quite a bit of ttm I figured I'll review this too, but I'm
> > totally lost. And git blame gives me at best commits with one-liner commit
> > messages, and the docs aren't explaining much at all either (and generally
> > they didn't get updated at all with all the changes in the past years).
> > 
> > I have a vague idea of what you're doing here, but not enough to do review
> > with any confidence. And from other ttm patches from amd it feels a lot
> > like we have essentially a bus factor of 1 for all things ttm :-/
> 
> Yeah, that's one of a couple of reasons why I want to get rid of TTM in the
> long term.
> 
> Basically this is a bug fix for delay freeing ttm objects. When we hang the
> ttm object on a ghost object to be freed and the ttm object is an imported
> DMA-buf we run into the problem that we want to drop the mapping, but have
> the wrong lock taken (the lock of the ghost and not of the parent).

Got intrigued, did some more digging, I guess the bugfix part is related
to:

commit 841e763b40764a7699ae07f4cb1921af62d6316d
Author: Christian König <christian.koenig at amd.com>
Date:   Thu Jul 20 20:55:06 2017 +0200

    drm/ttm: individualize BO reservation obj when they are freed

and that's why you switch everything over to using _resv instead of the
resv pointer. But then I still don't follow the details ...

> 

> Regards,
> Christian.
> 
> > -Daniel
> > 
> > > ---
> > >   drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
> > >   1 file changed, 9 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > index fe81c565e7ef..2ebe9fe7f6c8 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
> > >   	kref_init(&fbo->base.kref);
> > >   	fbo->base.destroy = &ttm_transfered_destroy;
> > >   	fbo->base.acc_size = 0;
> > > -	fbo->base.base.resv = &fbo->base.base._resv;
> > > +	if (bo->base.resv == &bo->base._resv)
> > > +		fbo->base.base.resv = &fbo->base.base._resv;

I got confused a bit at first, until I spotted the

	fbo->base = *bo;

somewhere above. So I think that part makes sense, together with the
patch cited above. At least I think so; my confidence on this is very
low ...
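
To spell out how I read that hunk, here is a minimal userspace sketch.
It is not the kernel code: struct sketch_resv, struct sketch_bo and the
sketch_* helpers are made-up stand-ins, and only the resv/_resv naming
is taken from the diff. The assumption is that the struct copy leaves
the ghost's resv pointer equal to the parent's, so it only needs
re-pointing when the parent was using its own embedded _resv.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct sketch_resv {
	int locked;
};

struct sketch_bo {
	struct sketch_resv *resv;  /* resv in use; may belong to an exporter */
	struct sketch_resv _resv;  /* embedded, per-object resv */
};

/* Same order as the quoted hunk: copy the parent first, then fix up. */
static void sketch_transfer(struct sketch_bo *ghost, const struct sketch_bo *bo)
{
	*ghost = *bo;  /* ghost->resv now equals bo->resv */

	/* Only re-point when the parent used its own embedded resv. */
	if (bo->resv == &bo->_resv)
		ghost->resv = &ghost->_resv;
}

/* True when the ghost ends up sharing the parent's resv pointer. */
static bool sketch_shares_parent_resv(const struct sketch_bo *bo)
{
	struct sketch_bo ghost;

	sketch_transfer(&ghost, bo);
	return ghost.resv == bo->resv;
}
```

In this model a parent whose resv points at its own _resv gets a ghost
with a private resv, while a parent with an external resv (the imported
case) gets a ghost that keeps pointing at the parent's resv.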

> > > +
> > >   	dma_resv_init(fbo->base.base.resv);
> > >   	ret = dma_resv_trylock(fbo->base.base.resv);

Shouldn't this be switched over to _resv too? Otherwise it feels like
unbalanced locking, since the unlock paths below all use
&ghost_obj->base._resv.
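
A toy model of what I mean by unbalanced, in case it helps. Everything
in it is hypothetical (struct lk_resv, struct lk_bo, the lk_* helpers
and the plain integer lock counter); it only mimics the pattern of
taking the lock through ->resv and dropping it through ->_resv, and
says nothing about what the real dma_resv functions do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock model: a plain counter instead of a real ww_mutex. */
struct lk_resv {
	int lock_count;
};

struct lk_bo {
	struct lk_resv *resv;  /* resv in use; may belong to the parent */
	struct lk_resv _resv;  /* embedded, per-object resv */
};

static bool lk_trylock(struct lk_resv *r)
{
	if (r->lock_count)
		return false;
	r->lock_count = 1;
	return true;
}

static void lk_unlock(struct lk_resv *r)
{
	r->lock_count--;  /* goes negative if r was never locked */
}

/*
 * Lock through ->resv and unlock through ->_resv, as the hunks read.
 * Returns true when both counters end up at zero again.
 */
static bool lk_lock_resv_unlock_embedded(struct lk_bo *ghost)
{
	struct lk_resv *locked = ghost->resv;

	if (!lk_trylock(locked))
		return false;

	lk_unlock(&ghost->_resv);

	return locked->lock_count == 0 && ghost->_resv.lock_count == 0;
}
```

When ->resv points at the ghost's own _resv the pair balances. When it
points at some other resv, the lock taken there is never dropped and
the embedded one is unlocked without having been locked.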

> > >   	WARN_ON(!ret);
> > > @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >   		if (ret)
> > >   			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >   		/**
> > >   		 * If we're not moving to fixed memory, the TTM object
> > > @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > >   		else
> > >   			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);
> > >   		ttm_bo_put(ghost_obj);
> > >   	}
> > > @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >   		if (ret)
> > >   			return ret;
> > > -		dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > +		dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > >   		/**
> > >   		 * If we're not moving to fixed memory, the TTM object
> > > @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > >   		else
> > >   			bo->ttm = NULL;
> > > -		ttm_bo_unreserve(ghost_obj);
> > > +		dma_resv_unlock(&ghost_obj->base._resv);

I guess dropping the LRU part of ttm_bo_unreserve() here (aside from
switching from ->resv to ->_resv, which I think is your bugfix) doesn't
matter, since the ghost object got all cleared up and isn't on any lists
anyway? Otoh, how does it work then ...

It's not clear to me why this is safe.

> > >   		ttm_bo_put(ghost_obj);
> > >   	} else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> > > @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >   	if (ret)
> > >   		return ret;
> > > -	ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> > > +	ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
> > >   	/* Last resort, wait for the BO to be idle when we are OOM */
> > >   	if (ret)
> > >   		ttm_bo_wait(bo, false, false);
> > > @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > >   	bo->mem.mem_type = TTM_PL_SYSTEM;
> > >   	bo->ttm = NULL;
> > > -	ttm_bo_unreserve(ghost);
> > > +	dma_resv_unlock(&ghost->base._resv);
> > >   	ttm_bo_put(ghost);
> > >   	return 0;
> > > -- 
> > > 2.17.1
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

