[Intel-gfx] [PATCH 2/4] drm/ttm: use the parent resv for ghost objects v2
Daniel Vetter
daniel at ffwll.ch
Wed Oct 9 14:09:12 UTC 2019
On Wed, Oct 09, 2019 at 03:10:09PM +0200, Christian König wrote:
> Am 08.10.19 um 11:25 schrieb Daniel Vetter:
> > On Thu, Aug 29, 2019 at 04:29:15PM +0200, Christian König wrote:
> > > This way we can even pipeline imported BO evictions.
> > >
> > > v2: Limit this to only cases when the parent object uses a separate
> > > reservation object as well. This fixes another OOM problem.
> > >
> > > Signed-off-by: Christian König <christian.koenig at amd.com>
> > Since I read quite a bit of ttm I figured I'll review this too, but I'm
> > totally lost. And git blame gives me at best commits with one-liner commit
> > messages, and the docs aren't explaining much at all either (and generally
> > they didn't get updated at all with all the changes in the past years).
> >
> > I have a vague idea of what you're doing here, but not enough to do review
> > with any confidence. And from other ttm patches from amd it feels a lot
> > like we have essentially a bus factor of 1 for all things ttm :-/
>
> Yeah, that's one of a couple of reasons why I want to get rid of TTM in the
> long term.
>
> Basically this is a bug fix for delayed freeing of ttm objects. When we hang the
> ttm object on a ghost object to be freed and the ttm object is an imported
> DMA-buf we run into the problem that we want to drop the mapping, but have
> the wrong lock taken (the lock of the ghost and not of the parent).
Got intrigued, did some more digging, I guess the bugfix part is related
to:
commit 841e763b40764a7699ae07f4cb1921af62d6316d
Author: Christian König <christian.koenig at amd.com>
Date:   Thu Jul 20 20:55:06 2017 +0200

    drm/ttm: individualize BO reservation obj when they are freed

and that's why you switch everything over to using _resv instead of the
pointer. But then I still don't follow the details ...
>
> Regards,
> Christian.
>
> > -Daniel
> >
> > > ---
> > > drivers/gpu/drm/ttm/ttm_bo_util.c | 16 +++++++++-------
> > > 1 file changed, 9 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > index fe81c565e7ef..2ebe9fe7f6c8 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > @@ -517,7 +517,9 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo,
> > > kref_init(&fbo->base.kref);
> > > fbo->base.destroy = &ttm_transfered_destroy;
> > > fbo->base.acc_size = 0;
> > > - fbo->base.base.resv = &fbo->base.base._resv;
> > > + if (bo->base.resv == &bo->base._resv)
> > > + fbo->base.base.resv = &fbo->base.base._resv;
I got confused a bit at first, until I spotted the
fbo->base = *bo;
somewhere above. So I think that part makes sense, together with the above
cited patch. I think at least, confidence on this is very low ...
> > > +
> > > dma_resv_init(fbo->base.base.resv);
> > > ret = dma_resv_trylock(fbo->base.base.resv);
Shouldn't this be switched over to _resv too? Otherwise this feels like
unbalanced locking.
> > > WARN_ON(!ret);
> > > @@ -716,7 +718,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > > if (ret)
> > > return ret;
> > > - dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > + dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > > /**
> > > * If we're not moving to fixed memory, the TTM object
> > > @@ -729,7 +731,7 @@ int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > > else
> > > bo->ttm = NULL;
> > > - ttm_bo_unreserve(ghost_obj);
> > > + dma_resv_unlock(&ghost_obj->base._resv);
> > > ttm_bo_put(ghost_obj);
> > > }
> > > @@ -772,7 +774,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > > if (ret)
> > > return ret;
> > > - dma_resv_add_excl_fence(ghost_obj->base.resv, fence);
> > > + dma_resv_add_excl_fence(&ghost_obj->base._resv, fence);
> > > /**
> > > * If we're not moving to fixed memory, the TTM object
> > > @@ -785,7 +787,7 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
> > > else
> > > bo->ttm = NULL;
> > > - ttm_bo_unreserve(ghost_obj);
> > > + dma_resv_unlock(&ghost_obj->base._resv);
I guess dropping the lru part here (aside from switching from ->resv to
->_resv, which is your bugfix I think) doesn't matter since the ghost
object got all cleared up and isn't on any lists anyway? Otoh how does it
work then ...
Not clear to me why this is safe.
> > > ttm_bo_put(ghost_obj);
> > > } else if (from->flags & TTM_MEMTYPE_FLAG_FIXED) {
> > > @@ -841,7 +843,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > > if (ret)
> > > return ret;
> > > - ret = dma_resv_copy_fences(ghost->base.resv, bo->base.resv);
> > > + ret = dma_resv_copy_fences(&ghost->base._resv, bo->base.resv);
> > > /* Last resort, wait for the BO to be idle when we are OOM */
> > > if (ret)
> > > ttm_bo_wait(bo, false, false);
> > > @@ -850,7 +852,7 @@ int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
> > > bo->mem.mem_type = TTM_PL_SYSTEM;
> > > bo->ttm = NULL;
> > > - ttm_bo_unreserve(ghost);
> > > + dma_resv_unlock(&ghost->base._resv);
> > > ttm_bo_put(ghost);
> > > return 0;
> > > --
> > > 2.17.1
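
To spell out what I think the ttm_buffer_object_transfer hunk does, and why
the trylock looks unbalanced to me, here is a toy userspace model. Everything
named mock_* is made up for illustration; these are not the real ttm or
dma_resv types, and the "lock" is just a flag. It also assumes the shared
reservation object starts out unlocked, which may well not match the real
call paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for dma_resv and the BO; not the kernel types. */
struct mock_resv {
	bool locked;
};

struct mock_bo {
	struct mock_resv *resv;	/* reservation object actually in use */
	struct mock_resv _resv;	/* embedded, per-object reservation object */
};

static bool mock_trylock(struct mock_resv *r)
{
	if (r->locked)
		return false;
	r->locked = true;
	return true;
}

static void mock_unlock(struct mock_resv *r)
{
	r->locked = false;
}

/* Models the patched hunk: struct copy first, then conditional re-pointing. */
static void mock_transfer(struct mock_bo *ghost, const struct mock_bo *parent)
{
	*ghost = *parent;		/* copies the resv pointer verbatim */
	ghost->_resv.locked = false;	/* assumption: ghost's own object starts fresh */

	/* Only switch to the ghost's own object if the parent used its own. */
	if (parent->resv == &parent->_resv)
		ghost->resv = &ghost->_resv;
}

/*
 * Lock through ->resv, unlock through ->_resv, as the patched code does.
 * Returns true if ->resv ends up unlocked afterwards.
 */
static bool mock_locking_balanced(struct mock_bo *ghost)
{
	mock_trylock(ghost->resv);	/* like the trylock on the resv pointer */
	mock_unlock(&ghost->_resv);	/* like the unlock on the embedded _resv */
	return !ghost->resv->locked;
}
```

In this model a parent with resv == &_resv gives a ghost whose lock and unlock
hit the same object, so mock_locking_balanced() returns true. A parent whose
resv points at a separate shared object gives a ghost that still points at
that shared object, and mock_locking_balanced() returns false because the
unlock goes to the ghost's embedded _resv instead. Whether the real code
behaves like this depends on details I haven't checked.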
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch