[PATCH 1/5] drm/vmwgfx: unbind in vmw_ttm_unpopulate

Daniel Vetter daniel at ffwll.ch
Fri Jul 23 09:21:56 UTC 2021


On Fri, Jul 23, 2021 at 11:13 AM Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> Am 23.07.21 um 10:47 schrieb Daniel Vetter:
> > On Thu, Jul 22, 2021 at 02:41:23PM +0200, Christian König wrote:
> >> Doing this in vmw_ttm_destroy() is to late.
> >>
> >> It turned out that this is not a good idea at all because it leaves pointers
> >> to freed up system memory pages in the GART tables of the drivers.
> > So I wanted to review this series, and I can't reconcile your claim here
> > with the demidlayering Dave has done. The driver patches here don't
> > ouright undo what Dave has done, but that means the bug has been
> > preexisting since forever (or is due to some other change?), and your
> > commit message is a bit confusing here.
> >
> > The final patch just undoes the demidlayering from Dave, and I really
> > don't see where there's even a functional change there.
> >
> > And even these patches here don't really change a hole lot with the
> > calling sequence for at least final teardown: ttm_tt_destroy_common calls
> > ttm_tt_unpopulate as the first thing, so at least there there's no change.
> >
> > Can you pls elaborate more clearly what exactly you're fixing and what
> > exactly needs to be reordered and where this bug is from (commit sha1)? As
> > is I'm playing detective and the evidence presented is extremely since and
> > I can't reconcile it at all.
> >
> > I mean I know you don't like typing commit message and documentation, but
> > it does get occasionally rather frustrating on the reviewer side if I have
> > to interpolate between some very sparse hints for this stuff :-/
>
> Yeah, when have seen the history it's rather obvious what's wrong here
> and I expected Dave to review it himself.
>
> Previously we had three states in TTM for a tt object: Allocated ->
> Populated -> Bound which on destruction where done in the order unbind
> -> unpopulate -> free.
>
> Dave moved handling of the bound state into the drivers since it is
> basically a driver decision and not a TTM decision what should be bound
> and what not (that part perfectly makes sense).

I haven't reviewed all the patches from Dave, only the one you pointed
at (in the last patch). And that one I still can't match up with your
description. If there's other commits relevant, can you pls dig them
out?

Like it all makes sense what you're saying and matches the code, I
just can't match it up with the commit you're referencing.

> The problem is that he also moved doing the unbind into the free
> callback instead of the unpopulate callback. This result in stale page
> pointers in the GART if that unpopulate operation isn't immediately
> followed by a free.
>
> Thinking more about it if we do populated->unpopulated->populated then
> we would also have stale pointers to the old pages which is even worse.
>
> This is also not de-midlayering since we already have a proper
> ttm_tt_init()/ttm_tt_fini() functions which should work nicely for the
> tt object.

Well you're last patch moves the ttm_tt_destroy_common stuff back into
ttm, which kinda is de-demidlayering. So I'm confused.

Other bit: I think it'd be good to document this properly in the
callbacks, and maybe ideally go about and kerneldoc-ify the entire
ttm_tt.h header. Otherwise when we eventually (never?) get around to
that, everyone has forgotten these semantic details and issues again.
-Daniel

> Christian.
>
> > -Daniel
> >
> >> Signed-off-by: Christian König <christian.koenig at amd.com>
> >> ---
> >>   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 9 +++------
> >>   1 file changed, 3 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
> >> index b0973c27e774..904031d03dbe 100644
> >> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
> >> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
> >> @@ -526,14 +526,9 @@ static void vmw_ttm_destroy(struct ttm_device *bdev, struct ttm_tt *ttm)
> >>      struct vmw_ttm_tt *vmw_be =
> >>              container_of(ttm, struct vmw_ttm_tt, dma_ttm);
> >>
> >> -    vmw_ttm_unbind(bdev, ttm);
> >>      ttm_tt_destroy_common(bdev, ttm);
> >>      vmw_ttm_unmap_dma(vmw_be);
> >> -    if (vmw_be->dev_priv->map_mode == vmw_dma_alloc_coherent)
> >> -            ttm_tt_fini(&vmw_be->dma_ttm);
> >> -    else
> >> -            ttm_tt_fini(ttm);
> >> -
> >> +    ttm_tt_fini(ttm);
> >>      if (vmw_be->mob)
> >>              vmw_mob_destroy(vmw_be->mob);
> >>
> >> @@ -578,6 +573,8 @@ static void vmw_ttm_unpopulate(struct ttm_device *bdev,
> >>                                               dma_ttm);
> >>      unsigned int i;
> >>
> >> +    vmw_ttm_unbind(bdev, ttm);
> >> +
> >>      if (vmw_tt->mob) {
> >>              vmw_mob_destroy(vmw_tt->mob);
> >>              vmw_tt->mob = NULL;
> >> --
> >> 2.25.1
> >>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list