[PATCH] drm: fix drm_framebuffer cleanup.

Ville Syrjälä ville.syrjala at linux.intel.com
Mon Nov 12 04:28:40 PST 2012


On Sat, Nov 10, 2012 at 10:09:02AM +0900, Inki Dae wrote:
> 2012/11/10 Ville Syrjälä <ville.syrjala at linux.intel.com>
> 
> > On Sat, Nov 10, 2012 at 01:50:53AM +0900, Inki Dae wrote:
> > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > >
> > > > On Fri, Nov 09, 2012 at 11:04:58PM +0900, Inki Dae wrote:
> > > > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > >
> > > > > > On Fri, Nov 09, 2012 at 09:41:19PM +0900, Inki Dae wrote:
> > > > > > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > > > >
> > > > > > > > On Fri, Nov 09, 2012 at 04:39:30PM +0900, Inki Dae wrote:
> > > > > > > > > This patch fixes access issue to invalid memory region.
> > > > > > > > >
> > > > > > > > > crtc had only one drm_framebuffer object so when framebuffer
> > > > > > > > > cleanup was requested after page flip, it'd try to disable
> > > > > > > > > hardware overlay to current crtc.
> > > > > > > > > But if current crtc points to another drm_framebuffer,
> > > > > > > > > This may induce invalid memory access.
> > > > > > > > >
> > > > > > > > > Assume that some application are doing page flip with two
> > > > > > > > > drm_framebuffer objects. At this time, if second
> > drm_framebuffer
> > > > > > > > > is set to current crtc and the cleanup to first
> > drm_framebuffer
> > > > > > > > > is requested once drm release is requested, then first
> > > > > > > > > drm_framebuffer would be cleaned up without disabling
> > > > > > > > > hardware overlay because current crtc points to second
> > > > > > > > > drm_framebuffer. After that, gem buffer to first
> > drm_framebuffer
> > > > > > > > > would be released and this makes dma access invalid memory
> > > > region.
> > > > > > > > >
> > > > > > > > > This patch adds drm_framebuffer to drm_crtc structure as
> > member
> > > > > > > >
> > > > > > > > That is exactly the reverse of what you're doing in the patch.
> > > > > > > > We already have crtc.fb, and you're adding fb.crtc.
> > > > > > > >
> > > > > > > > > and makes drm_framebuffer_cleanup function check if fb->crtc
> > is
> > > > > > > > > same as desired fb. And also when setcrtc and pageflip are
> > > > > > > > > requested, it makes each drm_framebuffer point to current
> > crtc.
> > > > > > > >
> > > > > > > > Looks like you're just setting the crtc.fb and fb.crtc
> > pointers to
> > > > > > > > exactly mirror each other in the page flip ioctl. That can't
> > fix
> > > > > > > > anything.
> > > > > > > >
> > > > > > > >
> > > > > > > At least, when drm is released, the crtc to framebuffer that was
> > > > recently
> > > > > > > used for scanout can be disabled correctly.
> > > > > >
> > > > > > Let's take this scenario:
> > > > > >
> > > > > > setcrtc(crtc0, fb0)
> > > > > >  -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > > > > page_flip(crtc0, fb1)
> > > > > >  -> crtc0.fb = fb1, fb1.crtc = crtc0
> > > > > > rmfb(fb0)
> > > > > >  -> ?
> > > > > >
> > > > > > The rmfb() will now disable crtc0 because fb0.crtc == crtc0, but
> > > > > > let's consider this just a bug and ignore it for now. So let's
> > assume
> > > > > > that crtc0 won't be disabled when we call rmfb(fb0).
> > > > > >
> > > > > >
> > > > > Ok, Please assume that my patch includes the below codes. when we
> > call
> > > > > rmfb(fb0), fb0->crtc is NULL. so fb0 will be freed without disabling
> > > > crtc.
> > > > >
> > > > > int drm_mode_rmfb(struct drm_device *dev, void *data, struct drm_file
> > > > > *file_priv) {
> > > > > ....
> > > > > fb->crtc = NULL;
> > > > > fb->funcs->destroy(fb);
> > > > > out:
> > > > > ...
> > > > > }
> > > >
> > > > As stated above, I already assumed that.
> > > >
> > > > > > The real question is, how would your patch help? You can't free fb0
> > > > > > until the page flip has occured, and your patch doesn't track page
> > > > > > flips, so how can it do anything useful?
> > > > > >
> > > > > > > > First of all multiple crtcs can scan out from the same fb, so a
> > > > single
> > > > > > > > fb.crtc pointer is clearly not enough to represent the
> > relationship
> > > > > > > > between fbs and crtcs.
> > > > > > > >
> > > > > > > > Also you're not clearing the fb.crtc pointer anywhere, so now
> > > > > > > > destroying any framebuffer that was once used for scanout,
> > would
> > > > > > disable
> > > > > > > > some crtc.
> > > > > > > >
> > > > > > > >
> > > > > > > It looks like that you missed my previous comment. Plaese, see
> > the
> > > > > > previous
> > > > > > > comment. And that was already pointed out by Prathyush. fb.crtc
> > will
> > > > be
> > > > > > > cleared at drm_mode_rmfb(). With this, is there my missing
> > point?  I
> > > > > > really
> > > > > > > wonder that with this patch, drm framwork could be faced with any
> > > > > > problem.
> > > > > >
> > > > > > If you clear the fb.crtc pointer before destroying the fb, then you
> > > > > > never disable any crtcs in response to removing a framebuffer.
> > > > > > That's not what we want when the fb is the current fb for the crtc.
> > > > > >
> > > > >
> > > > > Right, there is my missing point. How about this then? Couldn't  we
> > check
> > > > > this fb is last one or not? And if last one, it makes fb->crtc keep
> > as
> > > > is.
> > > >
> > > > That won't help, It would just make the behaviour of your code
> > > > identical to the behaviour of the current code.
> > > >
> > > > > > > > So it looks like you're making things worse, not better.
> > > > > > >
> > > > > > > Anyway I'll just nack the whole idea. What you need to do is
> > track
> > > the
> > > > > > > pending page flips, and make sure the old buffer is not freed too
> > > > > early.
> > > > > > > Just grab a reference to the old gem object when issuing the page
> > > flip,
> > > > > > > and unref it when you're sure the flip has occured. Or you could
> > use
> > > > > the
> > > > > > > new drm_framebuffer refcount, but personally I don't see much
> > point
> > > in
> > > > > > > that when you already have the gem object refcount at your
> > disposal.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > And it seems like that your saying is that each specific drivers
> > > > > > should resolve this issue(access to invalid memory region) tracking
> > > the
> > > > > > pending page flip. But I think that this issue may be a missing
> > point
> > > drm
> > > > > > framework missed so specific drivers is processing this instead.
> > And
> > > with
> > > > > > this patch, couldn't some codes you mentioned be removed from
> > specific
> > > > > > drivers? because it doesn't need to track the pending page flips
> > and
> > > > > > relevant things anymore.
> > > > > >
> > > > > > There may be my missing point. I'd welcome to any comments and
> > > advices.
> > > > >
> > > > > If you don't track the page flips somehow, you can't safely free the
> > > > > previous buffer. It's that simple. You can of course make some of
> > > > > that code generic enough to be shared between drivers. That is
> > > > > actually what the drm_flip mechanism that I wrote for atomic page
> > > > > flips is; a reasonably generic helper for tracking page flips.
> > > > >
> > > > >
> > > > Thanks for comments. Right, now Exynos driver doesn't consider tracking
> > > > page flips yet. This is my TODO. Anyway couldn't we improve this patch
> > so
> > > > that drm framework considers such thing?
> > >
> > > > No, because it depends on hardware specific information. The drm core
> > > > code doesn't have a crystall ball, so it can't just magically know when
> > > > a page flip completes.
> > > >
> > > > > I think with this patch, we could avoid invald memory access issue
> > > > without
> > > > > tracking page flips. In this case, pended page flips would be just
> > > > ignored.
> > > >
> > > > I still don't understand how the patch is supposed to help. If you make
> > > > the proposed change to rmfb() so that it doesn't disable the crtc when
> > > > you remove a non-current fb, then this would happen:
> > > >
> > > > setcrtc(crtc0, fb0)
> > > >  -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > > pageflip(crtc0, fb1);
> > > >  -> crtc0.fb = fb1, fb1.crtc = crtc0
> > > > rmfb(fb0);
> > > >  -> fb0.crtc = NULL;
> > > >     destroy(fb0);
> > > >
> > > > That is exactly the same behaviour we have today, so you still get
> > > > the invalid memory access to fb0 if the page flip hasn't occured yet.
> > > > And that means the fb.crtc pointer is entirely pointless.
> > > >
> > > >
> > > I don't think so. let's see it again. below is my idea AND the reason I
> > > posted this patch.
> > > Original codes,
> > > gem alloc(gem0);
> > > -> gem0 refcount = 1
> > > gem0 mmap
> > > -> gem0 refcount = 2
> > > gem alloc(gem1);
> > > -> gem1 refcount =1
> > > gem1 mmap
> > > -> gem1 refcount =2
> > > addfb(fb0, gem0);
> > > -> gem0 refcount=3
> > > addfb(fb1,gem1);
> > > -> gem1 refcount = 3
> > > setcrtc(fb0, crtc0)
> > > -> crtc0.fb = fb0
> > > pageflip(crtc0, fb1);
> > > -> crtc0.fb = fb1.
> > > and pageflip is repeated
> > >
> > > close(gem0)
> > > -> gem0 refcount = 2
> > > close(gem1)
> > > ->gem1 refcount = 2
> > > munmap(gem0)
> > > ->gem0 refcount = 1
> > > munmap(gem1)
> > > ->gem1 refcount = 1
> > >
> > > close(drm)
> > > 1. fb release
> > > -> check if crtc->fb is same as fb0 but current crtc is pointing to fb1
> > > 2. so free fb0 without disabling current crtc.
> > > -> gem0 refcount = 0 so released. At this time, dma access invalid memory
> > > region unfortunately* *if the dma is accessing gem0.
> > >
> > >
> > >
> > > With my patch,
> > > ...
> > > setcrtc(fb0, crtc0)
> > > -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > pageflip(crtc0, fb1);
> > > -> crtc0.fb = fb1, fb1.crtc = crtc0.
> > > and pageflip is repeated
> > >
> > > close(gem0)
> > > -> gem0 refcount = 2
> > > close(gem1)
> > > ->gem1 refcount = 2
> > > munmap(gem0)
> > > ->gem0 refcount = 1
> > > munmap(gem1)
> > > ->gem1 refcount = 1
> > > close(drm)
> > > 1. fb release
> > > -> fb0->crtc is same as crtc so disable crtc (dma stop)
> >
> > No, that's wrong. The current fb is fb1, so destroying fb0 must not
> > disable the crtc.
> >
> >
> Do you think this is wrong that when fb0 is destroyed, the crtc also is
> disabled and crtc is pointing to fb1?

Yes. You must disable the crtc if and only if you destroy the current
fb. Current fb means the fb specified by the latest setcrtc or pageflip
ioctl. Destroying any other fb must not affect the crtc. And this is
exactly what the current code does.

> And in case of Exynos driver,
> disabling the crtc would disable hardware overlay so other overlays would
> be holded on as is. This is just Exynos case so I don't know how other SoCs
> are operated. Could you explain how Desktop side is operated? Maybe there
> is my missing point here.

I'm not sure what you're asking.

Are you asking how other drivers know know when it's safe to free the
memory? Well, the usual method is to track the page flips via some
suitable interrupts. A reference is kept to the memory until it's safe
to get rid of it.

Or are you asking what should happen when a crtc is disabled? Then my
answer is that when a crtc is disabled, there must be no output signal
whatsoever.

It's an unfortunate design bug that we even have the fb->crtc link.
Ideally it would not exist, and all scanout duties would be handled
by planes. Then destroying fbs would never disable the crtc completely,
just the planes scanning from those fbs.

-- 
Ville Syrjälä
Intel OTC


More information about the dri-devel mailing list