[PATCH] drm: fix drm_framebuffer cleanup.

Inki Dae inki.dae at samsung.com
Mon Nov 12 06:36:04 PST 2012


2012/11/12 Ville Syrjälä <ville.syrjala at linux.intel.com>

> On Sat, Nov 10, 2012 at 10:09:02AM +0900, Inki Dae wrote:
> > 2012/11/10 Ville Syrjälä <ville.syrjala at linux.intel.com>
> >
> > > On Sat, Nov 10, 2012 at 01:50:53AM +0900, Inki Dae wrote:
> > > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > >
> > > > > On Fri, Nov 09, 2012 at 11:04:58PM +0900, Inki Dae wrote:
> > > > > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > > >
> > > > > > > On Fri, Nov 09, 2012 at 09:41:19PM +0900, Inki Dae wrote:
> > > > > > > > 2012/11/9 Ville Syrjälä <ville.syrjala at linux.intel.com>
> > > > > > > >
> > > > > > > > > On Fri, Nov 09, 2012 at 04:39:30PM +0900, Inki Dae wrote:
> > > > > > > > > > This patch fixes access issue to invalid memory region.
> > > > > > > > > >
> > > > > > > > > > crtc had only one drm_framebuffer object so when
> framebuffer
> > > > > > > > > > cleanup was requested after page flip, it'd try to
> disable
> > > > > > > > > > hardware overlay to current crtc.
> > > > > > > > > > But if current crtc points to another drm_framebuffer,
> > > > > > > > > > This may induce invalid memory access.
> > > > > > > > > >
> > > > > > > > > > Assume that some application are doing page flip with two
> > > > > > > > > > drm_framebuffer objects. At this time, if second
> > > drm_framebuffer
> > > > > > > > > > is set to current crtc and the cleanup to first
> > > drm_framebuffer
> > > > > > > > > > is requested once drm release is requested, then first
> > > > > > > > > > drm_framebuffer would be cleaned up without disabling
> > > > > > > > > > hardware overlay because current crtc points to second
> > > > > > > > > > drm_framebuffer. After that, gem buffer to first
> > > drm_framebuffer
> > > > > > > > > > would be released and this makes dma access invalid
> memory
> > > > > region.
> > > > > > > > > >
> > > > > > > > > > This patch adds drm_framebuffer to drm_crtc structure as
> > > member
> > > > > > > > >
> > > > > > > > > That is exactly the reverse of what you're doing in the
> patch.
> > > > > > > > > We already have crtc.fb, and you're adding fb.crtc.
> > > > > > > > >
> > > > > > > > > > and makes drm_framebuffer_cleanup function check if
> fb->crtc
> > > is
> > > > > > > > > > same as desired fb. And also when setcrtc and pageflip
> are
> > > > > > > > > > requested, it makes each drm_framebuffer point to current
> > > crtc.
> > > > > > > > >
> > > > > > > > > Looks like you're just setting the crtc.fb and fb.crtc
> > > pointers to
> > > > > > > > > exactly mirror each other in the page flip ioctl. That
> can't
> > > fix
> > > > > > > > > anything.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > At least, when drm is released, the crtc to framebuffer that
> was
> > > > > recently
> > > > > > > > used for scanout can be disabled correctly.
> > > > > > >
> > > > > > > Let's take this scenario:
> > > > > > >
> > > > > > > setcrtc(crtc0, fb0)
> > > > > > >  -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > > > > > page_flip(crtc0, fb1)
> > > > > > >  -> crtc0.fb = fb1, fb1.crtc = crtc0
> > > > > > > rmfb(fb0)
> > > > > > >  -> ?
> > > > > > >
> > > > > > > The rmfb() will now disable crtc0 because fb0.crtc == crtc0,
> but
> > > > > > > let's consider this just a bug and ignore it for now. So let's
> > > assume
> > > > > > > that crtc0 won't be disabled when we call rmfb(fb0).
> > > > > > >
> > > > > > >
> > > > > > Ok, Please assume that my patch includes the below codes. when we
> > > call
> > > > > > rmfb(fb0), fb0->crtc is NULL. so fb0 will be freed without
> disabling
> > > > > crtc.
> > > > > >
> > > > > > int drm_mode_rmfb(struct drm_device *dev, void *data, struct
> drm_file
> > > > > > *file_priv) {
> > > > > > ....
> > > > > > fb->crtc = NULL;
> > > > > > fb->funcs->destroy(fb);
> > > > > > out:
> > > > > > ...
> > > > > > }
> > > > >
> > > > > As stated above, I already assumed that.
> > > > >
> > > > > > > The real question is, how would your patch help? You can't
> free fb0
> > > > > > > until the page flip has occured, and your patch doesn't track
> page
> > > > > > > flips, so how can it do anything useful?
> > > > > > >
> > > > > > > > > First of all multiple crtcs can scan out from the same fb,
> so a
> > > > > single
> > > > > > > > > fb.crtc pointer is clearly not enough to represent the
> > > relationship
> > > > > > > > > between fbs and crtcs.
> > > > > > > > >
> > > > > > > > > Also you're not clearing the fb.crtc pointer anywhere, so
> now
> > > > > > > > > destroying any framebuffer that was once used for scanout,
> > > would
> > > > > > > disable
> > > > > > > > > some crtc.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > It looks like that you missed my previous comment. Plaese,
> see
> > > the
> > > > > > > previous
> > > > > > > > comment. And that was already pointed out by Prathyush.
> fb.crtc
> > > will
> > > > > be
> > > > > > > > cleared at drm_mode_rmfb(). With this, is there my missing
> > > point?  I
> > > > > > > really
> > > > > > > > wonder that with this patch, drm framwork could be faced
> with any
> > > > > > > problem.
> > > > > > >
> > > > > > > If you clear the fb.crtc pointer before destroying the fb,
> then you
> > > > > > > never disable any crtcs in response to removing a framebuffer.
> > > > > > > That's not what we want when the fb is the current fb for the
> crtc.
> > > > > > >
> > > > > >
> > > > > > Right, there is my missing point. How about this then? Couldn't
>  we
> > > check
> > > > > > this fb is last one or not? And if last one, it makes fb->crtc
> keep
> > > as
> > > > > is.
> > > > >
> > > > > That won't help, It would just make the behaviour of your code
> > > > > identical to the behaviour of the current code.
> > > > >
> > > > > > > > > So it looks like you're making things worse, not better.
> > > > > > > >
> > > > > > > > Anyway I'll just nack the whole idea. What you need to do is
> > > track
> > > > the
> > > > > > > > pending page flips, and make sure the old buffer is not
> freed too
> > > > > > early.
> > > > > > > > Just grab a reference to the old gem object when issuing the
> page
> > > > flip,
> > > > > > > > and unref it when you're sure the flip has occured. Or you
> could
> > > use
> > > > > > the
> > > > > > > > new drm_framebuffer refcount, but personally I don't see much
> > > point
> > > > in
> > > > > > > > that when you already have the gem object refcount at your
> > > disposal.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > And it seems like that your saying is that each specific
> drivers
> > > > > > > should resolve this issue(access to invalid memory region)
> tracking
> > > > the
> > > > > > > pending page flip. But I think that this issue may be a missing
> > > point
> > > > drm
> > > > > > > framework missed so specific drivers is processing this
> instead.
> > > And
> > > > with
> > > > > > > this patch, couldn't some codes you mentioned be removed from
> > > specific
> > > > > > > drivers? because it doesn't need to track the pending page
> flips
> > > and
> > > > > > > relevant things anymore.
> > > > > > >
> > > > > > > There may be my missing point. I'd welcome to any comments and
> > > > advices.
> > > > > >
> > > > > > If you don't track the page flips somehow, you can't safely free
> the
> > > > > > previous buffer. It's that simple. You can of course make some of
> > > > > > that code generic enough to be shared between drivers. That is
> > > > > > actually what the drm_flip mechanism that I wrote for atomic page
> > > > > > flips is; a reasonably generic helper for tracking page flips.
> > > > > >
> > > > > >
> > > > > Thanks for comments. Right, now Exynos driver doesn't consider
> tracking
> > > > > page flips yet. This is my TODO. Anyway couldn't we improve this
> patch
> > > so
> > > > > that drm framework considers such thing?
> > > >
> > > > > No, because it depends on hardware specific information. The drm
> core
> > > > > code doesn't have a crystall ball, so it can't just magically know
> when
> > > > > a page flip completes.
> > > > >
> > > > > > I think with this patch, we could avoid invald memory access
> issue
> > > > > without
> > > > > > tracking page flips. In this case, pended page flips would be
> just
> > > > > ignored.
> > > > >
> > > > > I still don't understand how the patch is supposed to help. If you
> make
> > > > > the proposed change to rmfb() so that it doesn't disable the crtc
> when
> > > > > you remove a non-current fb, then this would happen:
> > > > >
> > > > > setcrtc(crtc0, fb0)
> > > > >  -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > > > pageflip(crtc0, fb1);
> > > > >  -> crtc0.fb = fb1, fb1.crtc = crtc0
> > > > > rmfb(fb0);
> > > > >  -> fb0.crtc = NULL;
> > > > >     destroy(fb0);
> > > > >
> > > > > That is exactly the same behaviour we have today, so you still get
> > > > > the invalid memory access to fb0 if the page flip hasn't occured
> yet.
> > > > > And that means the fb.crtc pointer is entirely pointless.
> > > > >
> > > > >
> > > > I don't think so. let's see it again. below is my idea AND the
> reason I
> > > > posted this patch.
> > > > Original codes,
> > > > gem alloc(gem0);
> > > > -> gem0 refcount = 1
> > > > gem0 mmap
> > > > -> gem0 refcount = 2
> > > > gem alloc(gem1);
> > > > -> gem1 refcount =1
> > > > gem1 mmap
> > > > -> gem1 refcount =2
> > > > addfb(fb0, gem0);
> > > > -> gem0 refcount=3
> > > > addfb(fb1,gem1);
> > > > -> gem1 refcount = 3
> > > > setcrtc(fb0, crtc0)
> > > > -> crtc0.fb = fb0
> > > > pageflip(crtc0, fb1);
> > > > -> crtc0.fb = fb1.
> > > > and pageflip is repeated
> > > >
> > > > close(gem0)
> > > > -> gem0 refcount = 2
> > > > close(gem1)
> > > > ->gem1 refcount = 2
> > > > munmap(gem0)
> > > > ->gem0 refcount = 1
> > > > munmap(gem1)
> > > > ->gem1 refcount = 1
> > > >
> > > > close(drm)
> > > > 1. fb release
> > > > -> check if crtc->fb is same as fb0 but current crtc is pointing to
> fb1
> > > > 2. so free fb0 without disabling current crtc.
> > > > -> gem0 refcount = 0 so released. At this time, dma access invalid
> memory
> > > > region unfortunately* *if the dma is accessing gem0.
> > > >
> > > >
> > > >
> > > > With my patch,
> > > > ...
> > > > setcrtc(fb0, crtc0)
> > > > -> crtc0.fb = fb0, fb0.crtc = crtc0
> > > > pageflip(crtc0, fb1);
> > > > -> crtc0.fb = fb1, fb1.crtc = crtc0.
> > > > and pageflip is repeated
> > > >
> > > > close(gem0)
> > > > -> gem0 refcount = 2
> > > > close(gem1)
> > > > ->gem1 refcount = 2
> > > > munmap(gem0)
> > > > ->gem0 refcount = 1
> > > > munmap(gem1)
> > > > ->gem1 refcount = 1
> > > > close(drm)
> > > > 1. fb release
> > > > -> fb0->crtc is same as crtc so disable crtc (dma stop)
> > >
> > > No, that's wrong. The current fb is fb1, so destroying fb0 must not
> > > disable the crtc.
> > >
> > >
> > Do you think this is wrong that when fb0 is destroyed, the crtc also is
> > disabled and crtc is pointing to fb1?
>
> Yes. You must disable the crtc if and only if you destroy the current
> fb. Current fb means the fb specified by the latest setcrtc or pageflip
> ioctl. Destroying any other fb must not affect the crtc. And this is
> exactly what the current code does.
>
>
Right, I realized that there was my missing point though your comment. I
thought that current fb would be set to crtc->fb only if setmode request.
But page flip request also did it(by specific driver's page flip
function). Now Exynos driver has one problem. that is to set current fb to
crtc->fb after drm_vblank_get call . This could induce invalid memory
access like below,

crtc0's current fb is fb0

page flip request(crtc0, fb1)
1. drm_vblank_get call
2. vsync occurs and the dma access memory region to fb0 yet.
3. crtc0->fb = fb1
3. drm is released
4. crtc's current fb is fb1 but the dma is accessing memory region to fb0
yet because vsync doesn't occur so fb0 doesn't disable crtc and releses its
own gem buffer. This would induce the problem and definitely is a bug.

For this, I will fix it as soon as possible after testing.



> > And in case of Exynos driver,
> > disabling the crtc would disable hardware overlay so other overlays would
> > be holded on as is. This is just Exynos case so I don't know how other
> SoCs
> > are operated. Could you explain how Desktop side is operated? Maybe there
> > is my missing point here.
>
> I'm not sure what you're asking.
>
>

> Are you asking how other drivers know know when it's safe to free the
> memory? Well, the usual method is to track the page flips via some
> suitable interrupts. A reference is kept to the memory until it's safe
> to get rid of it.
>
> Or are you asking what should happen when a crtc is disabled? Then my
> answer is that when a crtc is disabled, there must be no output signal
> whatsoever.
>
> It's an unfortunate design bug that we even have the fb->crtc link.
> Ideally it would not exist, and all scanout duties would be handled
> by planes. Then destroying fbs would never disable the crtc completely,
> just the planes scanning from those fbs.
>
>
Understood. And I'd like to say thank you again. Your comment is very
helpful to me.

Thanks,
Inki Dae


> --
> Ville Syrjälä
> Intel OTC
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20121112/97b898d4/attachment-0001.html>


More information about the dri-devel mailing list