[Intel-gfx] [PATCH] drm/i915: Fix frontbuffer false positve.

Matt Roper matthew.d.roper at intel.com
Tue Feb 24 10:00:39 PST 2015


On Tue, Feb 24, 2015 at 09:32:25AM -0800, Rodrigo Vivi wrote:
> On Mon, Feb 23, 2015 at 6:13 PM, Matt Roper <matthew.d.roper at intel.com> wrote:
> > On Mon, Feb 23, 2015 at 05:52:24PM -0800, Rodrigo Vivi wrote:
> >> Hi Daniel,
> >>
> >> It seems that one check with one good commit followed by many zeroed
> >> intel_crtc->atomic commits is again in place.
> >
> > Can you elaborate on what you mean by this?  With atomic it's possible
> > to have a check with no commit after it (if the check or prepare fail,
> > or if it's a 'test only' operation), but if you're seeing commits
> > without corresponding checks, that sounds like a bug.
> 
> Ah so yes, this is the bug, when moving a cursor we have many commits
> without checks. So it is only one check followed by many commits... so
> it commits with intel_crtc->atomic all zeroed by a previous
> finish_commit.
> 
> >
> > Can you provide a dmesg with drm.debug turned on so we can see what's
> > going on?  Or add some dump_stack()'s so that we can see the backtrace
> > that led us to this situation.
> 
> I lost those logs, but if you put one print in check and one in commit
> you also should be able to see that since I heard about this false
> positive from many people.

Hmm.  Are all of these people using a specific platform?  I only have
access to IVB hardware at the moment, so it could be that the problem
lies on codepaths I just don't exercise on my system.  Most of the
atomic stuff is pretty platform-agnostic, but maybe there's something I
overlooked.

> >
> > Actually, I wonder if the problem is actually the opposite of what you
> > say above.  Now that I look at this again, we only zero out
> > intel_crtc->atomic in intel_finish_crtc_commit which is part of the
> > commit path.
> 
> From what I heard and what I saw on the logs I thought it was ok to
> have 1 check than as many commits as possible without the check.
> If that was the case we would have a fail that is to zero the
> structure on finish_commit. But seems this isn't the case. Sorry.

I guess I should clarify terminology a little bit since this gets kind
of confusing...

For atomic, we have a top-level atomic_state structure that contains
everything that we want to update as part of an atomic transaction
(multiple planes, crtc's, etc.).  At the moment, we only support a
subset of atomic, so you'll only ever get one crtc being modified, but
there could be multiple planes (primary, cursor, 1 or more sprite
planes).  There's a sub-state structure (plane_state, crtc_state, etc.)
for each DRM object being updated.

At the top level of atomic handling we do "check" and then "commit" with
the top-level atomic state.  If you trace down through the codeflow, the
top-level check here will actually call a plane-specific check function
on each plane state during the top-level check, and then the top-level
commit function will call the plane-specific commit function on each
plane involved.

So when I say "1 or more checks, followed by exactly one commit" I'm
referring to the top-level check that deals with the top-level atomic
state.  If you're looking at the plane state specifically, you might see
more of an n:n relationship if your transaction is updating multiple
planes together (one check call per plane being updated, possibly
followed by one commit call for each of them), but the clearing of
intel_crtc->atomic should still happen at the end, after all of the
individual planes have been committed.

> > So if you had a check that never got a corresponding
> > commit, there might still be garbage left over in there.
> 
> No it is the opposite. Commits with no check. causing commit of
> intel_crtc->atomic zeroed.
> 
> > Ultimately we
> > should be handling all of this with real intel_crtc_state handling which
> > we don't quite have yet.  Maybe in the meantime we need to be clearing
> > out intel_crtc->atomic at the beginning of a top-level atomic
> > transaction?  I'll send a patch that makes this change for you to try
> > shortly.
> 
> Thanks! I thought about this but got afraid that clearing this on top
> of check could cause a race since one plane doing a check could clean
> the atomic being commited on cursor movement, unless we hat that for
> planes, but then we would have to iterate over all planes on
> finish_commit.
> 
> We can also put this patch that fix the only known issue so far with a
> FIXME comment while we don't have the final fix.

Well, based on what you say above, it sounds like it probably won't
solve your issue, so we'll need to explore some more.  I still haven't
managed to reproduce this myself, so I'm not sure whether I'm working on
a hardware platform that won't trigger the bug, or whether I'm just
unlucky.

Is there a specific igt test you've been using that triggers this
reliably for you?


Matt

> 
> >
> >
> > Matt
> >
> >>
> >> So I'm getting that annoying false positives on latest -nightly.
> >>
> >> Shouldn't we just merge this patch while atomic modeset design doesn't
> >> get fixed at all?
> >>
> >> Unfortunately nothing comes to my mind than moving all
> >> intel_crtc->atomic set to commit time and let pre_commit just with
> >> pm_get...
> >> Ohter than that just a full redesign of atomic modeset.
> >>
> >> Thanks,
> >> Rodrigo.
> >>
> >>
> >> On Fri, Feb 13, 2015 at 12:48 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> >> > On Thu, Feb 12, 2015 at 05:17:04PM -0800, Rodrigo Vivi wrote:
> >> >> No, we had solved old frontbuffer false positives... some missing
> >> >> flush somewhere at that time...
> >> >>
> >> >> So, I added a bunch of printk and I insist that it is conceptually
> >> >> wrong to set intel_crtc_atomic_commit on check times when you do
> >> >> memset(&intel_crtc->atomic, 0, sizeof(intel_crtc->atomic));
> >> >> on every finish_commit.
> >> >>
> >> >> With exception of atomic.disabled_planes I believe the rest shouldn't
> >> >> work in the way it is implemented because you can have one check
> >> >> followed by many commits, but after the first commit all atomic
> >> >> variables are zeroed, except the disabled_planes that is set outside
> >> >> check...
> >> >
> >> > Ok here's the trouble: Every commit should have at exactly one check for
> >> > the new state objects. Unfortunately in the transition that seems to have
> >> > been lost for some cases.
> >> >
> >> >> For instance: on every cursor movement atomic.fb_bits was 0x000 when
> >> >> it should be 0x002. This is why this patch solved the false positive,
> >> >> i.e. setting it on commit instead on check time we get it propperly
> >> >> set. One of the problems is the false positive but also it breaks
> >> >> entirely SW tracking on VLV/CHV....
> >> >>
> >> >> I believe wait_for flips, update_fbc, watermarks, etc should keep the
> >> >> value got on check for the commit or the check should be done at
> >> >> commit plane instead of on check.
> >> >>
> >> >> I started doing a patch here to move all atomic sets from check to
> >> >> commit functions but gave up on middle when I noticed the
> >> >> prepare_commit would almost get empty...
> >> >
> >> > All state precomputation must be done in check, at commit time you have a
> >> > lot less information since the old state is somewhat gone. You can still
> >> > get at it, but as soon as we add an async flip queue that will get really
> >> > ugly. The current placement is imo the correct one. Instead we need to
> >> > figure out where we're doing a ->commit without properly calling ->check
> >> > beforehand.
> >> >
> >> >> Another idea was to make a atomic set per plane and just memset(0) on
> >> >> begin of every check... But this would require reliable access to the
> >> >> plane being updated on finish_commit... I believe loop on all planes
> >> >> would be messy and cause other issues...
> >> >>
> >> >> So, I'll be out returning only next wed. Please let me know if you
> >> >> have any suggestion of best changes to do that I can implement the
> >> >> changes.
> >> >
> >> > Since you've done this testing I've landed Matt's patches to switch legacy
> >> > plane entry points over to atomic. Which means cursor updates should now
> >> > be done properly using atomic, always. But even then the old transitional
> >> > plane helpers should have called the check functions ... So not sure where
> >> > exactly we're loosing that check call.
> >> >
> >> > Matt Roper might have more insights.
> >> >
> >> > Thanks, Daniel
> >> > --
> >> > Daniel Vetter
> >> > Software Engineer, Intel Corporation
> >> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >>
> >>
> >>
> >> --
> >> Rodrigo Vivi
> >> Blog: http://blog.vivi.eng.br
> >
> > --
> > Matt Roper
> > Graphics Software Engineer
> > IoTG Platform Enabling & Development
> > Intel Corporation
> > (916) 356-2795
> 
> 
> 
> -- 
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795


More information about the Intel-gfx mailing list