[Intel-gfx] [PATCH] drm/i915: Flush outstanding unpin tasks before pageflipping
Jesse Barnes
jbarnes at virtuousgeek.org
Thu Nov 1 17:58:51 CET 2012
On Thu, 01 Nov 2012 16:52:05 +0000
Tvrtko Ursulin <tvrtko.ursulin at onelan.co.uk> wrote:
> On Thursday 01 November 2012 16:20:03 Chris Wilson wrote:
> > On Thu, 1 Nov 2012 09:04:02 -0700, Jesse Barnes <jbarnes at virtuousgeek.org> wrote:
> > > On Thu, 01 Nov 2012 15:52:23 +0000
> > > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > > Actually I've justified the blocking here to myself, and prefer it to
> > > > simply running the crtc->unpin_work. If userspace is swamping the system
> > > > so badly that we can't run the kthreads quickly enough, it deserves a stall.
> > > > Note that the unpin leak is still about the 3rd most common bug in
> > > > Fedora, so this stall will be forced on many machines.
> > >
> > > Hm funky, why does Fedora hit it so much? Does some of the GNOME shell
> > > stuff run unthrottled or something?
> >
> > I don't think so. I trust that in Tvrtko's use case, he is not so much
> > hogging the GPU as keeping the system as a whole relatively busy. So I
> > suspect it is more to do with CPU starvation of the kthreads than
> > anything else.
> >
> > Tvrtko, do you have any feeling for why your machine was easily
> > susceptible to this leak? Are the stalls noticeable and do they affect
> > your performance targets?
>
> We didn't bother looking for any stalls, but for a long time we were
> occasionally hitting this pin_count BUG in i915_gem_object_pin. So it didn't in
> fact affect our performance targets as much as it completely wrecked our system.
>
> If this patch causes an occasional stall instead, given that this bug triggers
> every 3-4 hours of uptime, we are fine with that. If a frame or so is missed
> every couple hours on low end hardware we don't care that much.
>
> More on the actual workload...
>
> Only recently we got lucky and found a platform and workload where it happens
> reliably. And this patch reliably fixes that.
>
> In this workload CPU is being loaded 50-60% decoding a movie and rendering it
> to a full screen window. Our proprietary compositor page flips at 60Hz only,
> not faster. Together with another small semi-transparent window being rendered
> on top of the full screen movie. Movie played is a 25fps one, which means the
> full screen window is damaged 25 out of 60 frames (give or take) which is when
> we render to our back buffer and page flip at the vsync rate (60Hz).
>
> According to the intel_gpu_top tool, GPU load is roughly at 40%, apart from the
> "Framebuffer Compression" metric which is maxed out, if that one is at all
> valid.
>
> This particular scenario triggers the bug only on two of our Atom based
> platforms, both with an NM10/Pineview G/i915 chipset.

Ah ok, on Atom you're probably CPU constrained a bit, but still at
50-60% utilization the kthreads should be running at least sometimes...
But it sounds like a case of the kthreads not running instead of
queueing too fast anyway (not that the latter is really possible
without some hacking to the flip code).
--
Jesse Barnes, Intel Open Source Technology Center