[Intel-gfx] X hang with quick VT switches

Chris Wilson chris at chris-wilson.co.uk
Thu Dec 4 03:21:47 PST 2014


On Thu, Dec 04, 2014 at 11:53:05AM +0100, Takashi Iwai wrote:
> At Wed, 3 Dec 2014 18:31:45 +0000,
> Chris Wilson wrote:
> > 
> > On Wed, Dec 03, 2014 at 03:45:35PM +0100, Takashi Iwai wrote:
> > > Hi,
> > > 
> > > while checking the reported bug about VT switch hang on openSUSE 13.2,
> > > I also could reproduce a similar issue as reported: namely, X hangs
> > > when repeatedly switching VT quickly.
> > > 
> > > For example, running the following on KDE results in the stall of X.
> > > 
> > > 	% for i in $(seq 1 100); do chvt 1; chvt 7; done
> > > 
> > > Looking at the sysrq-t output, it stalls in drm_read().  After adding
> > > some debug prints to the event handling code, the log looks like this:
> > > 
> > >  drm_queue_vblank_event event_space=4064
> > >  send_vblank_event event_space=4064
> > >  drm_poll ENTER event_space=4064
> > >  drm_poll mask=0x41 event_space=4064
> > >  drm_poll ENTER event_space=4064
> > >  drm_poll mask=0x41 event_space=4064
> > >  drm_read ENTER event_space=4064
> > >  drm_read total=32 event_space=4096
> > >  drm_poll ENTER event_space=4096
> > >  drm_poll mask=0x0 event_space=4096
> > >  drm_read ENTER event_space=4096
> > >  drm_read ENTER event_space=4096
> > >  drm_read ENTER event_space=4096
> > > 
> > > So, after a vblank event, two poll calls succeeded, followed by one
> > > drm_read().  After that, there was one poll call without an event,
> > > followed by three(!) drm_read() calls.  The last three drm_read()
> > > never exited, thus X stalled.  So, this looks like a race or a
> > > refcount issue somewhere.
> > 
> > The key question is how did you get 3 calls to drm_read that each didn't
> > return? The only place where we call drm_read without first doing a poll
> > is in the WakeupHandler with the drm fd flagged for reads. This is
> > broken in ZaphodHeads as the drm fd is not O_NONBLOCK without
> > 
> > commit bd008e5b2953186fc0c6633a885ade95e7043800
> > Author: Chris Wilson <chris at chris-wilson.co.uk>
> > Date:   Tue Oct 7 14:13:51 2014 +0100
> > 
> >     drm: Implement O_NONBLOCK support on /dev/dri/cardN
> > 
> > I assume that isn't the case as I expect you would have mentioned using
> > ZaphodHeads.
> 
> I took a look back at drm_read() code again, and I found that the
> function doesn't care about O_NONBLOCK at all.  (And there is a memory
> leak, too.)
> 
> So I added the support for O_NONBLOCK, and the problem seems
> resolved.
> 
> Although this is not the right "fix" (the caller side should be fixed), it
> would be good to have anyway.  I'm going to send patches for review
> to the dri-devel ML, as it's not i915 specific.

I disagree. drm has claimed to support O_NONBLOCK since its inception,
but the implementation was buggy. However, I don't think there is a case
in non-ZaphodHeads where we use read() without first select/poll
reporting that there is something to use (and the problem with
ZaphodHeads is that we have two screens that share the same drm fd
without clearing the select read flags... hmm)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
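
The shared-fd scenario discussed above can be sketched in user space.
This is only an illustration under stated assumptions: an ordinary pipe
stands in for the drm fd, the two "screens" are simulated sequentially,
and every function name here is hypothetical (none of it comes from the
X server or the kernel). Both consumers sample readiness before either
reads, so the second one acts on a stale read flag. With O_NONBLOCK
honoured, the second read fails with EAGAIN instead of stalling, which
is the behaviour the thread expects from drm_read().

```python
import os
import select


def make_event_fd(nonblocking):
    """Return (read_end, write_end) of a pipe standing in for the drm fd."""
    r, w = os.pipe()
    os.set_blocking(r, not nonblocking)
    return r, w


def readable(fd):
    """Zero-timeout select(): True if a read() would return data right now."""
    ready, _, _ = select.select([fd], [], [], 0)
    return bool(ready)


def drain_once(fd, size=4096):
    """Read one batch of events; return None if the read would block (EAGAIN)."""
    try:
        return os.read(fd, size)
    except BlockingIOError:
        return None


def simulate_shared_fd():
    """Two consumers share one fd and both trust the same readiness sample."""
    r, w = make_event_fd(nonblocking=True)
    try:
        os.write(w, b"E" * 32)  # one 32-byte event, like "total=32" in the trace

        # Both consumers see the fd flagged for reading before either reads.
        flag_a = readable(r)
        flag_b = readable(r)

        # The first consumer drains the only pending event.
        got_a = drain_once(r) if flag_a else None

        # The second consumer's flag is now stale: nothing is left to read.
        still_readable = readable(r)
        got_b = drain_once(r) if flag_b else None

        return flag_a, flag_b, got_a, still_readable, got_b
    finally:
        os.close(r)
        os.close(w)
```

If the read end were left in blocking mode, the second drain_once()
call would never return, which matches the stuck "drm_read ENTER" lines
in the debug trace. Re-checking readiness immediately before each read
avoids the stale-flag read in this single-threaded sketch.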


