[Intel-gfx] i915 irq storm mitigation in 3.10

Egbert Eich eich at suse.com
Mon Jul 22 10:04:09 CEST 2013


Daniel Vetter writes:
 > On Sun, Jul 21, 2013 at 10:23 PM, Jan Niggemann <jn at hz6.de> wrote:
 > >> But every time this happens we only let through a few interrupts, so this
 > >> shouldn't affect you badly. Can you please check whether those slowdowns
 > >> line up with 2 minute intervals?
 > >
 > > I observed these slowdowns for a couple of weeks now. On my machine, they
 > > only happen once, some minutes after a cold boot.
 > > They last for a minute or two, and then they are gone.
 > > I'd have guessed that the storm detection kicks in pretty quickly after a
 > > storm is detected and that it would go unnoticed.
 > 
 > Hm, that sounds like something doesn't quite work as expected. We
 > should kill things once we get 5 interrupts or so in 1 second. So if
 > it's bad enough that it slows your machine down it really should only
 > be barely noticeable.
 > 

The logs show that the disable mechanism got triggered, so a storm
was detected.
The respective message is generated by the worker, so everything up to
that point (detection and marking it disabled) seems to be fine.
I bet we are still getting interrupts but the respective bit in
hpd_event_bits doesn't get set any more. Since we unconditionally
queue the worker on every interrupt, it is no surprise it is so busy.
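
To illustrate what I suspect is going on, here is a rough stand-alone model
of that path. It is not the driver code: the struct, the function names and
the two constants are made up for this sketch, with the threshold and window
taken from Daniel's "5 interrupts or so in 1 second". It assumes the
interrupt keeps firing after the pin has been marked disabled, and it only
counts how often the event bit is set versus how often the worker is queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for this sketch only. */
#define STORM_PERIOD_MS 1000
#define STORM_THRESHOLD 5

enum pin_state { PIN_ENABLED, PIN_MARK_DISABLED };

/* Hypothetical stand-in for the per-pin hotplug bookkeeping. */
struct sim {
	enum pin_state state;
	unsigned long window_start_ms; /* start of the current detection window */
	int count;                     /* interrupts seen in this window */
	int event_bit_sets;            /* times the event bit would be set */
	int worker_queued;             /* times the worker would be queued */
};

/* Model of one interrupt arriving at time now_ms.
 * Returns true if this interrupt tripped the storm detection. */
bool handle_irq(struct sim *s, unsigned long now_ms)
{
	bool storm = false;

	assert(s != NULL);

	if (s->state == PIN_ENABLED) {
		/* Start a new window once the old one has expired. */
		if (now_ms - s->window_start_ms >= STORM_PERIOD_MS) {
			s->window_start_ms = now_ms;
			s->count = 0;
		}
		s->count++;
		if (s->count > STORM_THRESHOLD) {
			/* Storm: mark the pin, do not report an event. */
			s->state = PIN_MARK_DISABLED;
			storm = true;
		} else {
			s->event_bit_sets++;
		}
	}

	/* The worker is queued on every interrupt, whether or not
	 * an event bit was set above. */
	s->worker_queued++;
	return storm;
}

/* Feed n_irqs interrupts, spacing_ms apart, into a fresh model. */
struct sim run_storm(int n_irqs, unsigned long spacing_ms)
{
	struct sim s = { .state = PIN_ENABLED };

	for (int i = 0; i < n_irqs; i++)
		handle_irq(&s, (unsigned long)i * spacing_ms);
	return s;
}
```

In this model the event bit stops being set once the pin is marked, but the
worker count keeps growing with every interrupt that still arrives. That
would match a busy worker with nothing to do, provided the interrupt is
really not masked at the source.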

Then this points to the call to hpd_irq_setup() in intel_hpd_irq_handler()
not doing what is expected, i.e. masking out the stormy interrupt.
Could it be that we can't mask/disable an interrupt before ACKing
it?

@Jan, could you also specify what hardware you are using (i.e. give us
the output of lspci -n)?


Cheers,
	Egbert.
