[Intel-gfx] [PATCH 0/8] Detect and deal with Interrupt 'Storms' from noisy Hotplug Lines.

Egbert Eich eich at freedesktop.org
Tue Jan 22 14:22:34 CET 2013


Hi Daniel,

I've played around a bit now, and implemented your suggestions:

On Thu, Jan 17, 2013 at 03:45:26PM +0100, Daniel Vetter wrote:
> On Thu, Jan 17, 2013 at 03:01:06PM +0100, Egbert Eich wrote:
> > Hi Daniel,
> > 
> > On Fri, Jan 11, 2013 at 09:34:08PM +0100, Daniel Vetter wrote:
> > > 
> > > Nice work, and we know that we need this since quite a while. But
> > > unfortunately we've not yet come around to implement something. Some
> > > high-level comments on how I think this should best be handled:
> > > 
> > > - imo dev_priv->hotplug_supported_mask should die - it leaks platform
> > >   specific irq magic from i915_irq.c into every connector/encoder. And we
> > >   have had the bugs and confusions to prove that it's not a good idea. I
> > >   think it'd be better if we add a new HOTPLUG_PIN_FOO enum that encoders
> > >   register interest in, and the platform code in i915_irq.c then maps
> > >   from/to that. On a quick check we have hotplug pins for CRT, TV,
> > >   SDVO_B&C and PORT_A-D (for DP&HDMI).
> > 
> > I thought along the same lines, I just didn't want to go quite as far.
> > Therefore I added functions in i915_irq.c to set these depending on the
> > connector.
> > 
> > > 
> > >   Also note that on PCH_SPLIT platforms port A is not in the same
> > >   register, further platforms will make an even cuter mess of this ...
> > 
> > Ok, I will look into that.
> > 

So far I have not found any place where hotplug is supported for PORT_A.
PORT_A is marked as an internal connector without any HPD.

> > > 
> > > - I think the hpd pin should be tracked in the encoder, not in the
> > >   connector. The only encoders where there's not a 1:1 relationship (sdvo
> > >   and ddi on hsw) want it there. Also, we already have the ->hot_plug
> > >   callback in the encoder, which will be useful for later extensions.

For SDVO it is definitely much simpler to track this in the encoder.
However, in the IRQ handling code we always need to take the detour through
the connector, as the drm code expects any hotplug-related information in
the connector.
To get the connector we always have to walk through the connector list and
obtain the associated encoder. Walking through the encoder list isn't
sufficient, as there is no easy way to get from an encoder to its connector.
I don't have a strong preference either way - in the code I'm currently
playing around with I keep this information in struct intel_encoder.
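
To illustrate the walk described above, here is a minimal user-space sketch
(not the actual patch). The enum values, struct names and flag_connectors()
are hypothetical stand-ins for the real i915 types; only the idea of keeping
the hpd pin in the encoder and reaching it via the connector list is taken
from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pin enum, loosely following the pins listed in the review. */
enum hpd_pin {
	HPD_NONE = 0,
	HPD_CRT,
	HPD_TV,
	HPD_SDVO_B,
	HPD_SDVO_C,
	HPD_PORT_B,
	HPD_PORT_C,
	HPD_PORT_D,
	HPD_NUM_PINS
};

/* Simplified stand-in for struct intel_encoder: it owns the hpd pin. */
struct sketch_encoder {
	enum hpd_pin hpd_pin;
};

/* Simplified stand-in for a connector on a singly linked list. */
struct sketch_connector {
	struct sketch_encoder *encoder;	/* connector -> encoder is easy */
	int hpd_event;			/* set when the encoder's pin fired */
	struct sketch_connector *next;
};

/*
 * Walk the connector list and flag every connector whose encoder sits on
 * one of the pins set in pin_mask. Returns the number of connectors flagged.
 */
static int flag_connectors(struct sketch_connector *head, unsigned int pin_mask)
{
	int flagged = 0;

	for (struct sketch_connector *c = head; c != NULL; c = c->next) {
		assert(c->encoder != NULL);
		if (c->encoder->hpd_pin == HPD_NONE)
			continue;	/* encoder has no hotplug line */
		if (pin_mask & (1u << c->encoder->hpd_pin)) {
			c->hpd_event = 1;
			flagged++;
		}
	}
	return flagged;
}
```

Two connectors hanging off the same SDVO encoder both get flagged by a single
pin bit, which is the case where tracking the pin in the encoder pays off.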

> > > 
> > > - Since some encoders share the same hpd pin (HDMI&DP on pre-hsw) I think
> > >   we should keep the noise statistic data in the device's dev_priv
> > >   somewhere in an array, with one set for each hpd pin from the enum above.
> > 
> > This would also be an option. I did notice that these pins are shared, but it
> > didn't cause any issues as both connectors always got flagged simultaneously.
> > On the other hand calling the same disable/enable twice when traversing the
> > connector list is sorta ugly.
> 
> Yeah, I mostly want to have a clear 1:1 relationship between interrupt
> lines and the statistics about the noise on them ...

Yup. That's easy to do. In fact in my original proof-of-concept implementation
I had it that way. However, I was not sure whether adding such bookkeeping
in such a global fashion would be appreciated.
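
A minimal sketch of what per-pin bookkeeping could look like, assuming one
statistics slot per hpd pin in a device-wide array. The struct names, the pin
count, the window length and the threshold are all made-up illustration
values, not the ones from the patch series.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_HPD_PINS     8	/* hypothetical number of hpd pins */
#define STORM_WINDOW_MS  1000	/* hypothetical detection window */
#define STORM_THRESHOLD  5	/* hypothetical max irqs per window */

/* Noise statistics for a single interrupt line. */
struct hpd_stats {
	unsigned long window_start_ms;
	unsigned int count;
	bool disabled;
};

/* Stand-in for the device private data: one stats slot per pin. */
struct sketch_dev_priv {
	struct hpd_stats hpd[NUM_HPD_PINS];
};

/*
 * Record one interrupt on 'pin' at time 'now_ms'. Returns true if this
 * interrupt pushes the line over the threshold, in which case the line is
 * marked disabled and the caller would switch it to polling.
 */
static bool hpd_irq_record(struct sketch_dev_priv *priv, unsigned int pin,
			   unsigned long now_ms)
{
	struct hpd_stats *s;

	assert(pin < NUM_HPD_PINS);
	s = &priv->hpd[pin];

	if (s->disabled)
		return false;	/* already handled as a storm */

	/* Start a new window on the first irq or once the old one expired. */
	if (s->count == 0 || now_ms - s->window_start_ms > STORM_WINDOW_MS) {
		s->window_start_ms = now_ms;
		s->count = 1;
		return false;
	}

	if (++s->count > STORM_THRESHOLD) {
		s->disabled = true;
		return true;
	}
	return false;
}
```

Because the state is indexed by pin rather than by connector, two connectors
sharing a line cannot trigger the disable path twice.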

> 
> > > - In 3.8 the drm hpd/polling helpers are much improved and don't randomly
> > >   poll everything any more. So if a hpd connector isn't marked as
> > >   OUTPUT_POLL, it won't ever get polled. Which means if you disable the hpd
> > >   irq for it, we need to have our own poll work to do that for us. The
> > >   long-term goal I have is to pimp the encoder->hot_plug callback also for
> > >   this case, to avoid re-running the connector detect code on unrelated
> > >   outputs (which can sometimes cause havoc).
> > 
> > I do change the state of the 'polled' member when I disable/reenable hotplug
> > interrupts already. This part therefore should work fine already.
> 
> Hm, I've missed that, despite looking for it in the patches. One thing to
> note is that the poll work will disable itself if there's no connector
> with one of the POLL flags set in 3.8, so I think you need to kick it
> again when polling. Another thing to keep in mind is that we have encoders

Exactly. This is something I had missed. It's however easily fixed by
calling drm_kms_helper_poll_enable() when changing the settings.

> with POLL and HPD connectors (sometimes on the same one) - SDVO is the
> prime example since polling seems to work, but not too reliably. Hence we
> need the polling as a backup. To correctly restore those flags I guess we
> need a saved_polled variable in intel_connector which we need to restore
> when enabling the hpd line again.

I don't see this in the code (drm-intel-testing pulled last Friday).
On any connector either DRM_CONNECTOR_POLL_HPD or
DRM_CONNECTOR_POLL_CONNECT (mostly together with DRM_CONNECTOR_POLL_DISCONNECT)
is set, but not both.
Of course it could be done like you suggest, i.e. continue polling despite
waiting for interrupts, but this raises the question of whether we should not
resort to polling entirely: the only benefit of doing HPD would be that we
would get informed about an output change more quickly.
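
For reference, the saved_polled idea could be sketched roughly as below. This
is a stand-alone illustration: the POLL_* macros are local stand-ins for the
DRM_CONNECTOR_POLL_* flags, and struct poll_connector with its two helpers is
hypothetical. In the real driver the disable path would additionally have to
kick the poll work with drm_kms_helper_poll_enable(), as noted above.

```c
#include <assert.h>

/* Local stand-ins for the DRM_CONNECTOR_POLL_* flags. */
#define POLL_HPD        (1 << 0)
#define POLL_CONNECT    (1 << 1)
#define POLL_DISCONNECT (1 << 2)

/* Hypothetical connector state relevant to polling. */
struct poll_connector {
	unsigned char polled;		/* flags currently in effect */
	unsigned char saved_polled;	/* flags to restore after a storm */
	int hpd_disabled;
};

/* Storm detected: remember the original flags and fall back to polling. */
static void hpd_storm_disable(struct poll_connector *c)
{
	assert(c);
	if (c->hpd_disabled)
		return;		/* keep the flags saved on the first call */
	c->saved_polled = c->polled;
	c->polled = POLL_CONNECT | POLL_DISCONNECT;
	c->hpd_disabled = 1;
	/* Real driver: restart the poll work here. */
}

/* Line considered quiet again: restore whatever was configured before. */
static void hpd_storm_reenable(struct poll_connector *c)
{
	assert(c);
	if (!c->hpd_disabled)
		return;
	c->polled = c->saved_polled;
	c->hpd_disabled = 0;
}
```

Saving the flags makes the restore correct even for a connector that had both
an HPD and a polling flag set before the storm.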

> 
> > >   Eventually I want a hpd interrupt to only run the ->hot_plug callbacks
> > >   on encoders which are interested in that signal, hence this slight
> > >   overkill ... Ofc, that requires that we move a lot of the ->detect logic
> > 
> > This was exactly my question: we have all information at hand now to do this
> > and I can easily add this. The downside is that if the information about
> > the mapping is not accurate (i.e. if a vendor routes HPD lines differently)
> > this connector will never light up :(
> > As it is now, since we poll everything when an interrupt happens, we can
> > be sure that we catch all connectors even if the mapping in our tables
> > doesn't reflect what's wired on the board.
> > I didn't just go ahead and implement this yet as I've gotten too pessimistic.
> 
> I share your pessimism, and we certainly need tons of special cases to
> make this work. E.g. the sdvo case, but also DP->VGA dongles where the
> forwarded hotplug events are as unreliable as plain VGA. And that exercise
> in only calling the right hpd handlers is only really useful if we cache
> the EDID, since userspace will do a full scan after it receives the
> hotplug event anyway. My idea is that the ->hotplug callback will then
> only invalidate the edid and we still do a full scan through all
> connector->detect callbacks. But for those outputs with reliable hpd we
> won't touch the hw (and so also optimize away the delays when userspace
> does the same afterwards). Once a hpd storm is detected and we switch to
> polling, we'd need to mark that output as unreliable to disable all edid
> caching.
> 
> We could even try to cache the edid for unreliable outputs like VGA for a
> short time ...
> 
> > >   into ->hot_plug, but that's the only way to do sane EDID cache and
> > >   similar things on outputs where hpd should work (DP/HDMI).
> > 
> > ... But since you suggest this I will gladly add this :)
> 
> Imo better in a follow-up series, since there's quite some prep work
> involved. And I also think that it makes more sense to implement EDID
> caching first (which in turn requires some code to detect hpd irq storms
> ...).

I sent a patch for EDID caching on the DRM level to Dave last December.
I received some comments and suggestions from Ville from Intel, which I
worked in - however I have not seen any reaction from Dave yet.

Maybe you want to take a look at this. It cannot cache in all situations
where caching would be useful and possible, because it currently lacks any
driver interface and thus can only do as much as is possible without deeper
knowledge of what's going on at the hardware level. It should still do a
fairly good job of caching EDID extension blocks.

On the other hand I believe I can add selective probing quite easily.
> 
> > > - The math buff in me would like hpd storms to gracefully degrade into
> > >   polling at 10s or so. We could achieve that with irq source masking and
> > >   scheduling the work item to do the hotplug handling with an (increasing)
> > >   delay if there are too many interrupts from a given hpd pin. But that
> > >   requires that we can mask hotplug interrupts properly, which seems to be
> > >   impossible with the PORT_HOTPLUG regs on gmch/SoC platforms :( So I
> > >   think your logic is nice enough ;-)
> > 
> > What you suggest would be possible with some small changes to my code I
> > guess. I just fear if we do have an IRQ storm 10s would be too short - on
> > a completely idle system this might be the prime source of wakeups.
> 
> I've misread your code and didn't realize that you rely on the output poll
> work for disabled hpd lines. I think the 2m delay in trying to re-enable
> outputs is more than fair enough. If we start to do fancy things with the
> DP/HDMI short pulses (i.e. reconfiguring downstream DP ports) we might
> need to reconsider the tuning values a bit. But the current values look
> sane to me.

Ok. It was an ad hoc choice, trying to maintain a balance between systems
which exhibit this behavior only under certain circumstances and others which
have it more or less permanently.
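
As a rough sketch of the re-enable side, assuming the 2 minute delay
mentioned above and a per-line timestamp taken when the line was disabled
(struct hpd_line and hpd_reenable_expired() are hypothetical names, not the
ones in the patches):

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_HPD_PINS 8	/* hypothetical number of hpd lines */
#define HPD_REENABLE_DELAY_MS (2UL * 60UL * 1000UL)	/* 2 minutes */

/* Hypothetical per-line state kept while a line is switched to polling. */
struct hpd_line {
	bool disabled;
	unsigned long disabled_at_ms;
};

/*
 * Re-enable every disabled line whose delay has expired at 'now_ms'.
 * Returns a bitmask of the lines that were re-enabled so the caller can
 * unmask the corresponding interrupts.
 */
static unsigned int hpd_reenable_expired(struct hpd_line *lines,
					 unsigned int n, unsigned long now_ms)
{
	unsigned int mask = 0;

	assert(n <= NUM_HPD_PINS);
	for (unsigned int i = 0; i < n; i++) {
		if (!lines[i].disabled)
			continue;
		if (now_ms - lines[i].disabled_at_ms < HPD_REENABLE_DELAY_MS)
			continue;	/* still inside the quiet period */
		lines[i].disabled = false;
		mask |= 1u << i;
	}
	return mask;
}
```

A line that storms permanently simply trips the detection again after each
re-enable, while one that was noisy only briefly returns to interrupt-driven
operation.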

> 
> > I believe I can spare some hours to think about and work in your 
> > suggestions.
> 
> Awesome!

:)

Cheers,
	Egbert.


