[Intel-gfx] [PATCH v2] drm/i915: Cancel the hotplug work when unregistering the connector

Manasi Navare manasi.d.navare at intel.com
Wed Oct 25 19:23:34 UTC 2017


On Wed, Oct 25, 2017 at 05:52:35PM +0100, Chris Wilson wrote:
> Quoting Maarten Lankhorst (2017-10-09 11:33:21)
> > On 09-10-17 at 11:59, Chris Wilson wrote:
> > > Quoting Maarten Lankhorst (2017-10-09 10:41:29)
> > >> On 06-10-17 at 19:18, Chris Wilson wrote:
> > >>> When we unregister the connector, we may have a pending hotplug work.
> > >>> This needs to be cancel early during the teardown so that it does not
> > >>> fire after we have freed the connector. Or else we may see something like:
> > >> Well the nice thing is even if it's called modeset_retry_work, it just sets the link status to bad for DP.
> > > Ok, and sends a hotplug event. At what point in the shutdown sequence
> > > does that drm_kms_helper_hotplug_event() become invalid?
> > Some digging makes me suspect at the very end of driver unload. So for this either is fine. But after some more digging, it's not enough, see below.
> > >> I worry it might be too early, wouldn't intel_dp_connector_destroy be a better place? At that point we know userspace can no longer use it,
> > >> because the last reference has been removed.
> > > connector_destroy is after drm_kms_helper_poll_fini(), so that seems
> > > suspect given a query about drm_kms_helper_hotplug_event()
> > >
> > > A hook from drm_atomic_helper_shutdown? Extending unregister to have a
> > > late phase?
> > >
> > Well after some more digging the only case where early_unregister vs unregister matters is DP-MST, where we can lose connectors dynamically.
> > As long as we never use modeset_retry_work on DP-MST connectors, everything is fine. Fortunately we don't do that, only used in DP connectors for now.
> >

This work function would actually also get called for MST connectors, since it gets scheduled any time intel_dp_start_link_train() fails,
and that can happen even for DP MST connectors.

This work function gets scheduled after the existing modeset has completed. So if the driver is unloaded before that, then
why don't we cancel this work first thing in intel_modeset_cleanup()?
Now the question is: what if it was already scheduled and has sent the hotplug event? How can we make sure the hotplug event is killed?
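
To make the ordering concern concrete, here is a minimal userspace sketch, not i915 code. Every name in it (work_model, connector_model, late_events_after_teardown() and the helpers) is hypothetical and only models a pending work item that may fire after its owner has been torn down. In the driver itself the cancel step would presumably be a cancel_work_sync() on modeset_retry_work; where exactly that call belongs is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a deferred work item; not kernel API. */
struct work_model {
	bool pending;
	void (*fn)(struct work_model *w);
};

/* Hypothetical connector: just enough state to observe a late callback. */
struct connector_model {
	struct work_model retry_work;
	bool registered;
	int hotplug_events; /* sent while the connector was still registered */
	int late_events;    /* fired after teardown: the bug under discussion */
};

/* Recover the owning connector from its embedded work item. */
static struct connector_model *to_connector(struct work_model *w)
{
	return (struct connector_model *)((char *)w -
			offsetof(struct connector_model, retry_work));
}

/* Models the retry work: it reports an event and records whether it ran late. */
static void retry_work_fn(struct work_model *w)
{
	struct connector_model *c = to_connector(w);

	if (c->registered)
		c->hotplug_events++;
	else
		c->late_events++;
}

static void schedule_model(struct work_model *w)
{
	w->pending = true;
}

/* Returns true if a pending work item was cancelled. */
static bool cancel_model(struct work_model *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models the workqueue eventually running whatever is still pending. */
static void run_pending_model(struct work_model *w)
{
	if (w->pending) {
		w->pending = false;
		w->fn(w);
	}
}

/* Schedule the work, tear down, then let the queue run; count late callbacks. */
int late_events_after_teardown(bool cancel_first)
{
	struct connector_model c = {
		.retry_work = { .pending = false, .fn = retry_work_fn },
		.registered = true,
	};

	schedule_model(&c.retry_work);    /* e.g. link training failed */
	if (cancel_first)
		cancel_model(&c.retry_work);  /* cancel before tearing down */
	c.registered = false;             /* teardown */
	run_pending_model(&c.retry_work); /* queue runs after teardown */
	return c.late_events;
}
```

With cancel_first set, the work never runs after teardown; without it, the callback fires once against a connector that is already gone. The model does not cover the second question above, where the work has already run and sent its hotplug event before the cancel.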

 
> > But if you look at the backtrace, the first error is from intel_fbdev_fini, so we need to cancel the work after poll_fini, and before fbdev_fini.
> > 
> > Fixing it in early_unregister is simply too late..
> 
> early_unregister is before fbdev_fini. You may mean that it's too early?
> 
> Anyway, shall we just revert

I don't think reverting this patch is going to help, since that would start
introducing random black screens as a result of the unaddressed link failures that this
patch attempts to fix.

> 
> commit 9301397a63b3bf1090dffe846c6f1c8efa032236
> Author: Manasi Navare <manasi.d.navare at intel.com>
> Date:   Thu Apr 6 16:44:19 2017 +0300
> 
>     drm/i915: Implement Link Rate fallback on Link training failure
> 
> as the authors seem not to care about the kernel oopses?
> -Chris

I am on it, looking for a proper place to add this cancel work call to fix the kernel oopses.
Thanks for catching this.

Manasi

