[Intel-gfx] [PATCH] drm/i915: Protect fbdev across slow or failed initialisation

Daniel Vetter daniel at ffwll.ch
Mon Feb 15 16:32:51 UTC 2016


On Fri, Feb 05, 2016 at 03:58:31PM +0100, Lukas Wunner wrote:
> Hi Chris,
> 
> On Fri, Feb 05, 2016 at 11:09:27AM +0000, Chris Wilson wrote:
> > On Fri, Feb 05, 2016 at 01:27:10AM +0100, Lukas Wunner wrote:
> > > On Thu, Feb 04, 2016 at 09:21:17AM +0000, Li, Weinan Z wrote:
> > > > We still need this patch. Seems 54632abe8ca3 ("drm/i915: Fix oops caused by fbdev initialization
> > > > failure") as well as 366e39b4d2c5 ("drm/i915: Tear down fbdev if
> > > > initialization fails") this 2 patches can???t cover this case. It???s access ifbdev->fb before the initialization
> > > > finished, but not initialization failed. If don???t have any other patches or code update relative, it may still have in 4.5.
> > > 
> > > Okay, I see.
> > > 
> > > > 
> > > > add info NULL check should be better, it is also initialized in the async queue
> > > > >       info = ifbdev->helper.fbdev;
> > > > > +     if (info == NULL)
> > > > > +            return false;
> > > > >       if (!info->screen_base)
> > > 
> > > So if there's indeed a race between fbdev initialization and the call to
> > > intel_fbdev_restore_mode() (on lastclose), there's more that can go awry:
> > > - intel_fbdev_restore_mode() dereferences ifbdev->fb->obj
> > > - it calls __drm_atomic_helper_set_config() which WARNs if ifbdev->fb is NULL
> > > - it calls drm_fb_helper_set_par() which dereferences ifbdev->helper.fbdev
> > > 
> > > Instead of checking all that it's probably simpler to call
> > > async_synchronize_full() at the beginning of intel_fbdev_restore_mode(),
> > > as Li Weinan suggested. I'll submit the corresponding one-liner patch
> > > tomorrow if noone else does.
> > > 
> > > Chris' patch also modified intel_fbdev_set_suspend() as well as
> > > intel_fbdev_output_poll_changed(), not sure if these are racy as well.
> > > At least the stack traces posted by Li Weinan and Gustav Fägerlind
> > > only indicate that lastclose is racy.
> > 
> > We called set-suspend internally, but we don't do any checks before
> > hand. I was concerned we may be able to get into a situation where
> > .fb_probe fails and leaves a dangling dev_priv->ifbdev for which we
> > would then trip over the NULL info->screen_base. So I was looking for a
> > suitable guard.
> 
> Yes, looking at this with a fresh pair of eyeballs I think you were
> totally right, i915_pm_ops is declared as part of i915_pci_driver and
> it might indeed happen that i915_pm_suspend() is called before the fbdev
> is fully initialized.
> 
> As for intel_fbdev_output_poll_changed(), there's even a comment in
> i915_load_modeset_init() stating that hotplug events might occur
> during fdbev initial config.
> 
> Below patch adds the requisite async_synchronize_full() to the three
> functions that you also modified and updates that comment.
> 
> However AFAICT a corner case remains where we're still vulnerable to races:
> async_schedule() runs synchronously "if we're out of memory or if there's
> too much work pending already" (see __async_schedule() in kernel/async.c).
> In this case no entry is added to the pending list and
> async_synchronize_full() might theoretically return before the synchronously
> executed function has finished.
> 
> The existing call to async_synchronize_full() in intel_fbdev_fini()
> seems to be susceptible to the same race condition, but it's probably
> very hard to trigger it. (Unload the module before the fbdev is fully
> initialized.)
> 
> To make it bullet proof we could declare a completion in intel_fbdev.c
> and wait for that instead. Opinions?
> 
> Thanks,
> 
> Lukas
> 
> -- >8 --
> 
> Subject: [PATCH] drm/i915: Fix races on fbdev
> 
> The ->lastclose callback invokes intel_fbdev_restore_mode() and has
> been witnessed to run before intel_fbdev_initial_config_async()
> has finished.
> 
> We might likewise receive hotplug events or be suspended before
> we've had a chance to fully set up the fbdev.
> 
> Fix by waiting for the asynchronous thread to finish.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580
> Reported-by: Gustav FÀgerlind <gustav.fagerlind at gmail.com>
> Reported-by: "Li, Weinan Z" <weinan.z.li at intel.com>
> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: stable at vger.kernel.org
> Signed-off-by: Lukas Wunner <lukas at wunner.de>

Can you pls resubmit this patch stand-alone? Patchwork (and therefore CI)
won't pick up inlined patches like this one here unfortunately.

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_dma.c    | 8 +++-----
>  drivers/gpu/drm/i915/intel_fbdev.c | 4 ++++
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index a0f5659..a76b528 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -437,11 +437,9 @@ static int i915_load_modeset_init(struct drm_device *dev)
>  	 * Some ports require correctly set-up hpd registers for detection to
>  	 * work properly (leading to ghost connected connector status), e.g. VGA
>  	 * on gm45.  Hence we can only set up the initial fbdev config after hpd
> -	 * irqs are fully enabled. Now we should scan for the initial config
> -	 * only once hotplug handling is enabled, but due to screwed-up locking
> -	 * around kms/fbdev init we can't protect the fdbev initial config
> -	 * scanning against hotplug events. Hence do this first and ignore the
> -	 * tiny window where we will loose hotplug notifactions.
> +	 * irqs are fully enabled. We protect the fbdev initial config scanning
> +	 * against hotplug events by waiting in intel_fbdev_output_poll_changed
> +	 * until the asynchronous thread has finished.
>  	 */
>  	intel_fbdev_initial_config_async(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index 09840f4..2002b13 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -749,6 +749,7 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>  	struct intel_fbdev *ifbdev = dev_priv->fbdev;
>  	struct fb_info *info;
>  
> +	async_synchronize_full();
>  	if (!ifbdev)
>  		return;
>  
> @@ -795,6 +796,8 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
>  void intel_fbdev_output_poll_changed(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +
> +	async_synchronize_full();
>  	if (dev_priv->fbdev)
>  		drm_fb_helper_hotplug_event(&dev_priv->fbdev->helper);
>  }
> @@ -806,6 +809,7 @@ void intel_fbdev_restore_mode(struct drm_device *dev)
>  	struct intel_fbdev *ifbdev = dev_priv->fbdev;
>  	struct drm_fb_helper *fb_helper;
>  
> +	async_synchronize_full();
>  	if (!ifbdev)
>  		return;
>  
> -- 
> 1.8.5.2 (Apple Git-48)
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list