[Intel-gfx] [PATCH] drm/i915: kicking rings considered harmful

Ben Widawsky ben at bwidawsk.net
Tue Sep 27 07:22:01 CEST 2011


On Mon, 26 Sep 2011 19:59:50 +0200
Daniel Vetter <daniel.vetter at ffwll.ch> wrote:

> Only do it in the hope of resurrecting the gpu. Disable when reset is
> disabled because it seems to tremendously increase our chances to
> actually capture an error_state before the system goes all belly-up.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> ---
> Hi Andrew,
> 
> Can you please apply this patch and boot your system with
> 
> i915.reset=0 i915.semaphores=1
> 
> and rehang your gpu? This patch to fully disable any attempts at
> resurrecting a dead gpu hopefully prevents the full system hang you're
> experiencing. At least it helps greatly here on my systems.
> 
> If the system isn't completely dead with this, can you please ssh
> into the machine and grab dmesg, i915_error_state, Xorg.log and
> whatever else there might be?
> 
> Thanks a lot,
> 
> Daniel
> 
>  drivers/gpu/drm/i915/i915_drv.c |    2 +-
>  drivers/gpu/drm/i915/i915_drv.h |    1 +
>  drivers/gpu/drm/i915/i915_irq.c |    2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index b79c6f1..ad85c13 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -91,7 +91,7 @@ MODULE_PARM_DESC(vbt_sdvo_panel_type,
>  		"Override selection of SDVO panel mode in the VBT "
>  		"(default: auto)");
>  
> -static bool i915_try_reset __read_mostly = true;
> +bool i915_try_reset __read_mostly = true;
>  module_param_named(reset, i915_try_reset, bool, 0600);
>  MODULE_PARM_DESC(reset, "Attempt GPU resets (default: true)");
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3621336..788a801 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -995,6 +995,7 @@ extern unsigned int i915_semaphores __read_mostly;
>  extern unsigned int i915_lvds_downclock __read_mostly;
>  extern unsigned int i915_panel_use_ssc __read_mostly;
>  extern int i915_vbt_sdvo_panel_type __read_mostly;
> +extern bool i915_try_reset __read_mostly;
>  extern unsigned int i915_enable_rc6 __read_mostly;
>  extern unsigned int i915_enable_fbc __read_mostly;
>  extern bool i915_enable_hangcheck __read_mostly;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index da5d607..09c11e4 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1694,7 +1694,7 @@ void i915_hangcheck_elapsed(unsigned long data)
>  		if (dev_priv->hangcheck_count++ > 1) {
>  			DRM_ERROR("Hangcheck timer elapsed... GPU hung\n");
>  
> -			if (!IS_GEN2(dev)) {
> +			if (!IS_GEN2(dev) && i915_try_reset) {
>  				/* Is the chip hanging on a WAIT_FOR_EVENT?
>  				 * If so we can simply poke the RB_WAIT bit
>  				 * and break the hang. This should work on

I think you should also be able to accomplish the same thing
with the enable_hangcheck param. I had the same problem with the
debugger :)
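
For comparison, here is a rough standalone model of how the two
parameters would interact in the hangcheck path. It is only a sketch:
struct hang_policy, enum hang_action and hang_decide() are made-up
names, not driver code, and the assumption that the timer handler
does nothing at all when enable_hangcheck is off is mine, not
something shown in the patch above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the i915.reset and i915.enable_hangcheck
 * module parameters; these are not the driver's real globals. */
struct hang_policy {
	bool try_reset;
	bool enable_hangcheck;
};

enum hang_action {
	HANG_TIMER_OFF,    /* assumed: hangcheck disabled, nothing happens */
	HANG_KEEP_WAITING, /* hang count has not passed the threshold yet */
	HANG_REPORT_ONLY,  /* declare the hang, leave the rings alone */
	HANG_KICK_RINGS    /* declare the hang and try to poke the rings */
};

/* count_before is the counter value seen by the comparison in
 * "hangcheck_count++ > 1", i.e. before the post-increment. */
static enum hang_action hang_decide(struct hang_policy p, bool is_gen2,
				    int count_before)
{
	if (!p.enable_hangcheck)
		return HANG_TIMER_OFF;
	if (!(count_before > 1))
		return HANG_KEEP_WAITING;
	/* The patched condition: !IS_GEN2(dev) && i915_try_reset */
	if (!is_gen2 && p.try_reset)
		return HANG_KICK_RINGS;
	return HANG_REPORT_ONLY;
}
```

If that assumption holds, both settings leave the rings alone, but
only reset=0 still reaches the point where the hang gets declared.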

Ben


