[Intel-gfx] [KERNEL] Regression bug in drm/i915, Wrong assumption in commit e11aa36 breaks suspend on at least lenovo x61

Dave Gordon david.s.gordon at intel.com
Thu Feb 19 02:47:53 PST 2015


On 18/02/15 16:24, Imre Deak wrote:
> On Wed, 2015-02-18 at 17:39 +0200, Jani Nikula wrote:
>> On Tue, 17 Feb 2015, Klaus Ethgen <Klaus+lkml at ethgen.de> wrote:
>>> After resolving the conflicts, I applied the revert (see attachment) to
>>> v3.18.7. I think it should also apply to the current head. With that
>>> patch, suspend works again on that version.
>>>
>>> However, I do not have deep knowledge of that subsystem, so could
>>> someone who does please take a closer look at it? In particular, I do
>>> not know whether the changes in .../intel_pm.c are correct or whether
>>> it is better to leave those lines as they are in v3.18.7.
>>>
>>> I want to have it working on a version that I know is stable before
>>> asking to pull it to head.
>>
>> Hi Klaus, we fear this patch may hide the actual cause. It would be
>> useful to get a better description of what happens, along with a dmesg
>> captured with the drm.debug=14 module parameter set. Since this might
>> clutter the mailing list, would you mind filing a bug at [1] and
>> attaching the info there?
>>
>> Thanks,
>> Jani.
>>
>> [1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel
> 
> In addition to the above, could you also try the following patch and
> attach the resulting dmesg log to the bugzilla ticket? At this point it
> is only meant to help narrow down the issue:
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index d358ce8..02c65f4 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -4466,6 +4466,14 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
>  		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
>  		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
>  
> +	if (!intel_irqs_enabled(dev_priv)) {
> +		if (printk_ratelimit())
> +			DRM_ERROR("spurious/shared interrupt iir %08x imr %08x ier %08x\n",
> +				  I915_READ(IIR), I915_READ(IMR), I915_READ(IER));
> +
> +		return IRQ_NONE;
> +	}
> +
>  	iir = I915_READ(IIR);
>  
>  	for (;;) {
> @@ -4766,7 +4774,10 @@ void intel_runtime_pm_disable_interrupts(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
>  	dev->driver->irq_uninstall(dev);
> +
> +	spin_lock_irq(&dev_priv->irq_lock);
>  	dev_priv->pm._irqs_disabled = true;
> +	spin_unlock_irq(&dev_priv->irq_lock);
>  }
>  
>  /* Restore interrupts so we can recover from runtime PM. */
> @@ -4774,7 +4785,10 @@ void intel_runtime_pm_restore_interrupts(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> +	spin_lock_irq(&dev_priv->irq_lock);
>  	dev_priv->pm._irqs_disabled = false;
> +	spin_unlock_irq(&dev_priv->irq_lock);
> +
>  	dev->driver->irq_preinstall(dev);
>  	dev->driver->irq_postinstall(dev);
>  }

Surely wrapping (what ought to be) an atomic assignment to a single
variable in a spinlock cannot make a difference? Unless it's the memory
barrier semantics of the lock/unlock pair that have some effect? It seems
unlikely that the compiler has deferred the write to the variable past
the pre/postinstall calls that actually enable the h/w interrupts, but
maybe we should add "volatile" just in case?
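
The ordering concern can be illustrated outside the kernel. The following
is a minimal userspace sketch in C11, not i915 code: irqs_disabled and
hw_irq_enabled are hypothetical stand-ins for dev_priv->pm._irqs_disabled
and for the h/w enable done by the pre/postinstall hooks, and the
stdatomic.h orderings only approximate what kernel primitives would
provide. It publishes the flag with explicit release ordering before the
"hardware" is enabled, which is a guarantee that "volatile" by itself
does not give against non-volatile accesses or across CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for dev_priv->pm._irqs_disabled. */
static atomic_bool irqs_disabled = true;

/* Hypothetical stand-in for the h/w enable done by irq_pre/postinstall. */
static atomic_bool hw_irq_enabled = false;

/* Analogue of intel_runtime_pm_restore_interrupts(). */
void restore_interrupts(void)
{
	atomic_store_explicit(&irqs_disabled, false, memory_order_relaxed);
	/* Release: the flag store above cannot be reordered after this store. */
	atomic_store_explicit(&hw_irq_enabled, true, memory_order_release);
}

/* Analogue of intel_runtime_pm_disable_interrupts(). */
void disable_interrupts(void)
{
	/* Default (sequentially consistent) stores, in program order. */
	atomic_store(&hw_irq_enabled, false);
	atomic_store(&irqs_disabled, true);
}

/* Analogue of the handler guard: false plays the role of IRQ_NONE. */
bool irq_handler(void)
{
	/* An interrupt can only arrive once the "hardware" is enabled. */
	if (!atomic_load_explicit(&hw_irq_enabled, memory_order_acquire))
		return false;
	/*
	 * The acquire load pairs with the release store in
	 * restore_interrupts(): having seen the enable, we also see
	 * irqs_disabled == false.
	 */
	return !atomic_load_explicit(&irqs_disabled, memory_order_relaxed);
}

/* Single-threaded walk through the state transitions. */
void demo_sequence(void)
{
	assert(!irq_handler());		/* nothing enabled yet */
	restore_interrupts();
	assert(irq_handler());		/* enabled, flag cleared */
	disable_interrupts();
	assert(!irq_handler());		/* disabled again */
}
```

Calling demo_sequence() from a main() only checks the state transitions;
it cannot demonstrate a reordering, which would need a concurrent reader
on another CPU.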

.Dave.

