[Intel-gfx] [PATCH v2] drm/i915: Don't warn if we restore pm interrupts during reset

Mika Kuoppala mika.kuoppala at linux.intel.com
Thu Aug 14 17:43:20 CEST 2014


Mika Kuoppala <mika.kuoppala at linux.intel.com> writes:

> Daniel Vetter <daniel at ffwll.ch> writes:
>
>> On Thu, Aug 14, 2014 at 03:46:43PM +0300, Mika Kuoppala wrote:
>>> We lost the software state tracking due to reset, so don't
>>> complain if it doesn't match.
>>
>> This sounds more like gpu reset should be a bit more careful (even more
>> careful than we already are compared to earlier kernels) with making sure
>> the irq state is still sane after a reset?
>>
>> Or what exactly is the failure mode here? The commit message lacks a bit of
>> detail, in the form of a nice text or even better: a testcase ;-)
>
> We hold a pm ref during reset. Then, after the reset, we kick
> intel_gt_reset_powersave to re-enable rps. Contrary to
> suspend/thaw, we never disabled the interrupts, so the warn
> triggers.
>
> I tried to disable the interrupts during reset handling, but the
> nonblocking __wait_seqno() triggered another state warning, as
> it was taking a pm ref during or right after reset recovery for hw
> access.
> -Mika
>

It is also pretty difficult to hit. I needed multiple tries of
ctrl-c'ing the process that submitted the hang, with another
client running in the background doing gpu access.

Could this be a timing issue related to us enabling rps through a
delayed workqueue?
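
To make the suspected sequence concrete, here is a minimal userspace
sketch in C. It is not driver code: struct model_dev and all model_*
functions are hypothetical stand-ins, and the idea that the disable
path clears the software pm_iir copy is an assumption of this sketch.
The reset_in_progress flag is simply passed in, since whether it is
still set when the delayed work runs is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver state; not the real i915 structs. */
struct model_dev {
	unsigned int pm_iir;     /* software copy of pending PM interrupt bits */
	bool irqs_enabled;       /* are RPS interrupts currently unmasked? */
	bool reset_in_progress;  /* stands in for i915_reset_in_progress() */
	int warnings;            /* number of WARN_ON-style hits */
};

/* An RPS interrupt arrives; it is only recorded while unmasked. */
static void model_rps_irq(struct model_dev *d, unsigned int bits)
{
	assert(d != NULL);
	if (d->irqs_enabled)
		d->pm_iir |= bits;
}

/* Assumption: the disable path masks interrupts and clears pm_iir. */
static void model_disable_rps_interrupts(struct model_dev *d)
{
	assert(d != NULL);
	d->irqs_enabled = false;
	d->pm_iir = 0;
}

/* Enable path; with 'patched' set, the check is skipped during reset. */
static void model_enable_rps_interrupts(struct model_dev *d, bool patched)
{
	assert(d != NULL);
	bool skip_check = patched && d->reset_in_progress;

	if (!skip_check && d->pm_iir != 0) {
		d->warnings++;
		fprintf(stderr, "WARN: stale pm_iir=0x%x on enable\n", d->pm_iir);
	}
	d->irqs_enabled = true;
}

/* Run one ordering and return how many warnings it produced. */
static int model_scenario(bool disable_first, bool reset_in_progress,
			  bool patched)
{
	struct model_dev d = {
		.pm_iir = 0,
		.irqs_enabled = true,
		.reset_in_progress = reset_in_progress,
		.warnings = 0,
	};

	model_rps_irq(&d, 0x4);                  /* interrupt while enabled */
	if (disable_first)
		model_disable_rps_interrupts(&d);    /* suspend/thaw ordering */
	model_enable_rps_interrupts(&d, patched);
	return d.warnings;
}
```

Under these assumptions, model_scenario(true, false, false) (the
suspend/thaw ordering) reports no warning, model_scenario(false, true,
false) (reset without the patch) reports one, and
model_scenario(false, true, true) (reset with the check skipped)
reports none. model_scenario(false, false, true) still warns, which
would be the case where the delayed work only runs after the reset
flag has already cleared.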

Here is the trace:
[  635.478701] [drm] Simulated gpu hang, resetting stop_rings
[  637.457126] ------------[ cut here ]------------
[  637.458711] WARNING: CPU: 5 PID: 3595 at drivers/gpu/drm/i915/intel_pm.c:3607 gen6_enable_rps_interrupts+0x72/0x80 [i915]()
[  637.460361] Modules linked in: i915 drm_kms_helper drm kvm_intel kvm snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq mxm_wmi snd_timer snd_seq_device psmouse snd serio_raw ehci_pci bnep ehci_hcd rfcomm soundcore bluetooth wmi mac_hid parport_pc ppdev lp parport dm_crypt usbhid firewire_ohci firewire_core crc_itu_t e1000e ptp pps_core xhci_hcd usbcore i2c_algo_bit video usb_common [last unloaded: drm]
[  637.468170] CPU: 5 PID: 3595 Comm: kworker/5:0 Tainted: G        W 3.16.0+ #240
[  637.469545] Workqueue: events intel_gen6_powersave_work [i915]
[  637.471042]  00000000 00000000 ca0d3e54 c15adcca f8898260 ca0d3e84 c1047224 c17536b0
[  637.472616]  00000005 00000e0b f8898260 00000e17 f87ff852 f87ff852 f6ec8000 f6ecbe68
[  637.474301]  ee851c00 ca0d3e94 c1047262 00000009 00000000 ca0d3ea8 f87ff852 f6ec8000
[  637.475920] Call Trace:
[  637.477504]  [<c15adcca>] dump_stack+0x48/0x60
[  637.479060]  [<c1047224>] warn_slowpath_common+0x84/0xa0
[  637.480708]  [<f87ff852>] ? gen6_enable_rps_interrupts+0x72/0x80 [i915]
[  637.481880]  [<f87ff852>] ? gen6_enable_rps_interrupts+0x72/0x80 [i915]
[  637.483220]  [<c1047262>] warn_slowpath_null+0x22/0x30
[  637.484258]  [<f87ff852>] gen6_enable_rps_interrupts+0x72/0x80 [i915]
[  637.485503]  [<f8808ecd>] intel_gen6_powersave_work+0x57d/0x1020 [i915]
[  637.486516]  [<c105e8bc>] process_one_work+0x10c/0x3c0
[  637.487630]  [<c105f523>] worker_thread+0xf3/0x470
[  637.488618]  [<c105f430>] ? create_and_start_worker+0x50/0x50
[  637.489802]  [<c1064cdb>] kthread+0x9b/0xb0
[  637.490804]  [<c15b4e01>] ret_from_kernel_thread+0x21/0x30
[  637.491872]  [<c1064c40>] ? flush_kthread_worker+0xb0/0xb0
[  637.492862] ---[ end trace b31c16cec8a7abaa ]---

-Mika

>> Thanks, Daniel
>>
>>> 
>>> v2: fix build error
>>> 
>>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/intel_pm.c |    6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>> index 12f4e14..7a1309c 100644
>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>> @@ -3593,7 +3593,8 @@ static void gen8_enable_rps_interrupts(struct drm_device *dev)
>>>  	struct drm_i915_private *dev_priv = dev->dev_private;
>>>  
>>>  	spin_lock_irq(&dev_priv->irq_lock);
>>> -	WARN_ON(dev_priv->rps.pm_iir);
>>> +	if (!i915_reset_in_progress(&dev_priv->gpu_error))
>>> +		WARN_ON(dev_priv->rps.pm_iir);
>>>  	gen8_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
>>>  	I915_WRITE(GEN8_GT_IIR(2), dev_priv->pm_rps_events);
>>>  	spin_unlock_irq(&dev_priv->irq_lock);
>>> @@ -3604,7 +3605,8 @@ static void gen6_enable_rps_interrupts(struct drm_device *dev)
>>>  	struct drm_i915_private *dev_priv = dev->dev_private;
>>>  
>>>  	spin_lock_irq(&dev_priv->irq_lock);
>>> -	WARN_ON(dev_priv->rps.pm_iir);
>>> +	if (!i915_reset_in_progress(&dev_priv->gpu_error))
>>> +		WARN_ON(dev_priv->rps.pm_iir);
>>>  	gen6_enable_pm_irq(dev_priv, dev_priv->pm_rps_events);
>>>  	I915_WRITE(GEN6_PMIIR, dev_priv->pm_rps_events);
>>>  	spin_unlock_irq(&dev_priv->irq_lock);
>>> -- 
>>> 1.7.9.5
>>> 
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>> -- 
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch