<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [regression]/bug introduced by commit [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for DPMS"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92050">92050</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[regression]/bug introduced by commit [0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for DPMS
</td>
</tr>
<tr>
<th>Product</th>
<td>DRI
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>DRM/Intel
</td>
</tr>
<tr>
<th>Assignee</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>jairo.daniel.miramontes.caton@intel.com
</td>
</tr>
<tr>
<th>QA Contact</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>CC</th>
<td>intel-gfx-bugs@lists.freedesktop.org
</td>
</tr></table>
<p>
<div>
<pre>Hello,
A regression/bug was introduced by commit
[0e572fe7383a376992364914694c39aa7fe44c1d] drm/i915: runtime PM support for
DPMS.
It causes my linux box to emit PATA and NIC errors (PCI bus stops working),
SATA is unaffected. The system freezes shorty after. Trouble starts within 12
hours, but not right away.
Git bisect was used to find the regression/bug. It was initially introduced
somewhere around kernel v3.15.0-rc8 (code has moved around since).
*** for details and logs, please scroll down ***
Hardware details:
- MSI MS-7732/PH61A-P35 (MS-7732), BIOS V2.4 09/19/2014
- Sandy Bridge Intel(R) Pentium(R) CPU G620
- 8 GiB RAM
- CRT monitor without EDID (not "plug and play") hooked up to the i915 CRT port
- 1x SSD connected to ASM1062 SATA controller
- 2x HDD connected to Intel SATA controller
- CDROM drive (ata9) and HDD (ata10) connected to Promise PDC20268
- Uncommon hardware configuration (old PCI PATA controller and PCI NIC)
Other:
- Running on LFS 7.1+ x86_64
- DPMS powersave enabled (setterm -powersave powerdown -powerdown 5)
- text based, no X
- Has VM's (QEMU KVM) running on top
Testing revealed:
v3.18.21: NOT OK
v3.17.8 : NOT OK
v3.17.0 : NOT OK
v3.16.7 : OK
v3.16.0 : OK
v3.14.51: OK
As a test, function intel_crtc_update_dpms() in
v3.18.20/drivers/gpu/drm/i915/intel_display.c was replaced with an older
revision (see below). The older revision works fine; it bypasses the function
intel_crtc_control().
So what is going wrong here? Perhaps something to do with the power domains,
lack of EDID? Or might it just trigger some bug/regression hidden deep in the
kernel?
I do not have the know-how to fix this. You guys do, ofcourse! :-)
/**
* Sets the power management mode of the pipe and plane.
*/
void intel_crtc_update_dpms(struct drm_crtc *crtc) {
struct drm_device *dev = crtc->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_encoder *intel_encoder;
bool enable = false;
for_each_encoder_on_crtc(dev, crtc, intel_encoder)
enable |= intel_encoder->connectors_active;
if (enable)
dev_priv->display.crtc_enable(crtc);
else
dev_priv->display.crtc_disable(crtc);
intel_crtc_update_sarea(crtc, enable); }
# /proc/cmdline
# noirqdebug: "ASM1083/1085 PCIe to PCI Bridge" has hardware issues, skips a
IRQ beat now and then root=LABEL=ROOTFS selinux=0 video=800x600-32@72
acpi_enforce_resources=lax thermal.off=1 noirqdebug
# /proc/interrupts
0: 14 0 IO-APIC-edge timer
1: 12 0 IO-APIC-edge i8042
4: 57380 0 IO-APIC-edge serial
5: 0 0 IO-APIC-edge parport0
8: 119 0 IO-APIC-edge rtc0
9: 3 0 IO-APIC-fasteoi acpi
12: 147 0 IO-APIC-edge i8042
16: 746 0 IO-APIC 16-fasteoi ehci_hcd:usb1
18: 7305 0 IO-APIC 18-fasteoi 0000:03:00.0
19: 64979 0 IO-APIC 19-fasteoi eth0
23: 33 0 IO-APIC 23-fasteoi ehci_hcd:usb3
24: 762 0 PCI-MSI-edge 0000:00:1f.2
25: 12926 0 PCI-MSI-edge 0000:06:00.0
26: 16017 0 PCI-MSI-edge eth1
27: 0 0 PCI-MSI-edge xhci_hcd
28: 0 0 PCI-MSI-edge xhci_hcd
29: 0 0 PCI-MSI-edge xhci_hcd
30: 517 0 PCI-MSI-edge snd_hda_intel
31: 7 0 PCI-MSI-edge i915
NMI: 0 0 Non-maskable interrupts
LOC: 1024537 205875 Local timer interrupts
SPU: 0 0 Spurious interrupts
PMI: 0 0 Performance monitoring interrupts
IWI: 0 0 IRQ work interrupts
RTR: 1 0 APIC ICR read retries
RES: 17909 21506 Rescheduling interrupts
CAL: 7297 7219 Function call interrupts
TLB: 346 209 TLB shootdowns
TRM: 0 0 Thermal event interrupts
THR: 0 0 Threshold APIC interrupts
MCE: 0 0 Machine check exceptions
MCP: 11 11 Machine check polls
ERR: 0
MIS: 0
# lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family
DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core
Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series/C200 Series
Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family
USB Enhanced Host Controller #2 (rev 05)
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family
High Definition Audio Controller (rev 05)
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI
Express Root Port 1 (rev b5)
00:1c.2 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5)
00:1c.3 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI
Express Root Port 4 (rev b5)
00:1c.4 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI
Express Root Port 5 (rev b5)
00:1c.5 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI
Express Root Port 6 (rev b5)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family
USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation H61 Express Chipset Family LPC Controller
(rev 05)
00:1f.2 SATA controller: Intel Corporation 6 Series/C200 Series Chipset Family
SATA AHCI Controller (rev 05)
00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family SMBus
Controller (rev 05)
02:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge
(rev 01)
03:00.0 Mass storage controller: Promise Technology, Inc. PDC20268
[Ultra100 TX2] (rev 02)
03:01.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 74)
04:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host
Controller
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
06:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller
(rev 01)
# logging: so it begins
Sep 8 04:13:06 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8
04:18:13 syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 04:19:55
syscore kernel: ata9: drained 2 bytes to clear DRQ Sep 8 04:37:42 syscore
kernel: ata9: drained 2 bytes to clear DRQ Sep 8 04:41:58 syscore kernel:
ata9: drained 2 bytes to clear DRQ Sep 8 04:47:06 syscore kernel: ata9:
drained 2 bytes to clear DRQ
# logging: and it ends like this (kernel lock-up if unlucky) Jun 14 18:32:59
syscore kernel: ata9: drained 2 bytes to clear DRQ Jun 14 18:33:04 syscore
kernel: ata9.00: qc timeout (cmd 0xa0) Jun 14 18:33:04 syscore kernel: ata9.00:
exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen Jun 14 18:33:16 syscore
kernel: sr 8:0:0:0: CDB:
Jun 14 18:33:16 syscore kernel: Get event status notification: 4a 01 00
00 10 00 00 00 08 00
Jun 14 18:33:16 syscore kernel: ata9.00: cmd
a0/00:00:00:08:00/00:00:00:00:00/a0 tag 0 pio 16392 in Jun 14 18:33:16 syscore
kernel: res 51/54:03:00:08:00/00:00:00:00:00/a0
Emask 0x5 (timeout)
Jun 14 18:33:16 syscore kernel: ata9.00: status: { DRDY ERR } Jun 14 18:33:16
syscore kernel: ata9: soft resetting link Jun 14 18:33:16 syscore kernel: eth0:
Transmit error, Tx status register ff.
Jun 14 18:33:16 syscore kernel: Flags; bus-master 1, dirty 51677(13) current
51677(13) Jun 14 18:33:16 syscore kernel: Transmit list ffffffff vs.
ffff8800d781e9b8.
Jun 14 18:33:16 syscore kernel: 0: @ffff8800d781e200 length 8000008d status
0001008d Jun 14 18:33:16 syscore kernel: 1: @ffff8800d781e298 length 80000048
status 00010048 Jun 14 18:33:16 syscore kernel: 2: @ffff8800d781e330 length
800000b7 status 000100b7 Jun 14 18:33:16 syscore kernel: 3: @ffff8800d781e3c8
length 800000a7 status 000100a7 Jun 14 18:33:16 syscore kernel: 4:
@ffff8800d781e460 length 00000080 status 00010102 Jun 14 18:33:16 syscore
kernel: 5: @ffff8800d781e4f8 length 80000042 status 00010042 Jun 14 18:33:16
syscore kernel: 6: @ffff8800d781e590 length 80000042 status 00010042 Jun 14
18:33:16 syscore kernel: 7: @ffff8800d781e628 length 80000042 status 00010042
Jun 14 18:33:16 syscore kernel: 8: @ffff8800d781e6c0 length 80000042 status
00010042 Jun 14 18:33:16 syscore kernel: 9: @ffff8800d781e758 length 80000042
status 00010042 Jun 14 18:33:16 syscore kernel: 10: @ffff8800d781e7f0 length
00000080 status 0001014d Jun 14 18:33:16 syscore kernel: 11: @ffff8800d781e888
length 80000048 status 00010048 Jun 14 18:33:16 syscore kernel: 12:
@ffff8800d781e920 length 800000a7 status 800100a7 Jun 14 18:33:16 syscore
kernel: 13: @ffff8800d781e9b8 length 80000042 status 00010042 Jun 14 18:33:16
syscore kernel: 14: @ffff8800d781ea50 length 80000042 status 00010042 Jun 14
18:33:16 syscore kernel: 15: @ffff8800d781eae8 length 00000080 status 00010102
Jun 14 18:33:16 syscore kernel: eth0: Updating statistics failed, disabling
stats as an interrupt source Jun 14 18:33:16 syscore kernel: eth0: Host error,
FIFO diagnostic register ffff.
Jun 14 18:33:16 syscore kernel: eth0: PCI bus error, bus status ffffffff Jun 14
18:33:16 syscore kernel: eth0: setting full-duplex.
Jun 14 18:33:16 syscore kernel: eth0: command 0x5800 did not complete!
Status=0xffff
Jun 14 18:33:16 syscore kernel: sched: RT throttling activated Jun 14 18:33:16
syscore kernel: ata9.00: qc timeout (cmd 0xa1) Jun 14 18:33:16 syscore kernel:
ata9.00: failed to IDENTIFY (I/O error,
err_mask=0x5)
Jun 14 18:33:16 syscore kernel: ata9.00: revalidation failed (errno=-5) Jun 14
18:33:16 syscore kernel: ata9: soft resetting link Jun 14 18:33:16 syscore
kernel: pata_pdc2027x: 40-conductor cable detected on port 0 Jun 14 18:33:16
syscore kernel: ata9.00: configured for MWDMA2 Jun 14 18:33:16 syscore kernel:
ata9: EH complete
# logging: more of the same
Sep 10 15:34:21 syscore kernel: ata10.00: exception Emask 0x0 SAct 0x0 SErr 0x0
action 0x0 Sep 10 15:34:21 syscore kernel: ata10.00: BMDMA stat 0xff Sep 10
15:34:21 syscore kernel: ata10.00: failed command: READ DMA EXT Sep 10 15:34:21
syscore kernel: ata10.00: cmd
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 15:34:21 syscore
kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0
Emask 0x20 (host bus error)
Sep 10 15:34:21 syscore kernel: ata10.00: status: { DRDY } Sep 10 15:34:22
syscore kernel: ata10.00: configured for UDMA/100 Sep 10 15:34:22 syscore
kernel: ata10: EH complete Sep 10 15:34:23 syscore kernel: ata10: drained 202
bytes to clear DRQ Sep 10 15:34:23 syscore kernel: ata10.00: exception Emask
0x0 SAct 0x0 SErr 0x0 action 0x0 Sep 10 15:34:23 syscore kernel: ata10.00:
BMDMA stat 0xff Sep 10 15:34:23 syscore kernel: ata10.00: failed command: READ
DMA EXT Sep 10 15:34:23 syscore kernel: ata10.00: cmd
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 15:34:23 syscore
kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0
Emask 0x20 (host bus error)
Sep 10 15:34:23 syscore kernel: ata10.00: status: { DRDY } Sep 10 15:34:23
syscore kernel: ata10.00: n_sectors mismatch 586114704 != 268435455 Sep 10
15:34:23 syscore kernel: ata10.00: old n_sectors matches native, probably late
HPA lock, will try to unlock HPA Sep 10 15:34:23 syscore kernel: ata10.00:
revalidation failed (errno=-5) Sep 10 15:34:23 syscore kernel: ata10: soft
resetting link Sep 10 15:34:33 syscore kernel: eth0: Transmit error, Tx status
register ff.
Sep 10 15:34:33 syscore kernel: Flags; bus-master 1, dirty 36208(0) current
36208(0) Sep 10 15:34:33 syscore kernel: Transmit list ffffffff vs.
ffff88001fafc200.
Sep 10 15:34:33 syscore kernel: 0: @ffff88001fafc200 length 80000071 status
00010071 Sep 10 15:34:33 syscore kernel: 1: @ffff88001fafc298 length 80000036
status 00010036 Sep 10 15:34:33 syscore kernel: 2: @ffff88001fafc330 length
8000005c status 0001005c Sep 10 15:34:33 syscore kernel: 3: @ffff88001fafc3c8
length 80000036 status 00010036 Sep 10 15:34:33 syscore kernel: 4:
@ffff88001fafc460 length 80000036 status 00010036 Sep 10 15:34:33 syscore
kernel: 5: @ffff88001fafc4f8 length 80000064 status 00010064 Sep 10 15:34:33
syscore kernel: 6: @ffff88001fafc590 length 8000002a status 0001002a Sep 10
15:34:33 syscore kernel: 7: @ffff88001fafc628 length 80000064 status 00010064
Sep 10 15:34:33 syscore kernel: 8: @ffff88001fafc6c0 length 80000036 status
00010036 Sep 10 15:34:33 syscore kernel: 9: @ffff88001fafc758 length 80000064
status 00010064 Sep 10 15:34:33 syscore kernel: 10: @ffff88001fafc7f0 length
80000055 status 00010055 Sep 10 15:34:33 syscore kernel: 11: @ffff88001fafc888
length 80000036 status 00010036 Sep 10 15:34:33 syscore kernel: 12:
@ffff88001fafc920 length 8000002a status 0001002a Sep 10 15:34:33 syscore
kernel: 13: @ffff88001fafc9b8 length 8000004a status 0c01004a Sep 10 15:34:33
syscore kernel: 14: @ffff88001fafca50 length 8000004a status 0c01004a Sep 10
15:34:33 syscore kernel: 15: @ffff88001fafcae8 length 8000004a status 8c01004a
Sep 10 15:34:33 syscore kernel: eth0: Updating statistics failed, disabling
stats as an interrupt source.
Sep 10 15:34:33 syscore kernel: eth0: Host error, FIFO diagnostic register
ffff.
Sep 10 15:34:33 syscore kernel: eth0: PCI bus error, bus status ffffffff Sep 10
15:34:34 syscore kernel: eth0: setting full-duplex.
Sep 10 15:34:34 syscore kernel: eth0: command 0x5800 did not complete!
Status=0xffff
Sep 10 15:34:34 syscore kernel: [sched_delayed] sched: RT throttling activated
Sep 10 15:34:34 syscore kernel: ata10.00: configured for UDMA/100 Sep 10
15:34:34 syscore kernel: ata10: EH complete Sep 10 15:34:34 syscore kernel:
ata10: drained 292 bytes to clear DRQ Sep 10 15:34:34 syscore kernel: ata10.00:
exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 Sep 10 15:34:34 syscore
kernel: ata10.00: BMDMA stat 0xff Sep 10 15:34:34 syscore kernel: ata10.00:
failed command: READ DMA EXT Sep 10 15:34:34 syscore kernel: ata10.00: cmd
25/00:00:48:d6:fd/00:01:21:00:00/e0 tag 0 dma 131072 in Sep 10 15:34:34 syscore
kernel: res 50/00:00:47:d7:fd/00:00:21:00:00/e0
Emask 0x20 (host bus error)
Sep 10 15:34:34 syscore kernel: ata10.00: status: { DRDY } Sep 10 15:34:34
syscore kernel: ata10.00: configured for UDMA/100 Sep 10 15:34:34 syscore
kernel: ata10: EH complete
Kind regards,
Frank de Jong
This bug was raised to the Intel-Gfx mailing list by Frank de Jong
<<a href="mailto:frapex@xs4all.nl">frapex@xs4all.nl</a>></pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the QA Contact for the bug.</li>
<li>You are on the CC list for the bug.</li>
<li>You are the assignee for the bug.</li>
</ul>
</body>
</html>