[Bug 103962] (DC 4.15-rc1) [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)

bugzilla-daemon at freedesktop.org
Tue Nov 28 22:09:16 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=103962

            Bug ID: 103962
           Summary: (DC 4.15-rc1) [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR*
                    KCQ disabled failed (scratch(0xC040)=0xCAFEDEAD)
           Product: DRI
           Version: DRI git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: taijian at posteo.de

Created attachment 135785
  --> https://bugs.freedesktop.org/attachment.cgi?id=135785&action=edit
relevant dmesg output

I am testing the new DC code with 4.15-rc1 on my Alienware 15 R3 with an RX 470
dGPU.

On this machine, power management has never turned the dGPU off properly when
it is not in use, which causes heat and battery life issues. I had hoped that
the DC code would help here, but it seems only to have changed the kinds of
error messages I get. With DC, I keep getting a bunch of:

[    2.727021] [drm:dm_logger_write [amdgpu]] *ERROR*
hwss_edp_wait_for_hpd_ready: wait timed out!
[   20.520195] [drm:dm_logger_write [amdgpu]] *ERROR*
hwss_edp_wait_for_hpd_ready: wait timed out!

and

[   36.938909] [drm:dm_logger_write [amdgpu]] *ERROR*
hwss_edp_wait_for_hpd_ready: wait timed out!
[   37.087971] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0
test failed (scratch(0xC040)=0xCAFEDEAD)
[   37.087981] [drm:amdgpu_resume_phase2 [amdgpu]] *ERROR* resume of IP block
<gfx_v8_0> failed -22
[   37.087990] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_resume failed
(-22).
[   37.087993] snd_hda_intel 0000:01:00.1: Start delayed initialization
[   37.091937] input: HDA ATI HDMI HDMI/DP,pcm=3 as
/devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input21
[   37.091967] input: HDA ATI HDMI HDMI/DP,pcm=7 as
/devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input22
[   37.092023] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.103951] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.103974] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.104058] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.125739] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.248924] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.248946] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   37.249070] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory
with ring turned off.
[   42.269448] snd_hda_intel 0000:01:00.1: Disabling via vga_switcheroo
[   42.269517] snd_hda_intel 0000:01:00.1: Cannot lock devices!
[   42.447377] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   42.599606] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   42.749231] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   42.899207] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   43.047953] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   43.197043] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   43.344313] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   43.491648] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   43.491814] WARNING: CPU: 7 PID: 512 at
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:600
dm_suspend+0x4e/0x60 [amdgpu]
[   43.491815] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
dns_resolver nfs lockd grace sunrpc fscache rfcomm ccm acpi_call(O) cmac bnep
arc4 ip6t_REJECT nf_reject_ipv6 xt_hl ip6t_rt joydev mousedev
snd_hda_codec_hdmi nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4
xt_multiport ath10k_pci rmi_smbus ath10k_core uvcvideo rmi_core
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core xt_limit ath
videodev xt_tcpudp snd_hda_codec_realtek btusb dell_wmi iTCO_wdt media btrtl
xt_addrtype fuse snd_hda_codec_generic iTCO_vendor_support mac80211 dell_smbios
wmi_bmof dell_wmi_descriptor intel_rapl nf_conntrack_ipv4 x86_pkg_temp_thermal
intel_powerclamp kvm_intel nf_defrag_ipv4 snd_hda_intel xt_conntrack kvm
snd_hda_codec snd_hda_core mei_me snd_hwdep nls_iso8859_1 nls_cp437 snd_pcm
[   43.491833]  input_leds irqbypass vfat snd_timer intel_cstate fat snd
processor_thermal_device cfg80211 psmouse led_class pcspkr i2c_i801
intel_rapl_perf mei soundcore intel_pch_thermal hci_uart intel_soc_dts_iosf wmi
btbcm btqca btintel int3403_thermal bluetooth int340x_thermal_zone battery
pinctrl_sunrisepoint pinctrl_intel ac ip6table_filter ecdh_generic tpm_crb
acpi_als tpm_tis ip6_tables intel_lpss_acpi dell_rbtn kfifo_buf i2c_hid
intel_lpss tpm_tis_core rfkill industrialio tpm intel_hid
nf_conntrack_netbios_ns shpchp nf_conntrack_broadcast evdev int3400_thermal
sparse_keymap acpi_pad acpi_thermal_rel mac_hid nf_nat_ftp nf_nat
nf_conntrack_ftp nf_conntrack libcrc32c crc32c_generic iptable_filter
sch_fq_codel coretemp msr crypto_user ip_tables x_tables ext4 crc16 mbcache
jbd2 fscrypto algif_skcipher af_alg hid_generic usbhid hid dm_crypt dm_mod dax
sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc
serio_raw atkbd libps2 ahci xhci_pci aesni_intel libahci xhci_hcd libata
aes_x86_64 crypto_simd usbcore glue_helper cryptd scsi_mod usb_common i8042
serio nvme nvme_core i915 amdgpu video button chash intel_gtt i2c_algo_bit
drm_kms_helper syscopyarea sysfillrect sysimgblt ttm fb_sys_fops drm agpgart
[   43.491865] CPU: 7 PID: 512 Comm: kworker/7:2 Tainted: G        W  O    
4.15.0-rc1-mainline #1
[   43.491866] Hardware name: Alienware Alienware 15 R3/Alienware 15 R3, BIOS
1.2.0 09/14/2017
[   43.491869] Workqueue: pm pm_runtime_work
[   43.491869] task: ffff96ecb5256ac0 task.stack: ffffa6b2c1e78000
[   43.491884] RIP: 0010:dm_suspend+0x4e/0x60 [amdgpu]
[   43.491884] RSP: 0018:ffffa6b2c1e7bcc0 EFLAGS: 00010286
[   43.491885] RAX: 0000000000000000 RBX: ffff96ecb5710000 RCX:
0000000000000000
[   43.491885] RDX: 0000000000000001 RSI: 0000000000000286 RDI:
0000000000000286
[   43.491886] RBP: 0000000000000003 R08: 0000000000000000 R09:
0000000000000495
[   43.491886] R10: 0000000000000002 R11: ffffffff8b14836d R12:
ffff96ecb5710000
[   43.491887] R13: ffffffffc03f3840 R14: 0000000000000000 R15:
ffff96ecb5485b30
[   43.491887] FS:  0000000000000000(0000) GS:ffff96eccedc0000(0000)
knlGS:0000000000000000
[   43.491888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   43.491888] CR2: 00007f978312d000 CR3: 000000018cc09005 CR4:
00000000003606e0
[   43.491889] Call Trace:
[   43.491898]  amdgpu_suspend+0x5f/0x150 [amdgpu]
[   43.491906]  amdgpu_device_suspend+0x1f0/0x3b0 [amdgpu]
[   43.491908]  ? vga_switcheroo_runtime_resume+0x50/0x50
[   43.491915]  amdgpu_pmops_runtime_suspend+0x52/0xc0 [amdgpu]
[   43.491916]  pci_pm_runtime_suspend+0x5c/0x160
[   43.491918]  vga_switcheroo_runtime_suspend+0x1e/0x90
[   43.491919]  __rpm_callback+0xb6/0x1e0
[   43.491920]  ? vga_switcheroo_runtime_resume+0x50/0x50
[   43.491921]  rpm_callback+0x1f/0x70
[   43.491922]  ? vga_switcheroo_runtime_resume+0x50/0x50
[   43.491923]  rpm_suspend+0x12d/0x670
[   43.491924]  ? __update_idle_core+0x20/0xb0
[   43.491925]  ? finish_task_switch+0x75/0x200
[   43.491926]  pm_runtime_work+0x64/0xa0
[   43.491928]  process_one_work+0x1de/0x410
[   43.491930]  worker_thread+0x2b/0x3d0
[   43.491931]  ? process_one_work+0x410/0x410
[   43.491931]  kthread+0x111/0x130
[   43.491932]  ? kthread_create_worker_on_cpu+0x70/0x70
[   43.491934]  ret_from_fork+0x1f/0x30
[   43.491935] Code: 9d 00 00 00 75 25 48 8b 7b 08 e8 0e dc e1 ff 48 8b bb a8
85 00 00 48 89 83 10 9d 00 00 be 08 00 00 00 e8 66 52 04 00 31 c0 5b c3 <0f> ff
eb d7 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 
[   43.491950] ---[ end trace 8dd55f26ae92c339 ]---
[   43.520969] amdgpu 0000:01:00.0: GPU pci config reset
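
To get a quick overview of which functions report the errors above, the log can
be tallied with a short script. This is only a minimal sketch, assuming the
dmesg text is wrapped onto continuation lines the way it is in this mail; the
names unwrap, tally_errors and sample are made up for illustration and are not
part of any existing tool.

```python
import re
from collections import Counter

# Start of a dmesg record, e.g. "[   42.447377] ".
RECORD_START = re.compile(r"^\[\s*\d+\.\d+\]")
# An amdgpu DRM error record; group 2 is the reporting function.
DRM_ERROR = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\[drm:(\w+) \[amdgpu\]\] \*ERROR\*\s*(.*)$"
)


def unwrap(lines):
    """Rejoin records that were wrapped onto a following line."""
    records = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation line: append it to the previous record.
            records[-1] += " " + line
    return records


def tally_errors(text):
    """Count amdgpu *ERROR* records per reporting function."""
    counts = Counter()
    for record in unwrap(text.splitlines()):
        match = DRM_ERROR.match(record)
        if match:
            counts[match.group(2)] += 1
    return counts


# A few records copied from the log above (hypothetical variable name).
sample = """\
[   37.087971] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0
test failed (scratch(0xC040)=0xCAFEDEAD)
[   42.447377] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
[   42.599606] [drm:gfx_v8_0_hw_fini [amdgpu]] *ERROR* KCQ disabled failed
(scratch(0xC040)=0xCAFEDEAD)
"""

for func, count in tally_errors(sample).most_common():
    print(f"{count:3d}  {func}")
```

Feeding it the full attachment instead of the three sample records should show
at a glance whether the failures are concentrated in the resume path
(gfx_v8_0_ring_test_ring) or the suspend path (gfx_v8_0_hw_fini).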
