[OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
Michel Dänzer
michel at daenzer.net
Mon Mar 16 19:31:00 PDT 2015
On 16.03.2015 23:52, Carsten Emde wrote:
> Hi Michel,
>
>>> [..]
>>> The most striking problem of kernel 3.18.9-rt4 affects all systems that
>>> are equipped with Radeon graphics (irrespective of whether PCIe cards or
>>> APUs with on-chip graphics). They suffer from a hanging radeon driver.
>>> The block occurs when accelerated graphics load is created by x11perf or
>>> gltestperf. Sometimes only the graphics are frozen while ssh login is
>>> still possible, sometimes the entire box is no longer accessible at all. In
>>> any case, a reboot is needed to recover from this situation.
>>>
>>> Here is a selection of kernel messages:
>> [...]
>> The commits from
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=f957063fee6392bb9365370db6db74dc0b2dce0a
>>
>> to
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=cffefd9bb31cd35ab745d3b49005d10616d25bdc
>>
>> and
>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=b6610101718d4ab90d793c482625e98eb1262cad
>>
>> might help for this.
>
> Thanks a lot. I have applied these patches to a number of systems:
> # quilt applied | tail -7
> patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch
> patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch
> patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch
>
>
> The graphics boards still crash and freeze the screen, but in contrast
> to the earlier situation the systems remain accessible, and the X
> Window server can be restarted after the offending programs are
> removed. The crashes were reliably triggered by
> - gltestperf
> or
> - x11perf -repeat 3 -subs 25 -time 2 -rect10
> but the crashes also occur several times per day during normal work
> such as browsing the Internet or writing a text document. If you wish
> me to provide additional diagnostic information, for example by running
> test programs while the graphics boards are unresponsive, I certainly can
> do that.
Does it also happen with a kernel built from a current drm-fixes tree?
http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes
I might have missed other needed fixes.
> Rack #0/Slot #3 [AMD/ATI] RV730 XT [Radeon HD 4670]:
>
> [21001.244036] INFO: task kworker/u24:6:267 blocked for more than 120 seconds.
> [21001.257773] Not tainted 3.18.9-rt4 #27
> [21001.266284] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [21001.281911] kworker/u24:6 D ffff88081ed8b340 0 267 2 0x10000000
> [21001.281937] Workqueue: radeon-crtc radeon_flip_work_func [radeon]
> [21001.281940] ffff880805d2fbe8 0000000000000046 ffff88081ed0c700 0000000000000000
> [21001.281941] 0000000000009000 000000000000c920 ffff8808112fb420 ffff880035254e30
> [21001.281943] 000000000000c280 000001000000c280 0000000000000003 ffff880035254e30
> [21001.281945] Call Trace:
> [21001.281950] [<ffffffff81721ce4>] schedule+0x34/0xa0
> [21001.281953] [<ffffffff8172425c>] schedule_timeout+0x22c/0x2d0
> [21001.281962] [<ffffffffa0439a06>] ? radeon_fence_process+0x16/0x40 [radeon]
> [21001.281971] [<ffffffffa0439a74>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [21001.281979] [<ffffffffa0439da7>] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon]
> [21001.281982] [<ffffffff81098be0>] ? __wake_up_sync+0x20/0x20
> [21001.281991] [<ffffffffa043a106>] radeon_fence_wait+0x86/0xc0 [radeon]
> [21001.282000] [<ffffffffa0447eec>] radeon_flip_work_func+0x15c/0x190 [radeon]
> [21001.282003] [<ffffffff810709c4>] process_one_work+0x154/0x450
> [21001.282004] [<ffffffff81070fbb>] worker_thread+0x6b/0x4d0
> [21001.282006] [<ffffffff81070f50>] ? rescuer_thread+0x290/0x290
> [21001.282007] [<ffffffff81070f50>] ? rescuer_thread+0x290/0x290
> [21001.282009] [<ffffffff81075fed>] kthread+0xcd/0xf0
> [21001.282010] [<ffffffff81075f20>] ? kthread_worker_fn+0x1d0/0x1d0
> [21001.282013] [<ffffffff81725aec>] ret_from_fork+0x7c/0xb0
> [21001.282014] [<ffffffff81075f20>] ? kthread_worker_fn+0x1d0/0x1d0
>
>
> Rack #0/Slot #7 [AMD/ATI] Cayman XT [Radeon HD 6970]
>
> [ 481.091132] INFO: task Xorg:3459 blocked for more than 120 seconds.
> [ 481.103594] Not tainted 3.18.9-rt4 #28
> [ 481.112101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 481.127746] Xorg D ffff88041e68ab40 0 3459 3452 0x10400004
> [ 481.141882] ffff880413da38e8 0000000000000002 ffff88041e60c460 ffff8800c3ea3380
> [ 481.141882] ffff880413da38d8 ffffffff8108603f 000000000000c5a8 000000000000c5c8
> [ 481.141883] ffffffff81c19460 ffff8800c3ea3380 000000000000000c ffff8800c3ea3380
> [ 481.186228] Call Trace:
> [ 481.191114] [<ffffffff8108603f>] ? queue_delayed_work_on+0xff/0x110
> [ 481.191118] [<ffffffff816b50f4>] schedule+0x34/0xa0
> [ 481.191119] [<ffffffff816b72f4>] schedule_timeout+0x204/0x270
> [ 481.191148] [<ffffffffa00cd826>] ? radeon_fence_process+0x16/0x40 [radeon]
> [ 481.191157] [<ffffffffa00cd894>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [ 481.191165] [<ffffffffa00cdb07>] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
> [ 481.191167] [<ffffffff810ac310>] ? prepare_to_wait_event+0x110/0x110
> [ 481.191175] [<ffffffffa00cdf67>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [ 481.191191] [<ffffffffa01432af>] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
> [ 481.191194] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191207] [<ffffffffa019d3e7>] radeon_ib_get+0x37/0xf0 [radeon]
> [ 481.191218] [<ffffffffa00e997d>] radeon_cs_ioctl+0x22d/0x820 [radeon]
> [ 481.191219] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191228] [<ffffffffa001bc04>] drm_ioctl+0x1a4/0x630 [drm]
> [ 481.191231] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [ 481.191234] [<ffffffff8106e8da>] ? unpin_current_cpu+0x1a/0x70
> [ 481.191237] [<ffffffff81097440>] ? migrate_enable+0xb0/0x1b0
> [ 481.191243] [<ffffffffa00b004b>] radeon_drm_ioctl+0x4b/0x80 [radeon]
> [ 481.191245] [<ffffffff811c7040>] do_vfs_ioctl+0x2e0/0x4d0
> [ 481.191247] [<ffffffff811d1aa2>] ? __fget+0x72/0xa0
> [ 481.191248] [<ffffffff811c72b1>] SyS_ioctl+0x81/0xa0
> [ 481.191250] [<ffffffff816b8cb2>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #0/Slot #8 [AMD/ATI] Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X]:
>
> [19579.220958] INFO: task Xorg.bin:16569 blocked for more than 120 seconds.
> [19579.228008] Not tainted 3.18.9-rt4 #25
> [19579.232491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19579.240719] Xorg.bin D ffffffff81716c70 0 16569 16215 0x10400080
> [19579.248076] ffff8805f78bf818 0000000000000002 ffff8805f78bf7f8 0000000000000002
> [19579.248077] 000000000000dc08 ffff880626a0dc08 000000000000dbe8 000000000000dc08
> [19579.248078] ffffffff81c1b500 ffff880606c614a0 ffff880614f7c000 ffff880606c614a0
> [19579.271393] Call Trace:
> [19579.273964] [<ffffffff81713da4>] schedule+0x34/0xa0
> [19579.273965] [<ffffffff817162dc>] schedule_timeout+0x1fc/0x280
> [19579.273990] [<ffffffffa00c7aa6>] ? radeon_fence_process+0x16/0x40 [radeon]
> [19579.273999] [<ffffffffa00c7b14>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [19579.274008] [<ffffffffa00c7e47>] radeon_fence_wait_seq_timeout.constprop.8+0x2e7/0x340 [radeon]
> [19579.274011] [<ffffffff810cf310>] ? __wake_up_sync+0x20/0x20
> [19579.274020] [<ffffffffa00c8237>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [19579.274035] [<ffffffffa013e2cf>] radeon_sa_bo_new+0x2af/0x4b0 [radeon]
> [19579.274049] [<ffffffffa0196077>] radeon_ib_get+0x37/0xe0 [radeon]
> [19579.274062] [<ffffffffa0194bbc>] radeon_vm_update_page_directory+0x6c/0x290 [radeon]
> [19579.274078] [<ffffffffa0144916>] ? si_ib_parse+0x396/0x430 [radeon]
> [19579.274089] [<ffffffffa00e44ab>] radeon_cs_ioctl+0x35b/0x850 [radeon]
> [19579.274098] [<ffffffffa0005bc7>] drm_ioctl+0x197/0x670 [drm]
> [19579.274102] [<ffffffff81373337>] ? debug_smp_processor_id+0x17/0x20
> [19579.274103] [<ffffffff8108ec2a>] ? unpin_current_cpu+0x1a/0x80
> [19579.274105] [<ffffffff810b85c4>] ? migrate_enable+0x84/0x160
> [19579.274111] [<ffffffffa00aa04c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
> [19579.274114] [<ffffffff811f8ae8>] do_vfs_ioctl+0x2c8/0x4c0
> [19579.274116] [<ffffffff81203902>] ? __fget+0x72/0xb0
> [19579.274117] [<ffffffff811f8d61>] SyS_ioctl+0x81/0xa0
> [19579.274118] [<ffffffff817179de>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #4/Slot #1 Chipset: "KAVERI" (ChipID = 0x130c):
>
> [21721.088164] INFO: task Xorg:7436 blocked for more than 120 seconds.
> [21721.100625] Not tainted 3.18.9-rt4 #26
> [21721.109150] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [21721.124795] Xorg D ffffffff816b7f88 0 7436 7430 0x10400004
> [21721.138897] ffff880409f278e8 0000000000000002 ffff88041e90c460 000000000000c5c8
> [21721.138898] ffff88041e90c5c8 0000000000000006 000000000000c5a8 000000000000c5c8
> [21721.138899] ffff8804177299c0 ffff880409f299c0 000000000000000c ffff880409f299c0
> [21721.183222] Call Trace:
> [21721.188110] [<ffffffff816b50f4>] schedule+0x34/0xa0
> [21721.188112] [<ffffffff816b72f4>] schedule_timeout+0x204/0x270
> [21721.188143] [<ffffffffa00cd826>] ? radeon_fence_process+0x16/0x40 [radeon]
> [21721.188153] [<ffffffffa00cd894>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [21721.188163] [<ffffffffa00cdb07>] radeon_fence_wait_seq_timeout.constprop.7+0x227/0x330 [radeon]
> [21721.188165] [<ffffffff810ac310>] ? prepare_to_wait_event+0x110/0x110
> [21721.188176] [<ffffffffa00cdf67>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [21721.188193] [<ffffffffa01432af>] radeon_sa_bo_new+0x2cf/0x4e0 [radeon]
> [21721.188196] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [21721.188210] [<ffffffffa019d3e7>] radeon_ib_get+0x37/0xf0 [radeon]
> [21721.188223] [<ffffffffa00e997d>] radeon_cs_ioctl+0x22d/0x820 [radeon]
> [21721.188233] [<ffffffffa001bc04>] drm_ioctl+0x1a4/0x630 [drm]
> [21721.188236] [<ffffffff8133c2a7>] ? debug_smp_processor_id+0x17/0x20
> [21721.188238] [<ffffffff8106e8da>] ? unpin_current_cpu+0x1a/0x70
> [21721.188240] [<ffffffff81097440>] ? migrate_enable+0xb0/0x1b0
> [21721.188248] [<ffffffffa00b004b>] radeon_drm_ioctl+0x4b/0x80 [radeon]
> [21721.188250] [<ffffffff811c7040>] do_vfs_ioctl+0x2e0/0x4d0
> [21721.188252] [<ffffffff811d1aa2>] ? __fget+0x72/0xa0
> [21721.188254] [<ffffffff811c72b1>] SyS_ioctl+0x81/0xa0
> [21721.188255] [<ffffffff816b8cb2>] tracesys_phase2+0xd4/0xd9
>
>
> Rack #c/Slot #5 Chipset: "ATI Radeon HD 5800 Series" (ChipID = 0x6898)
>
> [19711.965733] INFO: task kworker/u24:13:197 blocked for more than 120 seconds.
> [19711.965737] Not tainted 3.18.9-rt4 #26
> [19711.965749] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19711.965751] kworker/u24:13 D ffff88032901a560 0 197 2 0x10000000
> [19711.965784] Workqueue: radeon-crtc radeon_flip_work_func [radeon]
> [19711.965788] ffff880328b3bc58 0000000000000002 000000000001d65e 0000000000000000
> [19711.965789] ffff880328b3bfd8 000000000008a5c0 ffff880328b3bc78 ffffffffa0482589
> [19711.965791] ffff88032fa81920 ffff880328b30000 ffff88032c63d5f0 ffff880328b30000
> [19711.965794] Call Trace:
> [19711.965813] [<ffffffffa0482589>] ? radeon_fence_activity+0x160/0x172 [radeon]
> [19711.965818] [<ffffffff814e0d38>] schedule+0x7e/0x90
> [19711.965820] [<ffffffff814e2143>] schedule_timeout+0x25/0xd3
> [19711.965835] [<ffffffffa0482ba3>] ? radeon_fence_any_seq_signaled+0x52/0x69 [radeon]
> [19711.965850] [<ffffffffa0482d8d>] radeon_fence_wait_seq_timeout.constprop.6+0x1d3/0x2be [radeon]
> [19711.965853] [<ffffffff81066166>] ? __wake_up_sync+0x12/0x12
> [19711.965869] [<ffffffffa04830e1>] radeon_fence_wait+0x92/0xaa [radeon]
> [19711.965886] [<ffffffffa048dae1>] radeon_flip_work_func+0x11e/0x14f [radeon]
> [19711.965889] [<ffffffff8104cac1>] process_one_work+0x16e/0x2ae
> [19711.965891] [<ffffffff8104d0fe>] worker_thread+0x1df/0x2ca
> [19711.965892] [<ffffffff8104cf1f>] ? cancel_delayed_work+0x91/0x91
> [19711.965894] [<ffffffff8104cf1f>] ? cancel_delayed_work+0x91/0x91
> [19711.965895] [<ffffffff81051324>] kthread+0xae/0xb6
> [19711.965897] [<ffffffff81051276>] ? __kthread_parkme+0x61/0x61
> [19711.965899] [<ffffffff814e322c>] ret_from_fork+0x7c/0xb0
> [19711.965901] [<ffffffff81051276>] ? __kthread_parkme+0x61/0x61
> [19711.965916] INFO: task compiz:2626 blocked for more than 120 seconds.
> [19711.965929] Not tainted 3.18.9-rt4 #26
> [19711.965931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [19711.965932] compiz D ffff88032901a560 0 2626 2186 0x30020000
> [19711.965937] ffff8800b8ee7bc8 0000000000200002 ffff88032bb9e480 0000000000000000
> [19711.965942] ffff8800b8ee7fd8 000000000008a5c0 0000000000000000 ffff8800b8ee7ee0
> [19711.965951] ffffffff81a25450 ffff88032bb9e480 ffff8800b8ee7c28 ffff88032bb9e480
> [19711.965954] Call Trace:
> [19711.965958] [<ffffffff814e0d38>] schedule+0x7e/0x90
> [19711.965959] [<ffffffff814e1ab7>] __rt_mutex_slowlock+0x9f/0xdc
> [19711.965961] [<ffffffff814e1f7b>] rt_mutex_slowlock+0x123/0x236
> [19711.965964] [<ffffffff8106b234>] rt_mutex_fastlock.constprop.24+0x2e/0x30
> [19711.965965] [<ffffffff814e2103>] rt_mutex_lock+0x13/0x15
> [19711.965967] [<ffffffff8106b613>] __rt_down_read.isra.1+0x29/0x30
> [19711.965968] [<ffffffff8106b628>] rt_down_read+0xe/0x10
> [19711.965988] [<ffffffffa04942ff>] radeon_gem_create_ioctl+0x2c/0xc6 [radeon]
> [19711.965990] [<ffffffff812004f9>] ? avc_has_perm_noaudit+0xf7/0x109
> [19711.966004] [<ffffffffa010bc26>] drm_ioctl+0x380/0x3f8 [drm]
> [19711.966025] [<ffffffffa04942d3>] ? radeon_gem_pwrite_ioctl+0x28/0x28 [radeon]
> [19711.966027] [<ffffffff81200ca6>] ? inode_has_perm+0x2f/0x34
> [19711.966029] [<ffffffff81200e58>] ? file_has_perm+0x5d/0x81
> [19711.966040] [<ffffffffa046e00e>] radeon_drm_ioctl+0xe/0x10 [radeon]
> [19711.966067] [<ffffffffa0518b9c>] radeon_kms_compat_ioctl+0x1b/0x1f [radeon]
> [19711.966070] [<ffffffff8115e692>] compat_SyS_ioctl+0x1c3/0xf6e
> [19711.966072] [<ffffffff8100e7b1>] ? syscall_trace_enter+0x52/0x57
> [19711.966074] [<ffffffff814e5679>] ia32_do_call+0x13/0x13
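
To compare hung-task reports like the ones above across several machines, something along the following lines might help. It is only a minimal sketch: the function name summarize_hung_tasks and the two regular expressions are assumptions based on the message layout shown in this mail, not part of any kernel tooling. It collects, per blocked task, the stack frames that are not marked with "?" (the "?" entries are addresses found on the stack that the unwinder does not consider reliable).

```python
import re

# Matches e.g. "INFO: task kworker/u24:6:267 blocked for more than 120 seconds."
# The task name itself may contain ':' so the PID is the last ':<digits>' part.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# Matches e.g. "[<ffffffff81721ce4>] ? schedule+0x34/0xa0"; the "? " is optional.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text.

    Each dict holds the task name, its PID, the timeout in seconds and
    the list of reliable stack frames (innermost first).
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        hung = HUNG_RE.search(line)
        if hung:
            current = {
                "comm": hung.group("comm"),
                "pid": int(hung.group("pid")),
                "secs": int(hung.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Skip frames seen before the first report and frames marked with "?".
        if frame and current is not None and not frame.group("unreliable"):
            current["frames"].append(frame.group("sym"))
    return reports
```

Fed with the saved output of dmesg, the first radeon_* entry in each "frames" list shows where the task is waiting, e.g. radeon_fence_wait_seq_timeout in the traces above.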
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer