[Bug 106942] X freezes with Ubuntu kernel 4.15.0-23-generic (AMDGPU)

bugzilla-daemon at freedesktop.org
Sun Jun 17 14:56:22 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=106942

            Bug ID: 106942
           Summary: X freezes with Ubuntu kernel 4.15.0-23-generic
                    (AMDGPU)
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
               URL: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777245
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: rhialto at falu.nl

This is on my parents' machine; I only have remote access, and not continuously,
so providing additional information may be slow.

Since the latest kernel upgrade, from
 Linux version 4.13.0-43-generic (buildd at lgw01-amd64-026) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3.2)) #48-Ubuntu SMP Wed May 16 12:18:48 UTC 2018 (Ubuntu 4.13.0-43.48-generic 4.13.16)
to
 Linux version 4.15.0-23-generic (buildd at lgw01-amd64-055) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 (Ubuntu 4.15.0-23.25-generic 4.15.18)
the machine appears to freeze soon after booting. Only X is affected; the
machine is still reachable via ssh.

I filed this bug (on Ubuntu Launchpad) using apport-cli while running the older
(working) kernel, not the newer (failing) one.

In /var/log/kernel I can see the following:

Jun 15 23:26:23 xa-xubu kernel: [ 2417.562386] INFO: task Xorg:757 blocked for more than 120 seconds.
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562396] Not tainted 4.15.0-23-generic #25-Ubuntu
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562399] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562403] Xorg D 0 757 724 0x00400004
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562408] Call Trace:
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562424] __schedule+0x297/0x8b0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562430] ? __kfifo_in+0x37/0x50
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562434] schedule+0x2c/0x80
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562559] amd_sched_entity_push_job+0xad/0xf0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562565] ? wait_woken+0x80/0x80
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562653] amdgpu_job_submit+0x9f/0xc0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562723] amdgpu_vm_bo_update_mapping+0x389/0x3f0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562793] ? amdgpu_vm_it_iter_first+0x40/0x40 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562863] amdgpu_vm_bo_update+0x325/0x5b0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562930] amdgpu_gem_va_ioctl+0x524/0x540 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.562962] ? drm_gem_handle_create_tail+0x120/0x190 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563028] ? amdgpu_gem_create_ioctl+0xc1/0x270 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563096] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563115] drm_ioctl_kernel+0x5f/0xb0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563134] ? drm_ioctl_kernel+0x5f/0xb0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563154] drm_ioctl+0x31b/0x3d0 [drm]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563220] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563225] ? update_load_avg+0x57f/0x6e0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563231] ? futex_wake+0x8f/0x180
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563290] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563296] do_vfs_ioctl+0xa8/0x630
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563300] ? __schedule+0x29f/0x8b0
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563304] SyS_ioctl+0x79/0x90
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563309] do_syscall_64+0x73/0x130
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563313] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563317] RIP: 0033:0x7fbd7ddcf5d7
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563319] RSP: 002b:00007fff67e69aa8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563322] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fbd7ddcf5d7
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563324] RDX: 00007fff67e69af0 RSI: 00000000c0286448 RDI: 000000000000000e
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563326] RBP: 00007fff67e69af0 R08: 0000000101440000 R09: 000000000000000a
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563328] R10: 0000000000000039 R11: 0000000000003246 R12: 00000000c0286448
Jun 15 23:26:23 xa-xubu kernel: [ 2417.563330] R13: 000000000000000e R14: 000055e820965f20 R15: 0

which seems to point to the in-kernel AMD GPU driver (amdgpu): the Xorg task
is blocked in amd_sched_entity_push_job while handling an ioctl.
Since the problem seems to disappear when switching back to the previous
kernel, I filed this as a kernel bug.
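
As a cross-check on the call trace, the ioctl request number in RSI
(0xc0286448) can be decoded by hand. The following is a minimal sketch, assuming
the generic Linux _IOC bit layout used on x86-64 (8 bits nr, 8 bits type,
14 bits size, 2 bits direction). The DRM constants are assumptions taken from
the kernel UAPI headers (drm.h and amdgpu_drm.h); they do not appear in the
log itself.

```python
# Decode a Linux ioctl request number using the generic _IOC layout
# (asm-generic/ioctl.h, as used on x86-64): nr:8, type:8, size:14, dir:2.
# The DRM constants below are assumptions taken from the kernel UAPI
# headers (drm.h, amdgpu_drm.h), not values printed in the kernel log.

DRM_IOCTL_TYPE = ord("d")   # DRM_IOCTL_BASE
DRM_COMMAND_BASE = 0x40     # first driver-specific DRM command number
DRM_AMDGPU_GEM_VA = 0x08    # amdgpu command index for GEM_VA

IOC_WRITE = 1               # userspace writes data to the kernel
IOC_READ = 2                # userspace reads data back from the kernel


def decode_ioctl(request):
    """Split a 32-bit ioctl request number into its four fields."""
    return {
        "nr": request & 0xFF,
        "type": (request >> 8) & 0xFF,
        "size": (request >> 16) & 0x3FFF,
        "dir": (request >> 30) & 0x3,
    }


fields = decode_ioctl(0xC0286448)  # RSI from the register dump
is_drm = fields["type"] == DRM_IOCTL_TYPE
driver_nr = fields["nr"] - DRM_COMMAND_BASE

print("type:", chr(fields["type"]), "(DRM)" if is_drm else "(not DRM)")
print("direction:", fields["dir"], "size:", fields["size"], "bytes")
print("driver command: 0x%02x" % driver_nr)
```

Under those assumptions the request decodes to a DRM ioctl (type 'd') with a
40-byte read/write argument and driver command 0x08, which is consistent with
amdgpu_gem_va_ioctl appearing in the call trace.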

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-23-generic 4.15.0-23.25
ProcVersionSignature: Ubuntu 4.13.0-43.48-generic 4.13.16
Uname: Linux 4.13.0-43-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: rhialto 21222 F.... pulseaudio
 /dev/snd/controlC0: rhialto 21222 F.... pulseaudio
Date: Sat Jun 16 16:03:46 2018
InstallationDate: Installed on 2017-10-29 (230 days ago)
InstallationMedia: Xubuntu 17.10 "Artful Aardvark" - Release amd64 (20171017.1)
IwConfig:
 enp1s0 no wireless extensions.

 lo no wireless extensions.
MachineType: LENOVO 90G9001RNY
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.13.0-43-generic.efi.signed
root=UUID=103b739d-56cf-440b-a2b4-fc955e1a0a41 ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.13.0-43-generic N/A
 linux-backports-modules-4.13.0-43-generic N/A
 linux-firmware 1.173.1
RfKill:
 0: hci0: Bluetooth
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-06-09 (7 days ago)
dmi.bios.date: 12/29/2016
dmi.bios.vendor: LENOVO
dmi.bios.version: O2HKT24A
dmi.board.name: Jadeite CRB
dmi.board.vendor: LENOVO
dmi.board.version: SDK0J40700 WIN 3258076524150
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias:
dmi:bvnLENOVO:bvrO2HKT24A:bd12/29/2016:svnLENOVO:pn90G9001RNY:pvrideacentre310S-08ASR:rvnLENOVO:rnJadeiteCRB:rvrSDK0J40700WIN3258076524150:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: ideacentre 310S-08ASR
dmi.product.name: 90G9001RNY
dmi.product.version: ideacentre 310S-08ASR
dmi.sys.vendor: LENOVO

Excerpt from lspci, showing the graphics hardware and IOMMU:

00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Device [1022:1577]
        Subsystem: Lenovo Device [17aa:364f]
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 24
        Capabilities: <access denied>

00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:98e4] (rev c8) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device [17aa:364f]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 225
        Region 0: Memory at e8000000 (64-bit, prefetchable) [size=128M]
        Region 2: Memory at f0000000 (64-bit, prefetchable) [size=8M]
        Region 4: I/O ports at f000 [size=256]
        Region 5: Memory at feb00000 (32-bit, non-prefetchable) [size=256K]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: amdgpu
        Kernel modules: amdgpu


For more detailed info, such as the full output from lspci, see the attachments
at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1777245
I filed the bug there first (because the bug occurred after a kernel update)
but I am echoing it here since this may be a more targeted audience.
