amd-staging-drm-next: Oops - BUG: unable to handle kernel NULL pointer dereference, bisected.

Koenig, Christian Christian.Koenig at amd.com
Wed Jan 30 12:42:33 UTC 2019


Does the attached patch fix the issue?

Christian.

On 30.01.19 at 13:06, Christian König wrote:
Sorry, I accidentally replied to the wrong mail.

This is a new issue. Going to take a look now.

Christian.

On 30.01.19 at 13:02, Christian König wrote:
This is a known issue; see also https://bugs.freedesktop.org/show_bug.cgi?id=109487

Christian.

On 30.01.19 at 12:07, Przemek Socha wrote:

Good morning,

After the last pull from the amd-staging-drm-next tree (29th of January), I get
random Oopses on an A6-6310 APU with R4 (Mullins) graphics.

Here is the Oops part of the log taken from pstore:

<1>[   55.166270] BUG: unable to handle kernel NULL pointer dereference at 0000000000000208
<1>[   55.166281] #PF error: [normal kernel read fault]
<6>[   55.166285] PGD 0 P4D 0
<4>[   55.166293] Oops: 0000 [#1] PREEMPT SMP
<4>[   55.166301] CPU: 3 PID: 11006 Comm: kwin_x11:cs0 Not tainted 5.0.0-rc1+ #44
<4>[   55.166305] Hardware name: LENOVO 80E3/Lancer 5B2, BIOS A2CN45WW(V2.13) 08/04/2016
<4>[   55.166320] RIP: 0010:ttm_bo_bulk_move_lru_tail+0xd3/0x188 [ttm]
<4>[   55.166326] Code: 00 4c 8b 0a 48 8b 81 a8 00 00 00 48 81 c1 a8 00 00 00 49 89 02 4c 8b 92 b0 00 00 00 4c 89 50 08 44 89 c0 48 c1 e0 04 4c 01 c8 <4c> 8b 90 08 02 00 00 4d 89 1a 4c 8b 90 08 02 00 00 4c 89 92 b0 00
<4>[   55.166330] RSP: 0018:ffffa8bdc0f33b18 EFLAGS: 00010246
<4>[   55.166335] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9cfa935778f8
<4>[   55.166339] RDX: ffff9cfa950c5050 RSI: 0000000000000070 RDI: ffff9cfa93575dd0
<4>[   55.166342] RBP: ffff9cfa5d44d800 R08: 0000000000000000 R09: 0000000000000000
<4>[   55.166346] R10: ffff9cfa8f7730f8 R11: ffff9cfa950c50f8 R12: ffff9cfa93575dd0
<4>[   55.166350] R13: ffff9cfa93575800 R14: 0000000000000001 R15: ffffffffc03adc10
<4>[   55.166355] FS:  00007fb327fff700(0000) GS:ffff9cfa97b80000(0000) knlGS:0000000000000000
<4>[   55.166359] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   55.166363] CR2: 0000000000000208 CR3: 00000002150f0000 CR4: 00000000000406e0
<4>[   55.166366] Call Trace:
<4>[   55.166477]  amdgpu_vm_move_to_lru_tail+0xe4/0x100 [amdgpu]
<4>[   55.166563]  amdgpu_cs_ioctl+0x14e7/0x1b08 [amdgpu]
<4>[   55.166586]  ? __switch_to_asm+0x40/0x70
<4>[   55.166689]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
<4>[   55.166698]  drm_ioctl_kernel+0xa4/0xe8
<4>[   55.166707]  drm_ioctl+0x1db/0x358
<4>[   55.166805]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
<4>[   55.166901]  amdgpu_drm_ioctl+0x44/0x78 [amdgpu]
<4>[   55.166931]  do_vfs_ioctl+0x9f/0x618
<4>[   55.166940]  ksys_ioctl+0x5b/0x88
<4>[   55.166947]  __x64_sys_ioctl+0x11/0x18
<4>[   55.166955]  do_syscall_64+0x50/0x168
<4>[   55.166963]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4>[   55.166969] RIP: 0033:0x7fb34b035fa7
<4>[   55.166974] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 8d dc 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 ae 0c 00 f7 d8 64 89 01 48
<4>[   55.166978] RSP: 002b:00007fb327ffea88 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[   55.166984] RAX: ffffffffffffffda RBX: 00007fb327ffec58 RCX: 00007fb34b035fa7
<4>[   55.166987] RDX: 00007fb327ffeb10 RSI: 00000000c0186444 RDI: 0000000000000010
<4>[   55.166991] RBP: 00007fb327ffeb10 R08: 00007fb327ffec80 R09: 00007fb327ffec58
<4>[   55.166995] R10: 00007fb327ffeca0 R11: 0000000000000246 R12: 00000000c0186444
<4>[   55.166998] R13: 0000000000000010 R14: 000055ecd2705dc0 R15: 0000000000000003
<4>[   55.167004] Modules linked in: rfcomm nf_tables ebtable_nat ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables overlay squashfs loop bnep ipv6 rtsx_usb_ms memstick rtsx_usb_sdmmc rtsx_usb uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media ath3k btusb btintel bluetooth ecdh_generic ath9k ath9k_common kvm_amd ath9k_hw sdhci_pci kvm cqhci irqbypass mac80211 sdhci crc32_pclmul ghash_clmulni_intel ath serio_raw mmc_core cfg80211 amdgpu mfd_core chash gpu_sched xhci_pci ttm xhci_hcd ehci_pci ehci_hcd sp5100_tco
<4>[   55.167063] CR2: 0000000000000208
<4>[   55.167069] ---[ end trace bf1c4be089002236 ]---
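
Note on the trace above: the fault address in CR2 (0x208) equals the displacement encoded in the faulting instruction, which the Code: line marks with <4c>. The following is a minimal sketch in Python, assuming a hand decode of the four instructions ending at the fault (mov %r8d,%eax; shl $0x4,%rax; add %r9,%rax; mov 0x208(%rax),%r10). That decode is an assumption and has not been checked with the kernel's scripts/decodecode; faulting_address is a hypothetical helper name. It replays the address computation with the R08 and R09 values from the register dump.

```python
MASK64 = (1 << 64) - 1

# Values copied from the register dump in the Oops above.
R8 = 0x0000000000000000
R9 = 0x0000000000000000
CR2 = 0x0000000000000208

# disp32 of the faulting instruction "<4c> 8b 90 08 02 00 00" (little-endian).
DISP = int.from_bytes(bytes.fromhex("08020000"), "little")


def faulting_address(r8: int, r9: int, disp: int = DISP) -> int:
    """Replay the assumed address computation (hypothetical helper)."""
    rax = r8 & 0xFFFFFFFF          # mov %r8d,%eax (zero-extends to 64 bits)
    rax = (rax << 4) & MASK64      # shl $0x4,%rax
    rax = (rax + r9) & MASK64      # add %r9,%rax
    return (rax + disp) & MASK64   # mov 0x208(%rax),%r10 reads from here


addr = faulting_address(R8, R9)
print(f"computed fault address: {addr:#x}, CR2: {CR2:#x}")
```

If the decode holds, both the index (R08) and the base pointer (R09) were zero, so the read went to NULL + 0x208, which is consistent with RAX being 0 in the dump.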

I bisected it, and the first bad commit appears to be "drm/amdgpu: cleanup
setting bulk_movable". I hope this is relevant.

Full git bisect log:

git bisect start
# good: [10117450735c7a7c0858095fb46a860e7037cb9a] drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines
git bisect good 10117450735c7a7c0858095fb46a860e7037cb9a
# bad: [b9c6252b7f980e7e03c0bf659a251798b36a8094] Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"
git bisect bad b9c6252b7f980e7e03c0bf659a251798b36a8094
# good: [1de29da5b7281c9a8427d84948bf3d77bc4b8d16] drm: disable uncached DMA optimization for ARM and arm64
git bisect good 1de29da5b7281c9a8427d84948bf3d77bc4b8d16
# good: [bbf48cae572b39c4df6023b01d6f8de66ef41b34] Revert "test patch for hpd dpms check"
git bisect good bbf48cae572b39c4df6023b01d6f8de66ef41b34
# good: [257b75d373c77d6792d0011f7379398ba60799ec] drm/amdgpu: Show XGMI node and hive message per device only once
git bisect good 257b75d373c77d6792d0011f7379398ba60799ec
# good: [4d771657c533d8fe3b574c561084f66aebc77bb6] drm/amdgpu: cleanup amdgpu_pte_update_params
git bisect good 4d771657c533d8fe3b574c561084f66aebc77bb6
# bad: [4ef27005fefd4be102010b7d8552fec1ee13435a] drm/amdgpu: cleanup setting bulk_movable
git bisect bad 4ef27005fefd4be102010b7d8552fec1ee13435a
# first bad commit: [4ef27005fefd4be102010b7d8552fec1ee13435a] drm/amdgpu: cleanup setting bulk_movable

4ef27005fefd4be102010b7d8552fec1ee13435a is the first bad commit
commit 4ef27005fefd4be102010b7d8552fec1ee13435a
Author: Christian König <christian.koenig at amd.com>
Date:   Mon Jan 28 13:41:58 2019 +0100

    drm/amdgpu: cleanup setting bulk_movable

    We only need to set this to false now when BOs are removed from the LRU.

    Signed-off-by: Christian König <christian.koenig at amd.com>
    Reviewed-by: Chunming Zhou <david1.zhou at amd.com>
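
Note on the bisect log above: the result can be cross-checked mechanically. The following is a minimal sketch in Python, assuming only the "git bisect good/bad" commands and the final "first bad commit" comment from the log are kept; summarize_bisect is a hypothetical helper name, not part of git. It confirms that the last commit marked bad is the one reported as the first bad commit.

```python
import re

# Commands and final comment copied from the bisect log above.
BISECT_LOG = """\
git bisect start
git bisect good 10117450735c7a7c0858095fb46a860e7037cb9a
git bisect bad b9c6252b7f980e7e03c0bf659a251798b36a8094
git bisect good 1de29da5b7281c9a8427d84948bf3d77bc4b8d16
git bisect good bbf48cae572b39c4df6023b01d6f8de66ef41b34
git bisect good 257b75d373c77d6792d0011f7379398ba60799ec
git bisect good 4d771657c533d8fe3b574c561084f66aebc77bb6
git bisect bad 4ef27005fefd4be102010b7d8552fec1ee13435a
# first bad commit: [4ef27005fefd4be102010b7d8552fec1ee13435a] drm/amdgpu: cleanup setting bulk_movable
"""


def summarize_bisect(log: str):
    """Return (verdicts, first_bad_sha, first_bad_subject) from a bisect log."""
    # Each verdict is a (good|bad, 40-hex-digit sha) tuple, in log order.
    verdicts = re.findall(r"^git bisect (good|bad) ([0-9a-f]{40})$", log, re.M)
    first_bad = re.search(
        r"^# first bad commit: \[([0-9a-f]{40})\] (.*)$", log, re.M
    )
    return verdicts, first_bad.group(1), first_bad.group(2)


verdicts, sha, subject = summarize_bisect(BISECT_LOG)
last_bad = [s for verdict, s in verdicts if verdict == "bad"][-1]
print(f"{len(verdicts)} steps, first bad commit {sha[:12]} ({subject})")
print("consistent with last 'bad' verdict:", last_bad == sha)
```

A log saved to a file can also be re-run on another checkout with "git bisect replay <file>".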

If any other information is needed, please do not hesitate to ask.

Thanks,
Przemek.




-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-amdgpu-partial-revert-cleanup-setting-bulk_movab.patch
Type: text/x-patch
Size: 1077 bytes
Desc: 0001-drm-amdgpu-partial-revert-cleanup-setting-bulk_movab.patch
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20190130/27d5eabd/attachment-0001.bin>

