RIP: 0010:radeon_vm_fini+0x15/0x220 [radeon]

Christian König christian.koenig at amd.com
Mon Jan 17 09:11:22 UTC 2022



On 17.01.22 at 09:42, Jan Stancek wrote:
> On Mon, Jan 17, 2022 at 08:16:09AM +0100, Christian König wrote:
>> Hi Borislav,
>>
>> On 15.01.22 at 17:11, Borislav Petkov wrote:
>>> Hi folks,
>>>
>>> so this is a *very* old K8 laptop - yap, you read it right, family 0xf.
>>>
>>> [   31.353032] powernow_k8: fid 0xa (1800 MHz), vid 0xa
>>> [   31.353569] powernow_k8: fid 0x8 (1600 MHz), vid 0xc
>>> [   31.354081] powernow_k8: fid 0x0 (800 MHz), vid 0x16
>>> [   31.354844] powernow_k8: Found 1 AMD Turion(tm) 64 Mobile Technology MT-34 (1 cpu cores) (version 2.20.00)
>>>
>>> This is a true story.
>>
>> well, that hardware is ancient ^^.
>>
>> Interesting to see that even that old stuff is still used.
>>
>>> Anyway, it blows up, see below.
>>>
>>> Kernel is latest Linus tree, top commit is:
>>>
>>> a33f5c380c4b ("Merge tag 'xfs-5.17-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux")
>>>
>>> I can bisect if you don't see it immediately why it blows up.
>>
>> What I see immediately is that code is being called which isn't meant
>> for this hardware generation.
>>
>> This is extremely odd because it means that either we recently added
>> a logic bug or the detection of the hardware generation no longer
>> works as expected.
>>
>> Please bisect,
>> Christian.
>
> I'm seeing panics like this one as well on multiple systems in the lab
> (e.g. ProLiant SL390s G7, PowerEdge R805). This looks the same as what
> Bruno reported here:
>  https://lore.kernel.org/all/CA+QYu4rt2VHWzbOt-SegA9yABqC-D36PoqTZmy6DscWvp+6ZMQ@mail.gmail.com/
>
>
> It started around 8d0749b4f83b - Merge tag 'drm-next-2022-01-07', 
> running a bisect atm.

Not necessary any more. That is probably caused by commit
ab50cb9df8896b39aae65c537a30de2c79c19735 ("drm/radeon/radeon_kms: Fix a
NULL pointer dereference in radeon_driver_open_kms()").

I'm getting other bug reports for that one as well. Going to take a look.

Regards,
Christian.
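
The register state in the oops quoted below is consistent with a list walk
that starts from a zeroed, never-initialized list head: RBX and CR2 are
both 0, the faulting instruction appears to load from [rbx], and R15 is
0xffffffffffffffa8 (that is, 0 - 0x58). What follows is a minimal
user-space sketch of that failure mode, under the assumption that the
teardown path ran on a structure that had only been zero-filled. The names
demo_vm and demo_list_* are hypothetical stand-ins, not the radeon code,
and the guard is for illustration only, not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical per-file VM object; NOT the real struct radeon_vm. */
struct demo_vm {
    struct list_head mappings;
};

/* An initialized, empty list head points at itself. */
static void demo_list_init(struct list_head *h)
{
    assert(h != NULL);
    h->next = h;
    h->prev = h;
}

/* Same comparison an "is this list empty" check performs. */
static bool demo_list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* True when the head was only zero-filled and never initialized. */
static bool demo_list_uninitialized(const struct list_head *h)
{
    return h->next == NULL;
}

/*
 * Count the entries on the list. A zeroed head does not look empty
 * (next != head), so an unguarded walk would follow next == NULL and
 * fault at address 0. The guard returns -1 instead of walking.
 */
static int demo_vm_count(const struct demo_vm *vm)
{
    int n = 0;

    if (demo_list_uninitialized(&vm->mappings))
        return -1;

    for (const struct list_head *p = vm->mappings.next;
         p != &vm->mappings; p = p->next)
        n++;

    return n;
}
```

The point of the sketch is only that a zero-filled head fails the
"empty" comparison, so a teardown loop would not skip it.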

>
> [   15.230105] SGI XFS with ACLs, security attributes, scrub, quota, no debug enabled
> [   15.234816] XFS (sdb1): Mounting V5 Filesystem
> [   15.342261] [drm] ib test succeeded in 0 usecs
> [   15.343311] [drm] No TV DAC info found in BIOS
> [   15.344061] [drm] Radeon Display Connectors
> [   15.344330] [drm] Connector 0:
> [   15.344961] [drm]   VGA-1
> [   15.345174] [drm]   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
> [   15.345991] [drm]   Encoders:
> [   15.346617] [drm]     CRT1: INTERNAL_DAC1
> [   15.346942] [drm] Connector 1:
> [   15.347561] [drm]   VGA-2
> [   15.347746] [drm]   DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
> [   15.348598] [drm]   Encoders:
> [   15.349217] [drm]     CRT2: INTERNAL_DAC2
> [   15.349521] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [   15.349974] #PF: supervisor read access in kernel mode
> [   15.350305] #PF: error_code(0x0000) - not-present page
> [   15.350675] PGD 0 P4D 0
> [   15.350814] Oops: 0000 [#
> [   15.431048] CPU: 0 PID: 410 Comm: systemd-udevd Tainted: G I       5.16.0 #1
> [   15.443401] XFS (sdb1): Ending clean mount
> [   15.451541] Hardware name: HP ProLiant SL390s G7/, BIOS P69 07/02/2013
> [   15.451545] RIP: 0010:radeon_vm_fini+0x174/0x300 [radeon]
> [   15.452689] Code: e8 74 cc 7a c1 eb d1 4c 8b 24 24 4d 8d 74 24 48 49 8b 5c 24 48 49 39 de 74 38 66 2e 0f 1f 84 00 00 00 00 00 66 90 4c 8d 7b a8 <48> 8b 2b 48 8d 7b 18 e8 30 1e f4 ff 48 83 c3 c0 48 89 df e8 34 f3
> [   15.454412] RSP: 0018:ffffa3494800001 R08: 0000000000200000 R09: 0000000000000000
> [   15.533944] R10: 0000000000000000 R11: ffffffffc04f7810 R12: ffff979b4ba46730
> [   15.533945] R13: ffff979d5c260000 R14: ffff979b4ba46778 R15: ffffffffffffffa8
> [   15.533947] FS: 00007f3a13141500(0000) GS:ffff979d4ba00000(0000) knlGS:0000000000000000
> [   15.533948] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   15.533950] CR2: 0000000000000000 CR3: 000000031c7fc005 CR4: 00000000000206f0
> [   15.533952] Call Trace:
> [   15.533956]  <TASK>
> [   15.533959]  radeon_driver_open_kms+0x118/0x180 [radeon]
> [   15.533998]  drm_file_alloc+0x1a8/0x230 [drm]
> [   15.961755]  drm_client_init+0x99/0x130 [drm]
> [   15.961777]  drm_fb_helper_init+0x32/0x50 [drm_kms_helper]
> [   15.961809]  radeon_fbdev_init+0xbc/0x110 [radeon]
> [   15.963653]  radeon_modeset_init+0x857/0x9e0 [radeon]
> [   15.964003]  radeon_driver_load_kms+0x19b/0x290 [radeon]
> [   15.964474]  drm_dev_register+0xf5/0x2d0 [drm]
> [   15.965196]  radeon_pci_probe+0xc3/0x120 [radeon]
> [   15.965972]  pci_device_probe+0x185/0x220
> [   15.966225]  call_driver_probe+0x32/0xd0
> [   15.966505]  really_probe+0x157/0x380
> [   15.99bus_add_driver+0x111/0x210
> [   16.467150]  ? 0xffffffffc0412000
> [   16.467805]  driver_register+0x81/0x120
> [   16.468069]  do_one_initcall+0xb0/0x290
> [   16.468359]  ? down_write+0xe/0x40
> [   16.469008]  ? kernfs_activate+0x28/0x130
> [   16.469267]  ? kernfs_add_one+0x1c8/0x210
> [   16.469563]  ? vunmap_p4d_range+0x3dc/0x420
> [   16.469858]  ? __vunmap+0x1df/0x2a0
> [   16.470466]  ? kmem_cache_alloc_trace+0x1a4/0x330
> [   16.471224]  ? do_init_module+0x24/0x230
> [   16.471485]  do_init_module+0x5a/0x230
> [   16.471779]  load_module+0x145f/0x1630
> [   16.472022]  ? kernel_read_file_from_fd+0x5d/0x80
> [   16.472762]  __se_sys_finit_module+0x9f/0xd0
> [   16.473480]  do_syscall_64+0x43/0x90
> [   16.473778]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   16.474123] RIP: 0033:0x7f3a13d11e2d
> [   16.474422] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bb 7f 0e 00 f7 d8 64 89 01 48
> [   16.476010] RSP: 002b:00007fff9cb92b78 EFLAGS: 00000246 ORIG_RAX: 000000 R08: 0000000000000000 R09: 0000000000000002
> [   16.977414] R10: 0000000000000012 R11: 0000000000000246 R12: 00007f3a13e6d43c
> [   16.978320] R13: 0000555c5eba3080 R14: 0000000000000007 R15: 0000555c5eba3d70
> [   16.979218]  </TASK>
> [   16.979381] Modules linked in: xfs radeon(+) drm_ttm_helper ttm i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel cec ata_generic ghash_clmulni_intel drm serio_raw pata_acpi hpwdt
> [   16.980516] CR2: 0000000000000000
> [   16.981179] ---[ end trace d6f7f573dad76bd2 ]---
> [   16.981861] RIP: 0010:radeon_vm_fini+0x174/0x300 [radeon]
> [   16.982257] Code: e8 74 cc 7a c1 eb d1 4c 8b 24 24 4d 8d 74 24 48 49 8b 5c 24 48 49 39 de 74 38 66 2e 0f 1f 84 00 00 00 00 00 66 90 4c 8d 7b a8 <48> 8b 2b 48 8d 7b 18 e8 30 1e f4 ff 48 83 c3 c0 48 89 df e8 34 f3
> [   16.983766] RSP: 0018:ffffa3494801f8e8 EFLAGS: 00010286
> [   16.984124] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [   16.984981] RDX: 0000000000000001 RSI: ffff979b4ba46730 RDI: ffff979b4ba46750
> [   16.985898] RBP: 0000000000000001 R08: 0000000000200000 R09: 0000000000000000
> [   16.986730] R10: 0000000000000000 R11: ffffffffc04f7810 R12: 0 ES: 0000 CR0: 0000000080050033
> [   17.488057] CR2: 0000000000000000 CR3: 000000031c7fc005 CR4: 00000000000206f0
> [   17.489013] Kernel panic - not syncing: Fatal exception
> [   17.489404] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [   17.490485] ---[ end Kernel panic - not syncing: Fatal exception ]---
>


