[PATCH 5/5] drm/amd/sched: signal and free remaining fences in amd_sched_entity_fini

Michel Dänzer michel at daenzer.net
Thu Oct 12 16:49:35 UTC 2017


On 12/10/17 01:00 PM, Michel Dänzer wrote:
> 
> [0] I also got this, but I don't know yet if it's related:

No, that seems to be a separate issue; I can still reproduce it with the
huge-page-related changes reverted. Unfortunately, it doesn't happen
reliably on every piglit run.

Even before your changes this morning, there was already another hang,
which doesn't happen every time and produces no corresponding dmesg
output.

Lots of "fun" in amd-staging-drm-next...


>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000220
>  IP: amdgpu_vm_bo_invalidate+0x88/0x210 [amdgpu]
>  PGD 0 
>  P4D 0 
>  
>  Oops: 0000 [#1] SMP
>  Modules linked in: cpufreq_powersave cpufreq_userspace cpufreq_conservative amdkfd(O) edac_mce_amd kvm amdgpu(O) irqbypass crct10dif_pclmul crc32_pclmul chash snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic snd_hda_codec_hdmi pcbc binfmt_misc ttm(O) efi_pstore snd_hda_intel drm_kms_helper(O) snd_hda_codec nls_ascii drm(O) snd_hda_core nls_cp437 i2c_algo_bit aesni_intel snd_hwdep fb_sys_fops aes_x86_64 crypto_simd vfat syscopyarea glue_helper sysfillrect snd_pcm fat sysimgblt sp5100_tco wmi_bmof ppdev r8169 snd_timer cryptd pcspkr efivars mfd_core mii ccp i2c_piix4 snd soundcore rng_core sg wmi parport_pc parport i2c_designware_platform i2c_designware_core button acpi_cpufreq tcp_bbr sch_fq sunrpc nct6775 hwmon_vid efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache
>   jbd2 fscrypto raid10 raid1 raid0 multipath linear md_mod dm_mod sd_mod evdev hid_generic usbhid hid crc32c_intel ahci libahci xhci_pci libata xhci_hcd scsi_mod usbcore shpchp gpio_amdpt gpio_generic
>  CPU: 13 PID: 1075 Comm: max-texture-siz Tainted: G        W  O    4.13.0-rc5+ #28
>  Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.80 09/13/2017
>  task: ffff9d2982c75a00 task.stack: ffffb2744e9bc000
>  RIP: 0010:amdgpu_vm_bo_invalidate+0x88/0x210 [amdgpu]
>  RSP: 0018:ffffb2744e9bf6e8 EFLAGS: 00010202
>  RAX: 0000000000000000 RBX: ffff9d2848642820 RCX: ffff9d28c77fdae0
>  RDX: 0000000000000001 RSI: ffff9d28c77fd800 RDI: ffff9d288f286008
>  RBP: ffffb2744e9bf728 R08: 000000ffffffffff R09: 0000000000000000
>  R10: 0000000000000078 R11: ffff9d298ba170a0 R12: ffff9d28c77fd800
>  R13: 0000000000000001 R14: ffff9d288f286000 R15: ffff9d2848642800
>  FS:  00007f809fc5c300(0000) GS:ffff9d298e940000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000220 CR3: 000000030e05a000 CR4: 00000000003406e0
>  Call Trace:
>   amdgpu_bo_move_notify+0x42/0xd0 [amdgpu]
>   ttm_bo_unmap_virtual_locked+0x298/0xac0 [ttm]
>   ? ttm_bo_mem_space+0x391/0x580 [ttm]
>   ttm_bo_unmap_virtual_locked+0x737/0xac0 [ttm]
>   ttm_bo_unmap_virtual_locked+0xa6f/0xac0 [ttm]
>   ttm_bo_mem_space+0x306/0x580 [ttm]
>   ttm_bo_validate+0xd4/0x150 [ttm]
>   ttm_bo_init_reserved+0x22e/0x440 [ttm]
>   amdgpu_ttm_placement_from_domain+0x33c/0x580 [amdgpu]
>   ? amdgpu_fill_buffer+0x300/0x420 [amdgpu]
>   amdgpu_bo_create+0x50/0x2b0 [amdgpu]
>   amdgpu_gem_object_create+0x9f/0x110 [amdgpu]
>   amdgpu_gem_create_ioctl+0x12f/0x270 [amdgpu]
>   ? amdgpu_gem_object_close+0x210/0x210 [amdgpu]
>   drm_ioctl_kernel+0x5d/0xf0 [drm]
>   drm_ioctl+0x32a/0x630 [drm]
>   ? amdgpu_gem_object_close+0x210/0x210 [amdgpu]
>   ? lru_cache_add_active_or_unevictable+0x36/0xb0
>   ? __handle_mm_fault+0x90d/0xff0
>   amdgpu_drm_ioctl+0x4f/0x1c20 [amdgpu]
>   do_vfs_ioctl+0xa5/0x600
>   ? handle_mm_fault+0xd8/0x230
>   ? __do_page_fault+0x267/0x4c0
>   SyS_ioctl+0x79/0x90
>   entry_SYSCALL_64_fastpath+0x1e/0xa9
>  RIP: 0033:0x7f809c8f3dc7
>  RSP: 002b:00007ffcc8c485f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>  RAX: ffffffffffffffda RBX: 00007f809cbaab00 RCX: 00007f809c8f3dc7
>  RDX: 00007ffcc8c48640 RSI: 00000000c0206440 RDI: 0000000000000006
>  RBP: 0000000040000010 R08: 00007f809cbaabe8 R09: 0000000000000060
>  R10: 0000000000000004 R11: 0000000000000246 R12: 0000000040001000
>  R13: 00007f809cbaab58 R14: 0000000000001000 R15: 00007f809cbaab00
>  Code: 49 8b 47 10 48 39 45 d0 4c 8d 78 f0 0f 84 87 00 00 00 4d 8b 37 45 84 ed 41 c6 47 30 01 49 8d 5f 20 49 8d 7e 08 74 19 49 8b 46 58 <48> 8b 80 20 02 00 00 49 39 84 24 20 02 00 00 0f 84 ab 00 00 00 
>  RIP: amdgpu_vm_bo_invalidate+0x88/0x210 [amdgpu] RSP: ffffb2744e9bf6e8
>  CR2: 0000000000000220
> 
> 


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

