[PATCH v6 00/22] Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers

Robin Murphy robin.murphy at arm.com
Tue Jun 28 12:31:02 UTC 2022


On 2022-05-27 00:50, Dmitry Osipenko wrote:
> Hello,
> 
> This patchset introduces memory shrinker for the VirtIO-GPU DRM driver
> and adds memory purging and eviction support to VirtIO-GPU driver.
> 
> The new dma-buf locking convention is introduced here as well.
> 
> During OOM, the shrinker will release BOs that are marked as "not needed"
> by userspace using the new madvise IOCTL, it will also evict idling BOs
> to SWAP. The userspace in this case is the Mesa VirGL driver, it will mark
> the cached BOs as "not needed", allowing kernel driver to release memory
> of the cached shmem BOs on lowmem situations, preventing OOM kills.
> 
> The Panfrost driver is switched to use generic memory shrinker.

I think we still have some outstanding issues here - Alyssa reported 
some weirdness yesterday, so I just tried provoking a low-memory 
condition locally with this series applied and a few debug options 
enabled, and the results as below were... interesting.

Thanks,
Robin.

----->8-----
[   68.295951] ======================================================
[   68.295956] WARNING: possible circular locking dependency detected
[   68.295963] 5.19.0-rc3+ #400 Not tainted
[   68.295972] ------------------------------------------------------
[   68.295977] cc1/295 is trying to acquire lock:
[   68.295986] ffff000008d7f1a0 
(reservation_ww_class_mutex){+.+.}-{3:3}, at: drm_gem_shmem_free+0x7c/0x198
[   68.296036]
[   68.296036] but task is already holding lock:
[   68.296041] ffff80000c14b820 (fs_reclaim){+.+.}-{0:0}, at: 
__alloc_pages_slowpath.constprop.0+0x4d8/0x1470
[   68.296080]
[   68.296080] which lock already depends on the new lock.
[   68.296080]
[   68.296085]
[   68.296085] the existing dependency chain (in reverse order) is:
[   68.296090]
[   68.296090] -> #1 (fs_reclaim){+.+.}-{0:0}:
[   68.296111]        fs_reclaim_acquire+0xb8/0x150
[   68.296130]        dma_resv_lockdep+0x298/0x3fc
[   68.296148]        do_one_initcall+0xe4/0x5f8
[   68.296163]        kernel_init_freeable+0x414/0x49c
[   68.296180]        kernel_init+0x2c/0x148
[   68.296195]        ret_from_fork+0x10/0x20
[   68.296207]
[   68.296207] -> #0 (reservation_ww_class_mutex){+.+.}-{3:3}:
[   68.296229]        __lock_acquire+0x1724/0x2398
[   68.296246]        lock_acquire+0x218/0x5b0
[   68.296260]        __ww_mutex_lock.constprop.0+0x158/0x2378
[   68.296277]        ww_mutex_lock+0x7c/0x4d8
[   68.296291]        drm_gem_shmem_free+0x7c/0x198
[   68.296304]        panfrost_gem_free_object+0x118/0x138
[   68.296318]        drm_gem_object_free+0x40/0x68
[   68.296334]        drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
[   68.296352]        drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
[   68.296368]        do_shrink_slab+0x220/0x808
[   68.296381]        shrink_slab+0x11c/0x408
[   68.296392]        shrink_node+0x6ac/0xb90
[   68.296403]        do_try_to_free_pages+0x1dc/0x8d0
[   68.296416]        try_to_free_pages+0x1ec/0x5b0
[   68.296429]        __alloc_pages_slowpath.constprop.0+0x528/0x1470
[   68.296444]        __alloc_pages+0x4e0/0x5b8
[   68.296455]        __folio_alloc+0x24/0x60
[   68.296467]        vma_alloc_folio+0xb8/0x2f8
[   68.296483]        alloc_zeroed_user_highpage_movable+0x58/0x68
[   68.296498]        __handle_mm_fault+0x918/0x12a8
[   68.296513]        handle_mm_fault+0x130/0x300
[   68.296527]        do_page_fault+0x1d0/0x568
[   68.296539]        do_translation_fault+0xa0/0xb8
[   68.296551]        do_mem_abort+0x68/0xf8
[   68.296562]        el0_da+0x74/0x100
[   68.296572]        el0t_64_sync_handler+0x68/0xc0
[   68.296585]        el0t_64_sync+0x18c/0x190
[   68.296596]
[   68.296596] other info that might help us debug this:
[   68.296596]
[   68.296601]  Possible unsafe locking scenario:
[   68.296601]
[   68.296604]        CPU0                    CPU1
[   68.296608]        ----                    ----
[   68.296612]   lock(fs_reclaim);
[   68.296622] 
lock(reservation_ww_class_mutex);
[   68.296633]                                lock(fs_reclaim);
[   68.296644]   lock(reservation_ww_class_mutex);
[   68.296654]
[   68.296654]  *** DEADLOCK ***
[   68.296654]
[   68.296658] 3 locks held by cc1/295:
[   68.296666]  #0: ffff00000616e898 (&mm->mmap_lock){++++}-{3:3}, at: 
do_page_fault+0x144/0x568
[   68.296702]  #1: ffff80000c14b820 (fs_reclaim){+.+.}-{0:0}, at: 
__alloc_pages_slowpath.constprop.0+0x4d8/0x1470
[   68.296740]  #2: ffff80000c1215b0 (shrinker_rwsem){++++}-{3:3}, at: 
shrink_slab+0xc0/0x408
[   68.296774]
[   68.296774] stack backtrace:
[   68.296780] CPU: 2 PID: 295 Comm: cc1 Not tainted 5.19.0-rc3+ #400
[   68.296794] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Sep  3 2019
[   68.296803] Call trace:
[   68.296808]  dump_backtrace+0x1e4/0x1f0
[   68.296821]  show_stack+0x20/0x70
[   68.296832]  dump_stack_lvl+0x8c/0xb8
[   68.296849]  dump_stack+0x1c/0x38
[   68.296864]  print_circular_bug.isra.0+0x284/0x378
[   68.296881]  check_noncircular+0x1d8/0x1f8
[   68.296896]  __lock_acquire+0x1724/0x2398
[   68.296911]  lock_acquire+0x218/0x5b0
[   68.296926]  __ww_mutex_lock.constprop.0+0x158/0x2378
[   68.296942]  ww_mutex_lock+0x7c/0x4d8
[   68.296956]  drm_gem_shmem_free+0x7c/0x198
[   68.296970]  panfrost_gem_free_object+0x118/0x138
[   68.296984]  drm_gem_object_free+0x40/0x68
[   68.296999]  drm_gem_shmem_shrinker_run_objects_scan+0x42c/0x5b8
[   68.297017]  drm_gem_shmem_shrinker_scan_objects+0xa4/0x170
[   68.297033]  do_shrink_slab+0x220/0x808
[   68.297045]  shrink_slab+0x11c/0x408
[   68.297056]  shrink_node+0x6ac/0xb90
[   68.297068]  do_try_to_free_pages+0x1dc/0x8d0
[   68.297081]  try_to_free_pages+0x1ec/0x5b0
[   68.297094]  __alloc_pages_slowpath.constprop.0+0x528/0x1470
[   68.297110]  __alloc_pages+0x4e0/0x5b8
[   68.297122]  __folio_alloc+0x24/0x60
[   68.297134]  vma_alloc_folio+0xb8/0x2f8
[   68.297148]  alloc_zeroed_user_highpage_movable+0x58/0x68
[   68.297163]  __handle_mm_fault+0x918/0x12a8
[   68.297178]  handle_mm_fault+0x130/0x300
[   68.297193]  do_page_fault+0x1d0/0x568
[   68.297205]  do_translation_fault+0xa0/0xb8
[   68.297218]  do_mem_abort+0x68/0xf8
[   68.297229]  el0_da+0x74/0x100
[   68.297239]  el0t_64_sync_handler+0x68/0xc0
[   68.297252]  el0t_64_sync+0x18c/0x190
[   68.471812] arm-scmi firmware:scmi: timed out in resp(caller: 
scmi_power_state_set+0x11c/0x190)
[   68.501947] arm-scmi firmware:scmi: Message for 119 type 0 is not 
expected!
[   68.939686] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000915e2d34
[   69.739386] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ac77ac55
[   70.415329] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ee980c7e
[   70.987166] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ffb7ff37
[   71.914939] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000000e92b26e
[   72.426987] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000c036a911
[   73.578683] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000001c6fc094
[   74.090555] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000075d00f9
[   74.922709] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000005add546
[   75.434401] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000000154189b
[   76.394300] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ac77ac55
[   76.906236] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ee980c7e
[   79.657234] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000f6d059fb
[   80.168831] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000061a0f6bf
[   80.808354] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000071ade02
[   81.319967] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000b0afea73
[   81.831574] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000d78f36c2
[   82.343160] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000000f689397
[   83.046689] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000412c2a2f
[   83.558352] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000020e551b3
[   84.261913] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000009437aace
[   84.773576] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000001c6fc094
[   85.317275] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000c036a911
[   85.829035] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000000e92b26e
[   86.660555] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000ac77ac55
[   87.172126] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000b940e406
[   87.875846] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000001c6fc094
[   88.387443] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000009437aace
[   89.059175] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000075dadb7f
[   89.570960] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000005add546
[   90.146687] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000cba2873c
[   90.662497] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000a4beb490
[   95.392748] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000005b5fc4ec
[   95.904179] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000a17436ee
[   96.416085] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000003888d2a7
[   96.927874] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000093e04a98
[   97.439742] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000c036a911
[   97.954109] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000084c51113
[   98.467374] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000664663ce
[   98.975192] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=0000000060f2d45c
[   99.487231] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000b29288f8
[   99.998833] panfrost 2d000000.gpu: gpu sched timeout, js=0, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000002f07ab24
[  100.510744] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000008c15c751
[  100.511411] 
==================================================================
[  100.511419] BUG: KASAN: use-after-free in irq_work_single+0xa4/0x110
[  100.511445] Write of size 4 at addr ffff0000107f5830 by task 
glmark2-es2-drm/280
[  100.511458]
[  100.511464] CPU: 1 PID: 280 Comm: glmark2-es2-drm Not tainted 
5.19.0-rc3+ #400
[  100.511479] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Sep  3 2019
[  100.511489] Call trace:
[  100.511494]  dump_backtrace+0x1e4/0x1f0
[  100.511512]  show_stack+0x20/0x70
[  100.511523]  dump_stack_lvl+0x8c/0xb8
[  100.511543]  print_report+0x16c/0x668
[  100.511559]  kasan_report+0x80/0x208
[  100.511574]  kasan_check_range+0x100/0x1b8
[  100.511590]  __kasan_check_write+0x34/0x60
[  100.511607]  irq_work_single+0xa4/0x110
[  100.511619]  irq_work_run_list+0x6c/0x88
[  100.511632]  irq_work_run+0x28/0x48
[  100.511644]  ipi_handler+0x254/0x468
[  100.511664]  handle_percpu_devid_irq+0x11c/0x518
[  100.511681]  generic_handle_domain_irq+0x50/0x70
[  100.511699]  gic_handle_irq+0xd4/0x118
[  100.511711]  call_on_irq_stack+0x2c/0x58
[  100.511725]  do_interrupt_handler+0xc0/0xc8
[  100.511741]  el1_interrupt+0x40/0x68
[  100.511754]  el1h_64_irq_handler+0x18/0x28
[  100.511767]  el1h_64_irq+0x64/0x68
[  100.511778]  irq_work_queue+0xc0/0xd8
[  100.511790]  drm_sched_entity_fini+0x2c4/0x3b0
[  100.511805]  drm_sched_entity_destroy+0x2c/0x40
[  100.511818]  panfrost_job_close+0x44/0x1c0
[  100.511833]  panfrost_postclose+0x38/0x60
[  100.511845]  drm_file_free.part.0+0x33c/0x4b8
[  100.511862]  drm_close_helper.isra.0+0xc0/0xd8
[  100.511877]  drm_release+0xe4/0x1e0
[  100.511891]  __fput+0xf8/0x390
[  100.511904]  ____fput+0x18/0x28
[  100.511917]  task_work_run+0xc4/0x1e0
[  100.511929]  do_exit+0x554/0x1168
[  100.511945]  do_group_exit+0x60/0x108
[  100.511960]  __arm64_sys_exit_group+0x34/0x38
[  100.511977]  invoke_syscall+0x64/0x180
[  100.511993]  el0_svc_common.constprop.0+0x13c/0x170
[  100.512012]  do_el0_svc+0x48/0xe8
[  100.512028]  el0_svc+0x5c/0xe0
[  100.512038]  el0t_64_sync_handler+0xb8/0xc0
[  100.512051]  el0t_64_sync+0x18c/0x190
[  100.512064]
[  100.512068] Allocated by task 280:
[  100.512075]  kasan_save_stack+0x2c/0x58
[  100.512091]  __kasan_kmalloc+0x90/0xb8
[  100.512105]  kmem_cache_alloc_trace+0x1d4/0x330
[  100.512118]  panfrost_ioctl_submit+0x100/0x630
[  100.512131]  drm_ioctl_kernel+0x160/0x250
[  100.512147]  drm_ioctl+0x36c/0x628
[  100.512161]  __arm64_sys_ioctl+0xd8/0x120
[  100.512178]  invoke_syscall+0x64/0x180
[  100.512194]  el0_svc_common.constprop.0+0x13c/0x170
[  100.512211]  do_el0_svc+0x48/0xe8
[  100.512226]  el0_svc+0x5c/0xe0
[  100.512236]  el0t_64_sync_handler+0xb8/0xc0
[  100.512248]  el0t_64_sync+0x18c/0x190
[  100.512259]
[  100.512262] Freed by task 280:
[  100.512268]  kasan_save_stack+0x2c/0x58
[  100.512283]  kasan_set_track+0x2c/0x40
[  100.512296]  kasan_set_free_info+0x28/0x50
[  100.512312]  __kasan_slab_free+0xf0/0x170
[  100.512326]  kfree+0x124/0x418
[  100.512337]  panfrost_job_cleanup+0x1f0/0x298
[  100.512350]  panfrost_job_free+0x80/0xb0
[  100.512363]  drm_sched_entity_kill_jobs_irq_work+0x80/0xa0
[  100.512377]  irq_work_single+0x88/0x110
[  100.512389]  irq_work_run_list+0x6c/0x88
[  100.512401]  irq_work_run+0x28/0x48
[  100.512413]  ipi_handler+0x254/0x468
[  100.512427]  handle_percpu_devid_irq+0x11c/0x518
[  100.512443]  generic_handle_domain_irq+0x50/0x70
[  100.512460]  gic_handle_irq+0xd4/0x118
[  100.512471]
[  100.512474] The buggy address belongs to the object at ffff0000107f5800
[  100.512474]  which belongs to the cache kmalloc-512 of size 512
[  100.512484] The buggy address is located 48 bytes inside of
[  100.512484]  512-byte region [ffff0000107f5800, ffff0000107f5a00)
[  100.512497]
[  100.512500] The buggy address belongs to the physical page:
[  100.512506] page:000000000a626feb refcount:1 mapcount:0 
mapping:0000000000000000 index:0x0 pfn:0x907f4
[  100.512520] head:000000000a626feb order:2 compound_mapcount:0 
compound_pincount:0
[  100.512530] flags: 
0xffff00000010200(slab|head|node=0|zone=0|lastcpupid=0xffff)
[  100.512556] raw: 0ffff00000010200 fffffc0000076400 dead000000000002 
ffff000000002600
[  100.512569] raw: 0000000000000000 0000000080100010 00000001ffffffff 
0000000000000000
[  100.512577] page dumped because: kasan: bad access detected
[  100.512582]
[  100.512585] Memory state around the buggy address:
[  100.512592]  ffff0000107f5700: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[  100.512602]  ffff0000107f5780: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[  100.512612] >ffff0000107f5800: fa fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[  100.512619]                                      ^
[  100.512627]  ffff0000107f5880: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[  100.512636]  ffff0000107f5900: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[  100.512643] 
==================================================================
[  101.022573] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000be4b1b31
[  101.534469] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=00000000a8ff2c8a
[  101.535981] BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:870
[  101.535994] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 
280, name: glmark2-es2-drm
[  101.536006] preempt_count: 10000, expected: 0
[  101.536012] RCU nest depth: 0, expected: 0
[  101.536019] INFO: lockdep is turned off.
[  101.536023] irq event stamp: 1666508
[  101.536029] hardirqs last  enabled at (1666507): [<ffff80000997ed70>] 
exit_to_kernel_mode.isra.0+0x40/0x140
[  101.536056] hardirqs last disabled at (1666508): [<ffff800009985030>] 
__schedule+0xb38/0xea8
[  101.536076] softirqs last  enabled at (1664950): [<ffff800008010ac8>] 
__do_softirq+0x6b8/0x89c
[  101.536092] softirqs last disabled at (1664941): [<ffff8000080e4fdc>] 
irq_exit_rcu+0x27c/0x2b0
[  101.536118] CPU: 1 PID: 280 Comm: glmark2-es2-drm Tainted: G    B 
          5.19.0-rc3+ #400
[  101.536134] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Sep  3 2019
[  101.536143] Call trace:
[  101.536147]  dump_backtrace+0x1e4/0x1f0
[  101.536161]  show_stack+0x20/0x70
[  101.536171]  dump_stack_lvl+0x8c/0xb8
[  101.536189]  dump_stack+0x1c/0x38
[  101.536204]  __might_resched+0x1f0/0x2b0
[  101.536220]  __might_sleep+0x74/0xd0
[  101.536234]  ww_mutex_lock+0x40/0x4d8
[  101.536249]  drm_gem_shmem_free+0x7c/0x198
[  101.536264]  panfrost_gem_free_object+0x118/0x138
[  101.536278]  drm_gem_object_free+0x40/0x68
[  101.536295]  panfrost_job_cleanup+0x1bc/0x298
[  101.536309]  panfrost_job_free+0x80/0xb0
[  101.536322]  drm_sched_entity_kill_jobs_irq_work+0x80/0xa0
[  101.536337]  irq_work_single+0x88/0x110
[  101.536351]  irq_work_run_list+0x6c/0x88
[  101.536364]  irq_work_run+0x28/0x48
[  101.536375]  ipi_handler+0x254/0x468
[  101.536392]  handle_percpu_devid_irq+0x11c/0x518
[  101.536409]  generic_handle_domain_irq+0x50/0x70
[  101.536428]  gic_handle_irq+0xd4/0x118
[  101.536439]  call_on_irq_stack+0x2c/0x58
[  101.536453]  do_interrupt_handler+0xc0/0xc8
[  101.536468]  el1_interrupt+0x40/0x68
[  101.536479]  el1h_64_irq_handler+0x18/0x28
[  101.536492]  el1h_64_irq+0x64/0x68
[  101.536503]  __asan_load8+0x30/0xd0
[  101.536519]  drm_sched_entity_fini+0x1e8/0x3b0
[  101.536532]  drm_sched_entity_destroy+0x2c/0x40
[  101.536545]  panfrost_job_close+0x44/0x1c0
[  101.536559]  panfrost_postclose+0x38/0x60
[  101.536571]  drm_file_free.part.0+0x33c/0x4b8
[  101.536586]  drm_close_helper.isra.0+0xc0/0xd8
[  101.536601]  drm_release+0xe4/0x1e0
[  101.536615]  __fput+0xf8/0x390
[  101.536628]  ____fput+0x18/0x28
[  101.536640]  task_work_run+0xc4/0x1e0
[  101.536652]  do_exit+0x554/0x1168
[  101.536667]  do_group_exit+0x60/0x108
[  101.536682]  __arm64_sys_exit_group+0x34/0x38
[  101.536698]  invoke_syscall+0x64/0x180
[  101.536714]  el0_svc_common.constprop.0+0x13c/0x170
[  101.536733]  do_el0_svc+0x48/0xe8
[  101.536748]  el0_svc+0x5c/0xe0
[  101.536759]  el0t_64_sync_handler+0xb8/0xc0
[  101.536771]  el0t_64_sync+0x18c/0x190
[  101.541928] ------------[ cut here ]------------
[  101.541934] kernel BUG at kernel/irq_work.c:235!
[  101.541944] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  101.541961] Modules linked in:
[  101.541978] CPU: 1 PID: 280 Comm: glmark2-es2-drm Tainted: G    B   W 
         5.19.0-rc3+ #400
[  101.541997] Hardware name: ARM LTD ARM Juno Development Platform/ARM 
Juno Development Platform, BIOS EDK II Sep  3 2019
[  101.542009] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS 
BTYPE=--)
[  101.542027] pc : irq_work_run_list+0x80/0x88
[  101.542044] lr : irq_work_run+0x34/0x48
[  101.542060] sp : ffff80000da37eb0
[  101.542069] x29: ffff80000da37eb0 x28: ffff000006bb0000 x27: 
ffff000006bb0008
[  101.542107] x26: ffff80000da37f20 x25: ffff8000080304d8 x24: 
0000000000000001
[  101.542142] x23: ffff80000abcd008 x22: ffff80000da37ed0 x21: 
ffff80001c0de000
[  101.542177] x20: ffff80000abcd008 x19: ffff80000abdbad0 x18: 
0000000000000000
[  101.542212] x17: 616e202c30383220 x16: 3a646970202c3020 x15: 
ffff8000082df9d0
[  101.542246] x14: ffff800008dfada8 x13: 0000000000000003 x12: 
1fffe000018b2a06
[  101.542280] x11: ffff6000018b2a06 x10: dfff800000000000 x9 : 
ffff00000c595033
[  101.542315] x8 : ffff6000018b2a07 x7 : 0000000000000001 x6 : 
00000000000000fb
[  101.542349] x5 : ffff00000c595030 x4 : 0000000000000000 x3 : 
ffff00000c595030
[  101.542382] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 
ffff000026cb9ad0
[  101.542416] Call trace:
[  101.542424]  irq_work_run_list+0x80/0x88
[  101.542441]  ipi_handler+0x254/0x468
[  101.542460]  handle_percpu_devid_irq+0x11c/0x518
[  101.542480]  generic_handle_domain_irq+0x50/0x70
[  101.542501]  gic_handle_irq+0xd4/0x118
[  101.542516]  call_on_irq_stack+0x2c/0x58
[  101.542534]  do_interrupt_handler+0xc0/0xc8
[  101.542553]  el1_interrupt+0x40/0x68
[  101.542568]  el1h_64_irq_handler+0x18/0x28
[  101.542584]  el1h_64_irq+0x64/0x68
[  101.542599]  __asan_load8+0x30/0xd0
[  101.542617]  drm_sched_entity_fini+0x1e8/0x3b0
[  101.542634]  drm_sched_entity_destroy+0x2c/0x40
[  101.542651]  panfrost_job_close+0x44/0x1c0
[  101.542669]  panfrost_postclose+0x38/0x60
[  101.542685]  drm_file_free.part.0+0x33c/0x4b8
[  101.542704]  drm_close_helper.isra.0+0xc0/0xd8
[  101.542723]  drm_release+0xe4/0x1e0
[  101.542740]  __fput+0xf8/0x390
[  101.542756]  ____fput+0x18/0x28
[  101.542773]  task_work_run+0xc4/0x1e0
[  101.542788]  do_exit+0x554/0x1168
[  101.542806]  do_group_exit+0x60/0x108
[  101.542825]  __arm64_sys_exit_group+0x34/0x38
[  101.542845]  invoke_syscall+0x64/0x180
[  101.542865]  el0_svc_common.constprop.0+0x13c/0x170
[  101.542887]  do_el0_svc+0x48/0xe8
[  101.542906]  el0_svc+0x5c/0xe0
[  101.542921]  el0t_64_sync_handler+0xb8/0xc0
[  101.542938]  el0t_64_sync+0x18c/0x190
[  101.542960] Code: a94153f3 a8c27bfd d50323bf d65f03c0 (d4210000)
[  101.542979] ---[ end trace 0000000000000000 ]---
[  101.678650] Kernel panic - not syncing: Oops - BUG: Fatal exception 
in interrupt
[  102.046301] panfrost 2d000000.gpu: gpu sched timeout, js=1, 
config=0x0, status=0x0, head=0x0, tail=0x0, sched_job=000000001da14c98
[  103.227334] SMP: stopping secondary CPUs
[  103.241055] Kernel Offset: disabled
[  103.254316] CPU features: 0x800,00184810,00001086
[  103.268904] Memory Limit: 800 MB
[  103.411625] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal 
exception in interrupt ]---


More information about the amd-gfx mailing list