Issue on unmap_grant_pages before releasing dmabuf
Oleksii Moisieiev
Oleksii_Moisieiev at epam.com
Wed Oct 19 10:43:30 UTC 2022
Greetings.
I need your advise about the problem I'm facing right now:
I'm working on the new type of dmabuf export implementation. This
is going to be new ioctl to the gntdev and it's purpose is to use
external buffer, provided by file descriptor as the backing storage
during export to grant refs.
Few words about the functionality I'm working on right now:
My setup is based on IMX8QM (please see PS below if you need
configuration details)
When using dma-buf exporter to create dma-buf with backing storage and
map it to the grant refs, provided from the domain, we've met a problem,
that several HW (i.MX8 gpu in our case) do not support external buffer
and requires backing storage to be created using it's native tools
(eglCreateImageKHR returns EGL_NO_IMAGE_KHR for buffers, which were not
created using gbm_bo_create).
That's why new ioctls were added to be able to pass existing dma-buffer
fd as input parameter and use it as backing storage to export to refs.
Kernel version on IMX8QM board is 5.10.72 and itworks fine on this kernel
version.
New ioctls source code can be found here:
https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2
Now regarding the problem I've met when rebased those code on master:
On my test stand I use DRM_IOCTL_MODE_CREATE_DUMB/DRM_IOCTL_MODE_DESTROY_DUMB ioctls
to allocate buffer and I'm observing the following backtrace on DRM_IOCTL_MODE_DESTROY_DUMB:
Unable to handle kernel paging request at virtual address 0000000387000098
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000006df98000
[0000000387000098] pgd=0800000064f4f003, p4d=0800000064f4f003, pud=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in: xen_pciback overlay crct10dif_ce ip_tables x_tables ipv6
PU: 0 PID: 34 Comm: kworker/0:1 Not tainted 6.0.0 #85
Hardware name: linux,dummy-virt (DT)
Workqueue: events virtio_gpu_dequeue_ctrl_func
pstate: 000000c5 (nzcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : check_move_unevictable_folios+0xb8/0x4d0
lr : check_move_unevictable_folios+0xb4/0x4d0
sp : ffff8000081a3ad0
x29: ffff8000081a3ad0 x28: ffff03856ac98800 x27: 0000000000000000
x26: ffffde7b168ee9d8 x25: ffff03856ae26008 x24: 0000000000000000
x23: ffffde7b1758d6c0 x22: 0000000000000001 x21: ffff8000081a3b68
x20: 0000000000000001 x19: fffffc0e15935040 x18: ffffffffffffffff
x17: ffff250a68e3d000 x16: 0000000000000012 x15: ffff8000881a38d7
x14: 0000000000000000 x13: ffffde7b175a3150 x12: 0000000000002c55
x11: 0000000000000ec7 x10: ffffde7b176113f8 x9 : ffffde7b175a3150
x8 : 0000000100004ec7 x7 : ffffde7b175fb150 x6 : ffff8000081a3b70
x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffff03856ac98850
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000387000000
Call trace:
check_move_unevictable_folios+0xb8/0x4d0
check_move_unevictable_pages+0x8c/0x110
drm_gem_put_pages+0x118/0x198
drm_gem_shmem_put_pages_locked+0x4c/0x70
drm_gem_shmem_unpin+0x30/0x50
virtio_gpu_cleanup_object+0x84/0x130
virtio_gpu_cmd_unref_cb+0x18/0x2c
virtio_gpu_dequeue_ctrl_func+0x124/0x290
process_one_work+0x1d0/0x320
worker_thread+0x14c/0x444
kthread+0x10c/0x110
ret_from_fork+0x10/0x20
Code: 97fc3fe1 aa1303e0 94003ac7 b4000080 (f9404c00)
---[ end trace 0000000000000000 ]---
After some investigation I think I've found the cause of the problem:
This is the functionality, added in commit 3f9f1c67572f5e5e6dc84216d48d1480f3c4fcf6 which
introduces safe mechanism to unmap grant pages which is waiting until page_count(page) = 1
before doing unmap.
The problem is that DRM_IOCTL_MODE_CREATE_DUMB creates buffer where page_count(page) = 2.
On my QEMU test stand I'm using Xen 4.16 (aarch64) with debian based Dom0 + DomU on the latest
kernels.
I've created some apps for testing:
The first one is to allocate grant refs on DomU:
https://github.com/oleksiimoisieiev/linux/tree/gntdev_map_buf_upstr_for-linus-6.1_2
The name is test.ko and it can be built using command:
cd ./test; make
NOTE: makefile expects kernel build to be present on ../../test-build directory.
It should be run on DomU using command:
insmod test.ko; echo "create" > /sys/kernel/etx_sysfs/etx_value
Result will be the following:
[ 126.104903] test: loading out-of-tree module taints kernel.
[ 126.150586] Sysfs - Write!!!
[ 126.150773] create
[ 126.150773]
[ 126.150888] Hello, World!
[ 126.151203] Hello, World!
[ 126.151324] gref 301
[ 126.151376] addr ffff00000883d000
[ 126.151431] gref 302
[ 126.151454] addr ffff00000883e000
[ 126.151478] gref 303
[ 126.151497] addr ffff00000883f000
[ 126.151525] gref 304
[ 126.151546] addr ffff000008840000
[ 126.151573] gref 305
[ 126.151593] addr ffff000008841000
The second is for dom0 and can be found here:
https://github.com/oleksiimoisieiev/xen/tree/gntdev_fd
How to build:
make -C tools/console all
Result: ./tools/console/gnt_test should be uploaded to Dom0
Start: sudo ./gnt_test_map 1 301 302 303 304 305
Where 1 is DomU ID and 301 302 303 304 305 - grefs from test.ko output
This will create buffer using ioctls DRM_IOCTL_MODE_CREATE_DUMB them passes it as backing
storage to gntdev and then destroys it using DRM_IOCTL_MODE_DESTROY_DUMB.
The problem is that when dumb buffer is created we observe page_count(page) = 2. So
when before buffer release I'm trying to unmap grant refs using unmap_grant_pages it is calling
__gnttab_unmap_refs_async, which postpones actual unmapping to 5 ms because
page_count(page) > 1.
Which causes drm_gem_get_pages to try to free pages, which are still mapped.
Also if I change in the following line:
https://github.com/torvalds/linux/blob/bb1a1146467ad812bb65440696df0782e2bc63c8/drivers/xen/grant-table.c#L1313
change from page_count(item->pages[pc]) > 1 to page_count(item->pages[pc]) > 2 - everything works fine.
The obvious way for fix this issue I see is to make the expected page_count
for __gnttab_unmap_refs_async configurable for each buffer, but I'm now sure
if this is the
best solution.
I would be happy to hear your thoughts and advises about how to fix this situation.
PS: IMX8QM configuration details:
IMX8QM board has XEN 4.16 with 2 domains:
DomU, which has 1 gpu core passed through and graphics should start;
Dom0, which has the second gpu core and weston.
Weston is starting on DomU via xen drm frontend/backend and using
zwp_linux_dmabuf_v1_interface to implement zerocopy support.
Details of zerocopy initialization:
https://github.com/xen-troops/displ_be/blob/782471533dc1e7b38099a5583256bfe0c2be488c/src/displayBackend/wayland/WaylandZCopy.cpp#L410
Best regards,
Oleksii.
More information about the dri-devel
mailing list