[PATCH v3 0/5] udmabuf bug fix and some improvements
Huan Yang
link at vivo.com
Tue Aug 13 09:05:06 UTC 2024
This patchset attempts to fix some errors in udmabuf and to remove the
unpin_list structure.
Some of these fixes simply gather patches that I posted previously.
Patches 1, 2, 4 and 5 have passed the udmabuf self-test suite, as
suggested by Kasireddy, Vivek <vivek.kasireddy at intel.com>.
Because patch 5 modifies the unpin function, the udmabuf self-test
program was also run in a loop; this did not reveal any memory leaks.
Note: test item 6 may require running the following command:
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
Patch1
===
Remove the page-fault-based mmap and map the buffer directly.
udmabuf has already obtained and pinned the folios by the time creation
completes. This means that the physical memory has already been
acquired, rather than being allocated on demand. The current page fault
method only saves some page table memory.
As a result, the page fault mechanism no longer serves its purpose of
demand paging. Each first access to a virtual address in the mmap range
must trap into kernel mode so that the fault handler can fill in the
mapping.
Therefore, when creating a large udmabuf, this represents a
considerable overhead.
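As a loose userspace analogy for the overhead described above (this is not
udmabuf code), the sketch below touches one byte per page of an anonymous
mapping and counts the minor faults taken, once for a lazily faulted mapping
and once for a mapping pre-populated with MAP_POPULATE. The helper names and
the use of anonymous memory are assumptions made purely for illustration; the
patch itself changes the kernel-side mmap path of udmabuf.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults taken by this process so far. */
static long minor_faults(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_minflt;
}

/*
 * Map `len` bytes of anonymous memory, optionally pre-populated, touch one
 * byte per page, and return the number of minor faults taken while touching.
 * Returns -1 if the mapping fails.
 */
static long faults_while_touching(size_t len, int prepopulate)
{
	int flags = MAP_PRIVATE | MAP_ANONYMOUS |
		    (prepopulate ? MAP_POPULATE : 0);
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	long before, after;

	assert(page > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	before = minor_faults();
	for (size_t off = 0; off < len; off += (size_t)page)
		p[off] = 1;	/* lazy mapping: may trap into the kernel here */
	after = minor_faults();

	munmap((void *)p, len);
	return after - before;
}
```

Calling faults_while_touching(len, 0) and faults_while_touching(len, 1) with
the same length would typically show far fewer faults in the pre-populated
case, which is the effect patch 1 aims for on the udmabuf mmap path.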
Patch2
===
This is the same as the patch at:
https://lore.kernel.org/all/20240725021349.580574-1-link@vivo.com/
Patch3
===
The current implementation of udmabuf's vmap has issues.
It does not correctly set each page of the folio in the page array,
so when vmap is called, every entry is the head page of the folio.
Because udmabuf can use hugetlb, tail pages may not exist if HVO
(HugeTLB Vmemmap Optimization) is enabled, so we cannot map with a page
array; use a pfn array instead.
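The pfn-based approach can be illustrated with a small userspace model. The
struct and function names below (pfn_folio, pfn_model_fill) are hypothetical
and do not come from the patch; in the kernel the first frame number would
presumably be obtained with something like folio_pfn(). The sketch assumes
4 KiB base pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PFN_MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical stand-in for a pinned folio: only its first PFN matters. */
struct pfn_folio {
	uint64_t first_pfn;
};

/*
 * Fill pfns[i] with the frame number backing page i of the buffer.
 * folios[i] is the folio that page i lives in and offsets[i] is the byte
 * offset of that page inside the folio. No per-page struct is needed for
 * tail pages: the frame number is derived from the folio alone.
 */
static void pfn_model_fill(const struct pfn_folio *const *folios,
			   const uint64_t *offsets, size_t pagecount,
			   uint64_t *pfns)
{
	assert(folios && offsets && pfns);

	for (size_t i = 0; i < pagecount; i++)
		pfns[i] = folios[i]->first_pfn +
			  (offsets[i] >> PFN_MODEL_PAGE_SHIFT);
}
```

For a huge folio whose first frame is 0x1000, the pages at byte offsets 0,
4096 and 8192 map to frames 0x1000, 0x1001 and 0x1002, rather than all
resolving to the head page.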
Patch4
===
Clean up the code style and fix a potential bug.
Some variables in udmabuf_create are only used inside the loop, so
there is no need to declare them in the outer scope. This patch moves
them into the loop.
The loop condition of the code that adds folios to the unpin_list is
difficult to understand.
With this patch, the outer loop iterates through the folios, while the
inner loop sets each folio and its corresponding offset in the udmabuf,
starting from the offset. The loop ends when pgcnt or nr_folios is
reached. This makes the code more readable.
ubuf->pagecount is currently set before the folios are actually set in
the loop. If an error occurs during creation, pagecount and folios are
not consistent, which could lead to problems on release.
This patch updates ubuf->pagecount dynamically, only once the folios
have been updated.
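The loop structure described above can be sketched as a userspace model. All
names here (range_folio, range_ubuf, range_model_add) are hypothetical and
the code is not taken from the patch; it assumes 4 KiB pages and that the
caller has sized the arrays to hold pgcnt more entries.

```c
#include <assert.h>
#include <stddef.h>

#define RANGE_MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical stand-in for a pinned folio. */
struct range_folio {
	size_t nr_pages;
};

/* Hypothetical stand-in for the per-buffer state. */
struct range_ubuf {
	const struct range_folio **folios;	/* one entry per page */
	size_t *offsets;	/* byte offset of each page in its folio */
	size_t pagecount;	/* number of valid entries */
};

/*
 * Record up to pgcnt pages from the pinned folios, starting at byte offset
 * first_off inside the first folio. Returns the number of pages recorded.
 */
static size_t range_model_add(struct range_ubuf *ubuf,
			      const struct range_folio *const *pinned,
			      size_t nr_folios, size_t first_off, size_t pgcnt)
{
	size_t slot = ubuf->pagecount;	/* next free entry */
	size_t subpg = first_off >> RANGE_MODEL_PAGE_SHIFT;
	size_t added = 0;

	assert(ubuf->folios && ubuf->offsets);

	/* Outer loop: folios. Inner loop: pages inside the current folio. */
	for (size_t k = 0; k < nr_folios && added < pgcnt; k++) {
		for (; subpg < pinned[k]->nr_pages && added < pgcnt; subpg++) {
			ubuf->folios[slot] = pinned[k];
			ubuf->offsets[slot] = subpg << RANGE_MODEL_PAGE_SHIFT;
			slot++;
			added++;
		}
		subpg = 0;	/* later folios start at their first page */
	}

	/* Publish the count only after the entries are in place. */
	ubuf->pagecount = slot;
	return added;
}
```

Because pagecount is written last, a release path that walks the first
pagecount entries never sees an entry that has not been filled in.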
Patch5
===
Attempt to remove unpin_list and the other related data structures.
To adapt to folios, we introduced the unpin_list data structure to
unpin all folios and maintain the page mapping relationship.
However, this data structure requires 24 bytes per page, and list
traversal performs poorly. Maintaining the offset structure also
consumes some memory.
This patch attempts to remove these data structures.
During creation, the folio array is filled in pin order and the offsets
are set according to pgcnt, so we do not actually need unpin_list to
unpin during release.
Instead, we can iterate through the folios array during release and
unpin any folio that is different from the ones previously accessed.
This not only saves the overhead of the udmabuf_folio data structure
but also makes array access more cache-friendly.
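The release-time walk can be modelled in userspace as follows. The names
(pin_folio, pin_model_unpin_all) are hypothetical, and the sketch assumes
that comparing each entry with the immediately preceding one is enough,
which holds only if every pin contributes one contiguous run of entries and
two separate pins of the same folio never end up adjacent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pinned folio with a visible pin count. */
struct pin_folio {
	int pin_count;
};

/*
 * Walk the per-page folio array and drop one pin per run of identical
 * entries. Returns the number of unpin operations performed.
 */
static size_t pin_model_unpin_all(struct pin_folio **folios, size_t pagecount)
{
	struct pin_folio *last = NULL;
	size_t unpinned = 0;

	assert(folios || pagecount == 0);

	for (size_t i = 0; i < pagecount; i++) {
		if (folios[i] == last)
			continue;	/* same run: this pin is already dropped */
		last = folios[i];
		last->pin_count--;
		unpinned++;
	}
	return unpinned;
}
```

The walk is a single linear pass over an array that already exists for the
page mapping, so no per-page list node is allocated or chased.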
Changelog
===
v3 -> v2:
Patch1, avoid using pages; use pfns instead, and map with vmf_insert_pfn,
as suggested by Kasireddy, Vivek <vivek.kasireddy at intel.com>
Patch2, add the Acked-by from Kasireddy, Vivek <vivek.kasireddy at intel.com>
and keep the kvcalloc on the same line.
Patch3, avoid using pages; use pfns instead, then map with vmap_pfn.
Patch4, split from v2's patch4; it only updates the code style, to keep
review easy.
Patch5, another way to remove udmabuf_folio.
---
v2 -> v1:
Patch1, 3: rectify the improper use of the sg table,
as suggested by Christian König <christian.koenig at amd.com>
Patch2: add the Acked-by from Christian König <christian.koenig at amd.com>,
which was given in v1.
Patch4:
Modify the data structure to restore the use of pages and
correct the misunderstanding of loop conditions such as "pgcnt".
Make sure it passes the self-test.
Remove v1's patch4.
v2
https://lore.kernel.org/all/20240805032550.3912454-1-link@vivo.com/
v1
https://lore.kernel.org/all/20240801104512.4056860-1-link@vivo.com/
Huan Yang (5):
udmabuf: cancel mmap page fault, direct map it
udmabuf: change folios array from kmalloc to kvmalloc
fix vmap_udmabuf error page set
udmabuf: codestyle cleanup
udmabuf: remove udmabuf_folio
drivers/dma-buf/udmabuf.c | 198 ++++++++++++++++++--------------------
1 file changed, 96 insertions(+), 102 deletions(-)
base-commit: 033a4691702cdca3a613256b0623b8eeacb4985e
--
2.45.2