[PATCH 0/5] udmabuf bug fix and some improvements

Huan Yang link at vivo.com
Fri Aug 2 01:16:40 UTC 2024


On 2024/8/2 2:32, Kasireddy, Vivek wrote:
>
> Hi Huan,
>
>> This patchset attempts to fix some errors in udmabuf and remove the
>> unpin_list structure.
>>
>> Some of these fixes simply gather patches I uploaded before.
>>
>> Patch1
>> ===
>> Try to remove page fault mmap and direct map it.
>> The current udmabuf has already obtained and pinned the folios by the
>> time creation completes. This means the physical memory has already
>> been acquired rather than being accessed on demand; the page fault
>> method only saves some page table memory.
>>
>> As a result, the page fault mechanism has lost its purpose as demand
>> paging. Because a page fault requires trapping into kernel mode to
>> fill in the mapping when the corresponding virtual address in the mmap
>> is accessed, user-mode access to these addresses must trap into
>> kernel mode.
>>
>> Therefore, when creating a large udmabuf, this represents
>> considerable overhead.
> Just want to mention that for the main use-case the udmabuf driver is designed for,
> (sharing Qemu Guest FB with Host for GPU DMA), udmabufs are not created very
> frequently. And, I think providing CPU access via mmap is just a backup, mainly
> intended for debugging purposes.

I'm very glad to know this.

However, I have recently been researching the use of asynchronous and
direct I/O (DIO) when loading large model files with dma-buf, which can
improve performance and reduce power consumption. You can see the
patchset:

https://lore.kernel.org/all/20240730075755.10941-1-link@vivo.com/

In the discussion, the maintainer suggested that we should base our
work on udmabuf. I tested udmabuf and found that reading files with
asynchronous and direct I/O (DIO) performs similarly to my patchset.

So I turned to studying udmabuf, and once I am familiar with it, I will
be able to encourage our partners to adopt it.

>
>> Therefore, this patch removes the page fault method and instead
>> fills in the mapping directly when mmap is called.
>>
>> This is achieved by using the scatter-gather table to establish a
>> linear mapping for the pages. Calling remap_pfn_range does not
>> invalidate the previously set VMA flags.
>>
>> Patch2
>> ===
>> This is the same as the patch:
>> https://lore.kernel.org/all/20240725021349.580574-1-link@vivo.com/
>> I have simply gathered it into this patchset.
>>
>> Patch3
>> ===
>> The current implementation of udmabuf's vmap has issues.
>>
>> It does not correctly store each page of a folio in the pages array,
>> so when vmap is called, every entry is the head page of its folio.
>>
>> This implementation is not the same as this patch:
>> https://lore.kernel.org/all/20240731090233.1343559-1-link@vivo.com/
>>
>> This version reuses the sg table to map every page into the vmalloc
>> area.
>>
>> Patch4
>> ===
>> Wrap the repeated calls to get_sg_table in a helper function.
>> Set udmabuf->sg with cmpxchg, which should guard against concurrent
>> access. (I see that mmap does not take a lock.)
>>
>> Patch5
>> ===
>> Attempt to remove unpin_list and other related data structures.
>>
>> To adapt to folios, we established the unpin_list data structure to
>> unpin all folios and maintain the page mapping relationship.
>>
>> However, this data structure requires 24 bytes per page, list
>> traversal is slow, and maintaining the offset structure also consumes
>> some memory.
>>
>> This patch attempts to remove these data structures and modify the
>> semantics of some existing data structures.
>>
>> udmabuf:
>>    folios -> folio array, now containing each folio only once (the
>> original contained duplicates)
>>    add item_offset -> indexed by create-item count, records each
>> memfd's start offset
>>    add item_size -> indexed by create-item count, records each
>> memfd's size
>>    add nr_folios -> number of entries in the folios array
> I am not sure if these changes improve the readability. Instead, I think it makes
> sense to add comments to the existing code.

This is not aimed at improving readability, but at saving memory and
improving performance: unpin_list costs 24 bytes for each folio, and
with many folios that adds up to significant overhead.

I previously provided a patch that established a kmem_cache to reduce
the memory waste:

https://lore.kernel.org/all/20240731062642.1164140-1-link@vivo.com/
(That patch forgot to unregister the cache on module exit.)

But after recent study, I believe the unpin_list may not need to be
constructed at all; operations can work directly on the folio array.


>
>> So, when building the sg table, it is necessary to iterate as
>> follows:
>>    if the size crosses an item->size boundary, take care of the start
>> offset within that folio.
>>    for each folio, set each page into the sgl until the folio size is
>> reached.
>>
>> This patch also removes the per-item folios allocation; instead, a
>> pointer slides along the ubuf->folios array, filling in the folios
>> belonging to each item.
>>
>> After the modification, the various data structures in udmabuf have
>> the following relationships:
>>    pagecount * PAGE_SIZE = sum(folio_size(folios[i])), i = 0..nr_folios
>>    pagecount * PAGE_SIZE = sum(item_size[i]), i = 0..item_count (the
>> item count itself is not recorded)
>>    item_offset records each memfd's start offset if one exists, else 0.
>>
>> Huan Yang (5):
>>    udmabuf: cancel mmap page fault, direct map it
>>    udmabuf: change folios array from kmalloc to kvmalloc
>>    udmabuf: fix vmap_udmabuf error page set
> Do you have a test-case to test this patch?
>
>>    udmabuf: add get_sg_table helper function
>>    udmabuf: remove folio pin list
> Please run the newly added udmabuf selftests to make sure that these

Yes, you're right; I will include it when I release the next version.
Thank you for pointing this out.

Christian König reminded me not to build page associations based on the 
sg table, which I had not considered.

Therefore, the overall logic of the patch needs to be revised.

> patches are not causing any regressions. And, we also need to make sure that
> the main use-cases (Qemu with memfd + shmem and Qemu with memfd + hugetlb)
> are working as expected given the invasive changes.
>
> I'll be able to test and provide more detailed feedback on all patches once I am back from
> vacation late next week.

Wish you a pleasant holiday.
Thank you.
>
> Thanks,
> Vivek
>
>>   drivers/dma-buf/udmabuf.c | 270 +++++++++++++++++++++-----------------
>>   1 file changed, 148 insertions(+), 122 deletions(-)
>>
>>
>> base-commit: cd19ac2f903276b820f5d0d89de0c896c27036ed
>> --
>> 2.45.2


More information about the dri-devel mailing list