[PATCH 2/2] dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for system_heap

wangtao tao.wangtao at honor.com
Tue May 27 14:35:20 UTC 2025



> -----Original Message-----
> From: Christian König <christian.koenig at amd.com>
> Sent: Thursday, May 22, 2025 7:58 PM
> To: wangtao <tao.wangtao at honor.com>; T.J. Mercier
> <tjmercier at google.com>
> Cc: sumit.semwal at linaro.org; benjamin.gaignard at collabora.com;
> Brian.Starkey at arm.com; jstultz at google.com; linux-media at vger.kernel.org;
> dri-devel at lists.freedesktop.org; linaro-mm-sig at lists.linaro.org; linux-
> kernel at vger.kernel.org; wangbintian(BintianWang)
> <bintian.wang at honor.com>; yipengxiang <yipengxiang at honor.com>; liulu
> 00013167 <liulu.liu at honor.com>; hanfeng 00012985 <feng.han at honor.com>;
> amir73il at gmail.com
> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> DMA_BUF_IOCTL_RW_FILE for system_heap
> 
> On 5/22/25 10:02, wangtao wrote:
> >> -----Original Message-----
> >> From: Christian König <christian.koenig at amd.com>
> >> Sent: Wednesday, May 21, 2025 7:57 PM
> >> To: wangtao <tao.wangtao at honor.com>; T.J. Mercier
> >> <tjmercier at google.com>
> >> Cc: sumit.semwal at linaro.org; benjamin.gaignard at collabora.com;
> >> Brian.Starkey at arm.com; jstultz at google.com;
> >> linux-media at vger.kernel.org; dri-devel at lists.freedesktop.org;
> >> linaro-mm-sig at lists.linaro.org; linux- kernel at vger.kernel.org;
> >> wangbintian(BintianWang) <bintian.wang at honor.com>; yipengxiang
> >> <yipengxiang at honor.com>; liulu
> >> 00013167 <liulu.liu at honor.com>; hanfeng 00012985
> >> <feng.han at honor.com>; amir73il at gmail.com
> >> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> >> DMA_BUF_IOCTL_RW_FILE for system_heap
> >>
> >> On 5/21/25 12:25, wangtao wrote:
> >>> [wangtao] I previously explained that
> >>> read/sendfile/splice/copy_file_range
> >>> syscalls can't achieve dmabuf direct IO zero-copy.
> >>
> >> And why can't you work on improving those syscalls instead of
> >> creating a new IOCTL?
> >>
> > [wangtao] As I mentioned in previous emails, these syscalls cannot
> > achieve dmabuf zero-copy due to technical constraints.
> 
> Yeah, and why can't you work on removing those technical constrains?
> 
> What is blocking you from improving the sendfile system call or proposing a
> patch to remove the copy_file_range restrictions?
[wangtao] Since sendfile/splice can't eliminate CPU copies, I skipped cross-FS checks
in copy_file_range when copying memory/disk files.
Will send new patches after completing shmem/udmabuf callback.
Thank you for your attention to this issue.

UFS 4.0 device @4GB/s, Arm64 CPU @1GHz:
| Metrics                  |Creat(us)|Close(us)| I/O(us) |I/O(MB/s)| Vs.%
|--------------------------|---------|---------|---------|---------|-------
| 0)    dmabuf buffer read |   46898 |    4804 | 1173661 |     914 |  100%
| 1)   udmabuf buffer read |  593844 |  337111 | 2144681 |     500 |   54%
| 2)     memfd buffer read |    1029 |  305322 | 2215859 |     484 |   52%
| 3)     memfd direct read |     562 |  295239 | 1019913 |    1052 |  115%
| 4) memfd buffer sendfile |     785 |  299026 | 1431304 |     750 |   82%
| 5) memfd direct sendfile |     718 |  296307 | 2622270 |     409 |   44%
| 6)   memfd buffer splice |     981 |  299694 | 1573710 |     682 |   74%
| 7)   memfd direct splice |     890 |  302509 | 1269757 |     845 |   92%
| 8)    memfd buffer c_f_r |      33 |    4432 |     N/A |     N/A |   N/A
| 9)    memfd direct c_f_r |      27 |    4421 |     N/A |     N/A |   N/A
|10) memfd buffer sendfile |  595797 |  423105 | 1242494 |     864 |   94%
|11) memfd direct sendfile |  593758 |  357921 | 2344001 |     458 |   50%
|12)   memfd buffer splice |  623221 |  356212 | 1117507 |     960 |  105%
|13)   memfd direct splice |  587059 |  345484 |  857103 |    1252 |  136%
|14)  udmabuf buffer c_f_r |   22725 |   10248 |     N/A |     N/A |   N/A
|15)  udmabuf direct c_f_r |   20120 |    9952 |     N/A |     N/A |   N/A
|16)   dmabuf buffer c_f_r |   46517 |    4708 |  857587 |    1252 |  136%
|17)   dmabuf direct c_f_r |   47339 |    4661 |  284023 |    3780 |  413%

> 
> Regards,
> Christian.
> 
>  Could you
> > specify the technical points, code, or principles that need
> > optimization?
> >
> > Let me explain again why these syscalls can't work:
> > 1. read() syscall
> >    - dmabuf fops lacks read callback implementation. Even if implemented,
> >      file_fd info cannot be transferred
> >    - read(file_fd, dmabuf_ptr, len) with remap_pfn_range-based mmap
> >      cannot access dmabuf_buf pages, forcing buffer-mode reads
> >
> > 2. sendfile() syscall
> >    - Requires CPU copy from page cache to memory file(tmpfs/shmem):
> >      [DISK] --DMA--> [page cache] --CPU copy--> [MEMORY file]
> >    - CPU overhead (both buffer/direct modes involve copies):
> >      55.08% do_sendfile
> >     |- 55.08% do_splice_direct
> >     |-|- 55.08% splice_direct_to_actor
> >     |-|-|- 22.51% copy_splice_read
> >     |-|-|-|- 16.57% f2fs_file_read_iter
> >     |-|-|-|-|- 15.12% __iomap_dio_rw
> >     |-|-|- 32.33% direct_splice_actor
> >     |-|-|-|- 32.11% iter_file_splice_write
> >     |-|-|-|-|- 28.42% vfs_iter_write
> >     |-|-|-|-|-|- 28.42% do_iter_write
> >     |-|-|-|-|-|-|- 28.39% shmem_file_write_iter
> >     |-|-|-|-|-|-|-|- 24.62% generic_perform_write
> >     |-|-|-|-|-|-|-|-|- 18.75% __pi_memmove
> >
> > 3. splice() requires one end to be a pipe, incompatible with regular files or
> dmabuf.
> >
> > 4. copy_file_range()
> >    - Blocked by cross-FS restrictions (Amir's commit 868f9f2f8e00)
> >    - Even without this restriction, Even without restrictions, implementing
> >      the copy_file_range callback in dmabuf fops would only allow dmabuf
> read
> > 	 from regular files. This is because copy_file_range relies on
> > 	 file_out->f_op->copy_file_range, which cannot support dmabuf
> write
> > 	 operations to regular files.
> >
> > Test results confirm these limitations:
> > T.J. Mercier's 1G from ext4 on 6.12.20 | read/sendfile (ms) w/ 3 >
> > drop_caches
> > ------------------------|-------------------
> > udmabuf buffer read     | 1210
> > udmabuf direct read     | 671
> > udmabuf buffer sendfile | 1096
> > udmabuf direct sendfile | 2340
> >
> > My 3GHz CPU tests (cache cleared):
> > Method                | alloc | read  | vs. (%)
> > -----------------------------------------------
> > udmabuf buffer read   | 135   | 546   | 180%
> > udmabuf direct read   | 159   | 300   | 99%
> > udmabuf buffer sendfile | 134 | 303   | 100%
> > udmabuf direct sendfile | 141 | 912   | 301%
> > dmabuf buffer read    | 22    | 362   | 119%
> > my patch direct read  | 29    | 265   | 87%
> >
> > My 1GHz CPU tests (cache cleared):
> > Method                | alloc | read  | vs. (%)
> > -----------------------------------------------
> > udmabuf buffer read   | 552   | 2067  | 198%
> > udmabuf direct read   | 540   | 627   | 60%
> > udmabuf buffer sendfile | 497 | 1045  | 100% udmabuf direct sendfile |
> > 527 | 2330  | 223%
> > dmabuf buffer read    | 40    | 1111  | 106%
> > patch direct read     | 44    | 310   | 30%
> >
> > Test observations align with expectations:
> > 1. dmabuf buffer read requires slow CPU copies 2. udmabuf direct read
> > achieves zero-copy but has page retrieval
> >    latency from vaddr
> > 3. udmabuf buffer sendfile suffers CPU copy overhead 4. udmabuf direct
> > sendfile combines CPU copies with frequent DMA
> >    operations due to small pipe buffers 5. dmabuf buffer read also
> > requires CPU copies 6. My direct read patch enables zero-copy with
> > better performance
> >    on low-power CPUs
> > 7. udmabuf creation time remains problematic (as you’ve noted).
> >
> >>> My focus is enabling dmabuf direct I/O for [regular file] <--DMA-->
> >>> [dmabuf] zero-copy.
> >>
> >> Yeah and that focus is wrong. You need to work on a general solution
> >> to the issue and not specific to your problem.
> >>
> >>> Any API achieving this would work. Are there other uAPIs you think
> >>> could help? Could you recommend experts who might offer suggestions?
> >>
> >> Well once more: Either work on sendfile or copy_file_range or
> >> eventually splice to make it what you want to do.
> >>
> >> When that is done we can discuss with the VFS people if that approach
> >> is feasible.
> >>
> >> But just bypassing the VFS review by implementing a DMA-buf specific
> >> IOCTL is a NO-GO. That is clearly not something you can do in any way.
> > [wangtao] The issue is that only dmabuf lacks Direct I/O zero-copy
> > support. Tmpfs/shmem already work with Direct I/O zero-copy. As
> > explained, existing syscalls or generic methods can't enable dmabuf
> > direct I/O zero-copy, which is why I propose adding an IOCTL command.
> >
> > I respect your perspective. Could you clarify specific technical
> > aspects, code requirements, or implementation principles for modifying
> > sendfile() or copy_file_range()? This would help advance our discussion.
> >
> > Thank you for engaging in this dialogue.
> >
> >>
> >> Regards,
> >> Christian.



More information about the dri-devel mailing list