[PATCH v4 1/4] fs: allow cross-FS copy_file_range for memory file with direct I/O
Amir Goldstein
amir73il at gmail.com
Tue Jun 3 12:43:02 UTC 2025
On Tue, Jun 3, 2025 at 2:38 PM wangtao <tao.wangtao at honor.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Amir Goldstein <amir73il at gmail.com>
> > Sent: Tuesday, June 3, 2025 6:57 PM
> > To: wangtao <tao.wangtao at honor.com>
> > Cc: sumit.semwal at linaro.org; christian.koenig at amd.com;
> > kraxel at redhat.com; vivek.kasireddy at intel.com; viro at zeniv.linux.org.uk;
> > brauner at kernel.org; hughd at google.com; akpm at linux-foundation.org;
> > benjamin.gaignard at collabora.com; Brian.Starkey at arm.com;
> > jstultz at google.com; tjmercier at google.com; jack at suse.cz;
> > baolin.wang at linux.alibaba.com; linux-media at vger.kernel.org; dri-
> > devel at lists.freedesktop.org; linaro-mm-sig at lists.linaro.org; linux-
> > kernel at vger.kernel.org; linux-fsdevel at vger.kernel.org; linux-
> > mm at kvack.org; wangbintian(BintianWang) <bintian.wang at honor.com>;
> > yipengxiang <yipengxiang at honor.com>; liulu 00013167
> > <liulu.liu at honor.com>; hanfeng 00012985 <feng.han at honor.com>
> > Subject: Re: [PATCH v4 1/4] fs: allow cross-FS copy_file_range for memory
> > file with direct I/O
> >
> > On Tue, Jun 3, 2025 at 11:53 AM wangtao <tao.wangtao at honor.com> wrote:
> > >
> > > Memory files can optimize copy performance via copy_file_range callbacks:
> > > > - Compared to mmap&read: reduces GUP (get_user_pages) overhead
> > > > - Compared to sendfile/splice: eliminates one memory copy
> > > > - Supports dma-buf direct I/O zero-copy implementation
> > >
> > > > Suggested-by: Christian König <christian.koenig at amd.com>
> > > > Suggested-by: Amir Goldstein <amir73il at gmail.com>
> > > Signed-off-by: wangtao <tao.wangtao at honor.com>
> > > ---
> > > > fs/read_write.c | 64 +++++++++++++++++++++++++++++++++++++---------
> > > include/linux/fs.h | 2 ++
> > > 2 files changed, 54 insertions(+), 12 deletions(-)
> > >
> > > > diff --git a/fs/read_write.c b/fs/read_write.c
> > > > index bb0ed26a0b3a..ecb4f753c632 100644
> > > --- a/fs/read_write.c
> > > +++ b/fs/read_write.c
> > > @@ -1469,6 +1469,31 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int,
> > out_fd,
> > > int, in_fd, } #endif
> > >
> > > > +static const struct file_operations *memory_copy_file_ops(
> > > > + struct file *file_in, struct file *file_out)
> > > > +{
> > > + if ((file_in->f_op->fop_flags & FOP_MEMORY_FILE) &&
> > > + (file_in->f_mode & FMODE_CAN_ODIRECT) &&
> > > + file_in->f_op->copy_file_range && file_out->f_op->write_iter)
> > > + return file_in->f_op;
> > > + else if ((file_out->f_op->fop_flags & FOP_MEMORY_FILE) &&
> > > + (file_out->f_mode & FMODE_CAN_ODIRECT) &&
> > > + file_in->f_op->read_iter && file_out->f_op->copy_file_range)
> > > + return file_out->f_op;
> > > + else
> > > + return NULL;
> > > +}
> > > +
> > > > +static int essential_file_rw_checks(struct file *file_in, struct file *file_out)
> > > > +{
> > > + if (!(file_in->f_mode & FMODE_READ) ||
> > > + !(file_out->f_mode & FMODE_WRITE) ||
> > > + (file_out->f_flags & O_APPEND))
> > > + return -EBADF;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > /*
> > > * Performs necessary checks before doing a file copy
> > > *
> > > @@ -1484,9 +1509,16 @@ static int generic_copy_file_checks(struct file
> > *file_in, loff_t pos_in,
> > > struct inode *inode_out = file_inode(file_out);
> > > uint64_t count = *req_count;
> > > loff_t size_in;
> > > + bool splice = flags & COPY_FILE_SPLICE;
> > > + const struct file_operations *mem_fops;
> > > int ret;
> > >
> > > - ret = generic_file_rw_checks(file_in, file_out);
> > > + /* The dma-buf file is not a regular file. */
> > > + mem_fops = memory_copy_file_ops(file_in, file_out);
> > > + if (splice || mem_fops == NULL)
> >
> > nit: use !mem_fops please
> >
> > > Considering that the flag COPY_FILE_SPLICE is not allowed from userspace
> > > and is only used by nfsd and ksmbd, I think we should assert and deny the
> > > combination of mem_fops && splice, because it is very much unexpected.
> >
> > > After asserting this, it would be nicer to write as:
> > >
> > >         if (mem_fops)
> > >                 ret = essential_file_rw_checks(file_in, file_out);
> > >         else
> > >                 ret = generic_file_rw_checks(file_in, file_out);
> >
> Got it. Thanks.
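
For readers following the thread, a minimal userspace sketch of the check
ordering discussed above. This is a standalone model, not kernel code:
struct model_file, the model_* helpers, and the choice of -EINVAL for the
denied splice && mem_fops combination are all assumptions made for
illustration. model_generic_checks() is a simplified stand-in for
generic_file_rw_checks() (it omits the directory check).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only what the sketch needs. */
struct model_file {
	bool readable;	/* models f_mode & FMODE_READ */
	bool writable;	/* models f_mode & FMODE_WRITE */
	bool append;	/* models f_flags & O_APPEND */
	bool regular;	/* models S_ISREG() on the inode */
};

/* Models essential_file_rw_checks() from the patch: mode checks only. */
int model_essential_checks(const struct model_file *in,
			   const struct model_file *out)
{
	if (!in->readable || !out->writable || out->append)
		return -EBADF;
	return 0;
}

/* Simplified model of generic_file_rw_checks(): regular files only. */
int model_generic_checks(const struct model_file *in,
			 const struct model_file *out)
{
	if (!in->regular || !out->regular)
		return -EINVAL;
	return model_essential_checks(in, out);
}

/* Deny splice && mem_fops first, then pick the set of checks. */
int model_copy_checks(const struct model_file *in,
		      const struct model_file *out,
		      bool splice, bool mem_fops)
{
	/* Assumed error code; the thread does not specify one. */
	if (splice && mem_fops)
		return -EINVAL;

	if (mem_fops)
		return model_essential_checks(in, out);
	return model_generic_checks(in, out);
}
```

With the unexpected combination rejected up front, the splice flag no
longer needs to appear in the condition that selects the checks.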
> > > + else
> > > + ret = essential_file_rw_checks(file_in, file_out);
> > > if (ret)
> > > return ret;
> > >
> > > @@ -1500,8 +1532,10 @@ static int generic_copy_file_checks(struct file
> > *file_in, loff_t pos_in,
> > > * and several different sets of file_operations, but they all end up
> > > * using the same ->copy_file_range() function pointer.
> > > */
> > > - if (flags & COPY_FILE_SPLICE) {
> > > + if (splice) {
> > > /* cross sb splice is allowed */
> > > + } else if (mem_fops != NULL) {
> >
> > > With the assertion that splice && mem_fops is not allowed,
> > >
> > >         if (splice || mem_fops) {
> > >
> > > would go well together, because they both allow cross-fs copy, not only
> > > cross-sb.
> >
> Got it.
>
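
A minimal userspace sketch of how the combined branch would sit in the
cross-fs check, again as a standalone model rather than kernel code.
struct model_cfr_file and model_cross_fs_check() are hypothetical names;
the integer ids stand in for the ->copy_file_range function pointer and
the superblock pointer.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields the cross-fs check looks at. */
struct model_cfr_file {
	int sb_id;	/* stands in for file_inode(file)->i_sb */
	int cfr_id;	/* stands in for f_op->copy_file_range; 0 means NULL */
};

int model_cross_fs_check(const struct model_cfr_file *in,
			 const struct model_cfr_file *out,
			 bool splice, bool mem_fops)
{
	if (splice || mem_fops) {
		/* cross-fs copy is allowed for splice and for memory files */
	} else if (out->cfr_id) {
		/* both sides must share the same ->copy_file_range */
		if (in->cfr_id != out->cfr_id)
			return -EXDEV;
	} else if (in->sb_id != out->sb_id) {
		/* no ->copy_file_range: only same-sb copies are allowed */
		return -EXDEV;
	}
	return 0;
}
```

The model assumes the splice && mem_fops combination has already been
rejected earlier, so the two flags never arrive set together here.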
> > > + /* cross-fs copy is allowed for memory file. */
> > > } else if (file_out->f_op->copy_file_range) {
> > > if (file_in->f_op->copy_file_range !=
> > > > file_out->f_op->copy_file_range)
> > > > @@ -1554,6 +1588,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> > > ssize_t ret;
> > > bool splice = flags & COPY_FILE_SPLICE;
> > > > bool samesb = file_inode(file_in)->i_sb == file_inode(file_out)->i_sb;
> > > + const struct file_operations *mem_fops;
> > >
> > > if (flags & ~COPY_FILE_SPLICE)
> > > return -EINVAL;
> > > @@ -1574,18 +1609,27 @@ ssize_t vfs_copy_file_range(struct file *file_in,
> > loff_t pos_in,
> > > if (len == 0)
> > > return 0;
> > >
> > > + if (splice)
> > > + goto do_splice;
> > > +
> > > file_start_write(file_out);
> > >
> >
> > goto do_splice needs to be after file_start_write
> >
> > Please wait for feedback from vfs maintainers before posting another
> > version addressing my review comments.
> >
> Are you asking whether both the goto do_splice and the do_splice label should
> be enclosed between file_start_write and file_end_write?
No, I was just wrong; please ignore this comment.
Thanks,
Amir.