[PATCH v2 0/5] Introduce DMA_HEAP_ALLOC_AND_READ_FILE heap flag
Daniel Vetter
daniel.vetter at ffwll.ch
Wed Jul 31 20:46:22 UTC 2024
On Tue, Jul 30, 2024 at 08:04:04PM +0800, Huan Yang wrote:
>
> On 2024/7/30 17:05, Huan Yang wrote:
> >
> > On 2024/7/30 16:56, Daniel Vetter wrote:
> > >
> > > On Tue, Jul 30, 2024 at 03:57:44PM +0800, Huan Yang wrote:
> > > > UDMA-BUF step:
> > > > 1. memfd_create
> > > > 2. open file(buffer/direct)
> > > > 3. udmabuf create
> > > > 4. mmap memfd
> > > > 5. read file into memfd vaddr
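
For reference, a minimal userspace sketch of those five steps could look roughly like the code below (illustrative only, not taken from the patch set; error handling omitted; note that udmabuf expects the memfd to be sealed with F_SEAL_SHRINK before creation):

/*
 * Hypothetical sketch of the five quoted steps (made-up names, error
 * handling omitted): memfd_create -> open -> udmabuf create -> mmap ->
 * read.  The file read only starts after everything else is done,
 * fully serialized, which is what makes this path slow.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

static int udmabuf_then_read(const char *path, size_t size)
{
	/* 1. memfd_create (udmabuf wants a shrink-sealed memfd) */
	int memfd = memfd_create("payload", MFD_ALLOW_SEALING);
	ftruncate(memfd, size);
	fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);

	/* 2. open file (buffered here; O_DIRECT is the other option) */
	int filefd = open(path, O_RDONLY);

	/* 3. udmabuf create */
	struct udmabuf_create create = {
		.memfd = memfd, .offset = 0, .size = size,
	};
	int devfd = open("/dev/udmabuf", O_RDWR);
	int dmabuf = ioctl(devfd, UDMABUF_CREATE, &create);

	/* 4. mmap memfd */
	void *vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, memfd, 0);

	/* 5. read file into memfd vaddr, only now */
	read(filefd, vaddr, size);

	munmap(vaddr, size);
	close(devfd);
	close(filefd);
	return dmabuf;
}
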
> > > Yeah this is really slow and the worst way to do it. You absolutely want
> > > to start _all_ the io before you start creating the dma-buf, ideally with
> > > everything running in parallel. But just starting the direct I/O with
> > > async and then creating the udmabuf should be a lot faster and avoid
> > That's great. Let me rephrase that, and please correct me if I'm wrong.
> >
> > UDMA-BUF step:
> > 1. memfd_create
> > 2. mmap memfd
> > 3. open file(buffer/direct)
> > 4. start thread to async read
> > 5. udmabuf create
> >
> > With this, performance can improve.
>
> I just tested it. The steps are:
>
> UDMA-BUF step:
> 1. memfd_create
> 2. mmap memfd
> 3. open file(buffer/direct)
> 4. start thread to async read
> 5. udmabuf create
>
> 6. join/wait for the read thread
>
> Reading the 3G file with all of these steps cost 1,527,103,431 ns, which is great.
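
As a rough sketch (again with illustrative names only, error handling omitted), that reordered flow could be implemented along these lines, with the read running in a worker thread while the udmabuf is created:

/*
 * Hypothetical sketch of the reordered steps above: mmap the memfd and
 * kick off the file read in a worker thread *before* creating the
 * udmabuf, then join at the end.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/udmabuf.h>

struct read_job {
	int fd;		/* 3. source file, O_RDONLY (or O_DIRECT) */
	void *dst;	/* 2. mmap of the memfd */
	size_t size;
};

static void *read_worker(void *arg)
{
	struct read_job *job = arg;
	size_t done = 0;

	/* 4. async read into the memfd mapping */
	while (done < job->size) {
		ssize_t n = pread(job->fd, (char *)job->dst + done,
				  job->size - done, done);
		if (n <= 0)
			break;
		done += n;
	}
	return NULL;
}

static int udmabuf_with_async_read(int memfd, int filefd, size_t size)
{
	void *vaddr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, memfd, 0);
	struct read_job job = { .fd = filefd, .dst = vaddr, .size = size };
	pthread_t tid;

	pthread_create(&tid, NULL, read_worker, &job);

	/* 5. udmabuf create runs in parallel with the read */
	struct udmabuf_create create = {
		.memfd = memfd, .offset = 0, .size = size,
	};
	int devfd = open("/dev/udmabuf", O_RDWR);
	int dmabuf = ioctl(devfd, UDMABUF_CREATE, &create);

	/* 6. join/wait before handing the dma-buf to the consumer */
	pthread_join(tid, NULL);

	close(devfd);
	munmap(vaddr, size);
	return dmabuf;
}

The point is simply that the file read and the udmabuf creation overlap instead of running back to back.
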
Ok, that's almost the throughput of your patch set, which I think is close
enough. The remaining difference is probably just the mmap overhead; I'm not
sure whether/how we can do direct I/O to an fd directly ... in principle
it's possible for any file that uses the standard pagecache.
-Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch