[RFC PATCH v2 0/3] RDMA: add dma-buf support

Jason Gunthorpe jgg at ziepe.ca
Thu Jul 2 13:29:53 UTC 2020


On Thu, Jul 02, 2020 at 03:10:00PM +0200, Daniel Vetter wrote:
> On Wed, Jul 01, 2020 at 02:15:24PM -0300, Jason Gunthorpe wrote:
> > On Wed, Jul 01, 2020 at 05:42:21PM +0200, Daniel Vetter wrote:
> > > > >> All you need is the ability to stop, wait for ongoing accesses to end and
> > > > >> make sure that new ones grab a new mapping.
> > > > > Swap and flush isn't a general HW ability either..
> > > > >
> > > > > I'm unclear how this could be useful, it is guaranteed to corrupt
> > > > > in-progress writes?
> > > > >
> > > > > Did you mean pause, swap and resume? That's ODP.
> > > >
> > > > Yes, something like this. And good to know, never heard of ODP.
> > > 
> > > Hm I thought ODP was full hw page faults at an individual page
> > > level,
> > 
> > Yes
> > 
> > > and this stop&resume is for the entire nic. Under the hood both apply
> > > back-pressure on the network if a transmission can't be received,
> > > but
> > 
> > NICs don't do stop and resume, blocking the Rx pipe is very
> > problematic and performance destroying.
> > 
> > The strategy for something like ODP is more complex, and so far no NIC
> > has deployed it at any granularity larger than per-page.
> > 
> > > So since Jason really doesn't like dma_fence much I think for rdma
> > > synchronous it is. And it shouldn't really matter, since waiting for a
> > > small transaction to complete at rdma wire speed isn't really that
> > > long an operation. 
> > 
> > Even if DMA fence were to somehow be involved, how would it look?
> 
> Well above you're saying it would be performance destroying, but let's
> pretend that's not a problem :-) Also, I have no clue about rdma, so this
> is really just the flow we have on the gpu side.

I see, no, this is not workable, the command flow in RDMA is not at
all like GPU - what you are proposing is a global 'stop the whole
chip' of the Tx and Rx flows for an undetermined time. Not feasible.

What we can do is use ODP techniques and pause only the MR attached to
the DMA buf with the process you outline below. This is not so hard to
implement.
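
To make the per-MR idea concrete, here is a rough userspace toy model in C.
It is not kernel code and not any driver's actual implementation: the
struct, the state names and the toy_mr_* functions are all hypothetical,
and the "mapping" is just a number standing in for a DMA address. It only
shows that invalidating one MR stalls accesses to that MR while other MRs
keep working, and that access resumes once a new mapping is installed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one MR bound to one dma-buf mapping. */
enum toy_mr_state { TOY_MR_VALID, TOY_MR_PAUSED };

struct toy_mr {
	enum toy_mr_state state;
	unsigned long mapping;	/* stand-in for the buffer's DMA address */
	unsigned int faults;	/* accesses that arrived while paused */
};

/* The exporter moves the buffer: pause only this MR and drop its mapping. */
static void toy_mr_invalidate(struct toy_mr *mr)
{
	mr->state = TOY_MR_PAUSED;
	mr->mapping = 0;
}

/* The buffer is valid again: install the new mapping and resume this MR. */
static void toy_mr_revalidate(struct toy_mr *mr, unsigned long new_mapping)
{
	assert(new_mapping != 0);
	mr->mapping = new_mapping;
	mr->state = TOY_MR_VALID;
}

/*
 * A device access. On a valid MR it returns the current mapping; on a
 * paused MR it is counted as a fault and fails, so only traffic that
 * targets this MR is held back.
 */
static bool toy_mr_access(struct toy_mr *mr, unsigned long *addr_out)
{
	if (mr->state != TOY_MR_VALID) {
		mr->faults++;
		return false;
	}
	*addr_out = mr->mapping;
	return true;
}
```

With two MRs, calling toy_mr_invalidate() on the first makes
toy_mr_access() fail for it while the second still succeeds; after
toy_mr_revalidate() the first returns its new mapping. How a real NIC
handles the faulting access (retry, RNR NAK, etc.) is left out here.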

> 3. rdma driver worker gets busy to restart rx:
> 	1. lock all dma-buf that are currently in use (dma_resv_lock).
> 	thanks to ww_mutex deadlock avoidance this is possible

Why all? Why not just lock the one that was invalidated to restore the
mappings? Is that some artifact of the GPU approach?

And why is this done with work queues and locking instead of a
callback saying the buffer is valid again?

Jason

