[RFC PATCH v3 1/4] RDMA/umem: Support importing dma-buf as user memory region
Jason Gunthorpe
jgg at ziepe.ca
Mon Oct 5 16:33:04 UTC 2020
On Mon, Oct 05, 2020 at 04:18:11PM +0000, Xiong, Jianxin wrote:
> > The implementation in mlx5 will be much more understandable, it would just do dma_buf_dynamic_attach() and program the XLT exactly
> > the same as a normal umem.
> >
> > The move_notify() simply zaps the XLT and triggers a work to reload it after the move. Locking is provided by the dma_resv_lock. Only a
> > small disruption to the page fault handler is needed.
>
> We considered such a scheme but didn't go that way due to the lack
> of notification when the move is done; the work wouldn't know when
> it can reload.
Well, the work would block on the reservation lock, and that
indicates the move is done.
It would be nicer if the dma_buf could provide an op indicating that
things are ready to go, though.
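Roughly something like this for the reload work (sketch only; the
dmabuf_mr struct and program_xlt() are made-up names):

	static void reload_work(struct work_struct *work)
	{
		struct dmabuf_mr *mr =
			container_of(work, struct dmabuf_mr, work);
		struct dma_buf *dmabuf = mr->attach->dmabuf;
		struct sg_table *sgt;

		/* Blocks here until any move in progress is finished */
		dma_resv_lock(dmabuf->resv, NULL);
		sgt = dma_buf_map_attachment(mr->attach, DMA_BIDIRECTIONAL);
		if (!IS_ERR(sgt))
			/* Program the XLT while still holding the lock */
			program_xlt(mr, sgt);
		dma_resv_unlock(dmabuf->resv);
	}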
> Now that I think about it again, we could probably signal the reload in the page fault handler.
This also works, with a performance cost.
> > > +	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> > > +	sgt = dma_buf_map_attachment(umem_dmabuf->attach,
> > > +				     DMA_BIDIRECTIONAL);
> > > +	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> >
> > This doesn't look right; this lock has to be held until the HW is programmed
>
> The mapping remains valid until being invalidated again. There is a
> sequence number check before programming the HW.
It races; we could immediately trigger invalidation and then
re-program the HW with this stale data.
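Ie nothing prevents this interleaving (sketch):

	CPU0				CPU1
	dma_resv_lock()
	sgt = dma_buf_map_attachment()
	dma_resv_unlock()
	seq check passes
					move_notify() -> zap
	program HW with the stale sgt

Holding the resv lock until the HW is programmed closes the window.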
> > The use of atomic is probably wrong as well.
>
> Do you mean umem_dmabuf->notifier_seq? Could you elaborate on the concern?
It only increments once per invalidation; that is usually racy.
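For comparison, the mmu_interval_notifier pattern that regular ODP
relies on re-checks the sequence under the same lock the invalidate
side takes, and retries if it changed while the pages were being
mapped. Roughly:

	again:
		seq = mmu_interval_read_begin(&umem_odp->notifier);
		/* ... fault and map the pages ... */
		mutex_lock(&umem_odp->umem_mutex);
		if (mmu_interval_read_retry(&umem_odp->notifier, seq)) {
			mutex_unlock(&umem_odp->umem_mutex);
			goto again;
		}
		/* Now it is safe to program the HW */
		mutex_unlock(&umem_odp->umem_mutex);

A counter that is bumped once per invalidation can't give that
guarantee.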
> > > +	total_pages = ib_umem_odp_num_pages(umem_odp);
> > > +	for_each_sg(umem->sg_head.sgl, sg, umem->sg_head.nents, j) {
> > > +		addr = sg_dma_address(sg);
> > > +		pages = sg_dma_len(sg) >> page_shift;
> > > +		while (pages > 0 && k < total_pages) {
> > > +			umem_odp->dma_list[k++] = addr | access_mask;
> > > +			umem_odp->npages++;
> > > +			addr += page_size;
> > > +			pages--;
> >
> > This isn't fragmenting the sg into a page list properly; it won't work for unaligned things
>
> I thought the addresses were aligned, but I will add explicit alignment here.
I have no idea what comes out of dma_buf; I wouldn't make too many
assumptions, since it does have to pass through the IOMMU layer too.
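If a page list is kept at all, the fragmentation has to tolerate an
unaligned segment, something like (sketch):

	for_each_sg(umem->sg_head.sgl, sg, umem->sg_head.nents, j) {
		u64 start = ALIGN_DOWN(sg_dma_address(sg), page_size);
		u64 end = ALIGN(sg_dma_address(sg) + sg_dma_len(sg),
				page_size);

		for (addr = start; addr < end && k < total_pages;
		     addr += page_size) {
			umem_odp->dma_list[k++] = addr | access_mask;
			umem_odp->npages++;
		}
	}

But see below, I don't think the dma_list belongs in this flow at all.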
> > And really we don't need the dma_list for this case, with a fixed
> > whole mapping DMA SGL a normal umem sgl is OK and the normal umem
> > XLT programming in mlx5 is fine.
>
> The dma_list is used by both "populate_mtt()" and
> "mlx5_ib_invalidate_range", which are used for XLT programming and
> invalidating (zapping), respectively.
Don't use those functions for the dma_buf flow.
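For the dma_buf flow in mlx5 the move_notify() side should only need
to zap and kick the reload work, something like (sketch; the
dmabuf_mr wrapper is a made-up name, and the exporter already holds
the resv lock when calling this):

	static void dmabuf_mr_move_notify(struct dma_buf_attachment *attach)
	{
		struct dmabuf_mr *mr = attach->importer_priv;

		/* Zap the XLT; the work re-fills it after the move */
		mlx5_ib_update_xlt(mr->mmr, 0, mr->npages, PAGE_SHIFT,
				   MLX5_IB_UPD_XLT_ZAP |
				   MLX5_IB_UPD_XLT_ATOMIC);
		queue_work(system_wq, &mr->work);
	}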
Jason