[Intel-gfx] [PATCH 1/6] dma-buf: add dynamic DMA-buf handling v12

Wed Jun 26 10:57:09 UTC 2019

On Wed, Jun 26, 2019 at 11:28 AM Koenig, Christian
<Christian.Koenig at amd.com> wrote:
>
> Am 26.06.19 um 10:17 schrieb Daniel Vetter:
> > On Wed, Jun 26, 2019 at 09:49:03AM +0200, Christian König wrote:
> >> Am 25.06.19 um 18:05 schrieb Daniel Vetter:
> >>> On Tue, Jun 25, 2019 at 02:46:49PM +0200, Christian König wrote:
> >>>> On the exporter side we add optional explicit pinning callbacks. If those
> >>>> callbacks are implemented the framework no longer caches sg tables and the
> >>>> map/unmap callbacks are always called with the lock of the reservation object
> >>>> held.
> >>>>
> >>>> On the importer side we add an optional invalidate callback. This callback is
> >>>> used by the exporter to inform the importers that their mappings should be
> >>>> destroyed as soon as possible.
> >>>>
> >>>> This allows the exporter to provide the mappings without the need to pin
> >>>> the backing store.
> >>>>
> >>>> v2: don't try to invalidate mappings when the callback is NULL,
> >>>>       lock the reservation obj while using the attachments,
> >>>>       add helper to set the callback
> >>>> v3: move flag for invalidation support into the DMA-buf,
> >>>>       use new attach_info structure to set the callback
> >>>> v4: use importer_priv field instead of mangling exporter priv.
> >>>> v5: drop invalidation_supported flag
> >>>> v6: squash together with pin/unpin changes
> >>>> v7: pin/unpin takes an attachment now
> >>>> v8: nuke dma_buf_attachment_(map|unmap)_locked,
> >>>>       everything is now handled backward compatible
> >>>> v9: always cache when export/importer don't agree on dynamic handling
> >>>> v10: minimal style cleanup
> >>>> v11: drop automatically re-entry avoidance
> >>>> v12: rename callback to move_notify
> >>>>
> >>>> Signed-off-by: Christian König <christian.koenig at amd.com>
> >>> One thing I've forgotten, just stumbled over ttm_bo->moving. For pinned
> >>> buffer sharing that's not needed, and I think for dynamic buffer sharing
> >>> it's also not going to be the primary requirement. But I think there's two
> >>> reasons we should maybe look into moving that from ttm_bo to resv_obj:
> >> That is already part of the resv_obj. The difference is that radeon is
> >> overwriting the one in the resv_obj during CS while amdgpu isn't.
> > I'm confused here: Atm ->moving isn't in resv_obj, there's only one
> > exclusive fence. And yes you need to set that every time you do a move
> > (because a move needs to be pretty exclusive access). But I'm not seeing a
> > separate not_quite_exclusive fence slot for moves.
>
> Yeah, but shouldn't that be sufficient? I mean why does somebody else
> than the exporter needs to know when a BO is moving?

I think for buffer sharing there's not much use for this, but it
sounded like you want to use ->move_notify also more internally. And
in that case, for vk, you want to be able to ignore the implicit
fences as much as possible. But you can't ignore the buffer moves ofc.
Hence tracking those separate could be useful.

amdgpu seems to be solving this internally by never attaching an
exclusive fence for implicit stuff, or something like that, except
when it's shared. But in general you need to assume a funky mix of
implicit and explicit sync'ed workloads, and for those tracking the
moves separately would be good.

> >> So for amdgpu we keep an extra copy in ttm_bo->moving to keep the page fault
> >> handler from unnecessary waiting for a fence in radeon.
> > Yeah that's the main one. The other is in CS (at least for i915) we could
> > run pipeline texture uploads in parallel with other rendering and stuff
> > like that (with multiple engines, which atm is also not there yet). I
> > think that could be somewhat useful for vk drivers.
> >
> > Anyway, totally not understand what you wanted to tell me here in these
> > two lines.
>
> Sorry it's 33C in my home office here and I mixed up radeon/amdgpu in
> the sentence above.

Same here, and yeah makes more sense.

> >>> - You sound like you want to use this a lot more, even internally in
> >>>     amdgpu. For that I do think the sepearate dma_fence just to make sure
> >>>     the buffer is accessible will be needed in resv_obj.
> >>>
> >>> - Once we have ->moving I think there's some good chances to extract a bit
> >>>     of the eviction/pipeline bo move boilerplate from ttm, and maybe use it
> >>>     in other drivers. i915 could already make use of this in upstream, since
> >>>     we already pipeline get_pages and clflush of buffers. Ofc once we have
> >>>     vram support, even more useful.
> >> I actually indeed wanted to add more stuff to the reservation object
> >> implementation, like finally cleaning up the distinction of readers/writers.
> > Hm, more details? Not ringing a bell ...
>
> I'm not yet sure about the details either, so please just wait until I
> solved that all up for me first.

Ah is this about amdgpu doing something else for implicit sync than
what's supposed to be done, and a bit a mismatch when you deal with
shared buffers?

> >> And cleaning up the fence removal hack we have in the KFD for freed up BOs.
> >> That would also allow for getting rid of this in the long term.
> > Hm, what's that for?
>
> When the KFD frees up memory it removes their eviction fence from the
> reservation object instead of setting it as signaled and adding a new
> one to all other used reservation objects.

Oh so just a fast-path for destryoing memory that's in-flight in some move?
-Daniel

>
> Christian.
>
> > -Daniel
> >
> >> Christian.
> >>
> >>> And doing that slight semantic change is much easier once we only have a
> >>> few dynamic exporters/importers. And since it's a pure opt-in optimization
> >>> (you can always fall back to the exclusive fence) it should be easy to
> >>> roll out.
> >>>
> >>> Thoughts about moving ttm_bo->moving to resv_obj? Ofc strictly only as a
> >>> follow up. Plus maybe with a clearer name :-)
> >>>
> >>> Cheers, Daniel
> >>>
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch