[PATCH 1/6] dma-buf: add dynamic DMA-buf handling v13

Fri Jul 19 13:30:34 UTC 2019

On Fri, Jul 19, 2019 at 12:05:36PM +0000, Koenig, Christian wrote:
> Am 19.07.19 um 11:31 schrieb Daniel Vetter:
> > On Fri, Jul 19, 2019 at 09:14:05AM +0000, Koenig, Christian wrote:
> >> Am 19.07.19 um 10:57 schrieb Daniel Vetter:
> >>> On Tue, Jul 16, 2019 at 04:21:53PM +0200, Christian König wrote:
> >>>> Am 26.06.19 um 14:29 schrieb Daniel Vetter:
> >>>> [SNIP]
> >>> Well my mail here preceded the entire amdkfd eviction_fence discussion.
> >>> With that I'm not sure anymore, since we don't really need two approaches
> >>> of the same thing. And if the plan is to move amdkfd over from the
> >>> eviction_fence trick to using the invalidate callback here, then I think
> >>> we might need some clarifications on what exactly that means.
> >> Mhm, I thought that this was orthogonal. I mean the invalidation
> >> callbacks for a buffer are independent from how the driver is going to
> >> use it in the end.
> >>
> >> Or do you mean that we could use fences and save us from adding just
> >> another mechanism for the same signaling thing?
> >>
> >> That could of course work, but I had the impression that you are not
> >> very in favor of that.
> > It won't, since you can either use the fence as the invalidate callback,
> > or as a fence (for implicit sync). But not both.
> 
> Why not both? I mean implicit sync is an artifact you need to handle 
> separately anyway.

I guess I was unclear: We need something that does implicit sync and
dynamic migration both. And I think if you do the tricky auto-migrate on
first enable_signaling like amdkfd, then you can't really use the
reservation_object for implicit sync anymore.

> > But I also don't think it's a good idea to have 2 invalidation mechanisms,
> > and since we do have one merged in-tree already, it would be good to prove
> > that the new one is up to the existing challenge.
> 
> Ok, how to proceed then? Should I fix up the implicit syncing of fences 
> first? I've got a couple of ideas for that as well.
> 
> This way we won't have any driver specific definition of what the fences 
> in a reservation object mean any more.

Yeah I think moving forward with this series here is the best plan we
have. I just think at least a poc that the amdkfd eviction/migration logic
fits into this would be really good. I do think we'll need that anyway for
gpu drivers, there's a lot of noise about switching from supplying the working
set of BOs on each CS to something more semi-permanently pinned (i.e. much
more the amdkfd model). Making sure dynamic dma-buf can cope with that
sounds like solid future proofing.

> > For context: I spent way too much time reading ttm, amdgpu/kfd and i915-gem
> > code and my overall impression is that everyone's just running around in
> > opposite directions and it's one huge hairball of a mess. With a pretty
> > even distribution of equally "eek this is horrible" but also "wow this is
> > much better than what the other driver does". So that's why I'm even more
> > on the "are we sure this is the right thing" train.
> 
> Totally agree on that, but we should also not make the mistake we have 
> seen on Windows to try to force all drivers into a common memory management.

Agreed that doesn't work either, see all the dri1 horrors.

> That didn't work out that well in the end and I would rather go down 
> the route of trying to move logic into separate components and backing 
> off into driver specific logic if we find that common stuff doesn't work.

Yeah I think a bit more reviewing/following each other's stuff, and some
better alignment in the foundation of the underlying design is good. I'm
super stoked about Gerd's series for this reason. I'm also hoping that
hmm_mirror can help a lot for userptr (which is another area that just
freaks me out, and there I think amdgpu has the currently cleanest
approach with the hmm_mirror).

I think generally more helpers and less midlayers should help too. Aiming
for that with atomic was imo the best thing we've done on the display side
since the original kms stuff - the old legacy helpers were really not
modular and easy to extend at all, kinda similar to ttm on the memory
handling side.

I have no idea what best to do with all the existing ttm drivers, since
e.g. if we bake in the "bo lock nests within mmap_sem" rule, like amdgpu
and i915 need (and really it's the only way to make this work), then
everyone else is broken. Broken in the sense of lockdep splats at least,
which is kinda worse than just letting the legacy modeset drivers quietly
pass into the night without touching them.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch