dma-resv ongoing discussion

Dave Airlie airlied at gmail.com
Mon May 24 02:03:53 UTC 2021


I'd like to try and summarise where I feel we are all at with respect
to the dma-buf discussions. I think I've gotten a fairly good idea of
how things stand, but I'm not sure we are really getting to the "how to
move things forward" stage, which is probably when I need to step in.
Thanks for keeping this as respectful as it has been; I understand it
can be difficult. I also think we are starting to find we moved the
knob on driver development happening in company silos too far with
acceleration features, and hopefully with this and the TTM work etc. we
can start to push back to upstream-first designs.

I think Jason[1] summed up my feelings on this the best. We have a
dma-buf inter-driver contract that has a design issue. We didn't fix
that initially, now we have amdgpu as the outlier in a world where
everyone else agreed to the contract.

a) Christian wants to try and move forward with fixing the world of
dma-buf design across all drivers, but hasn't come up with a plan for
doing so apart from amdgpu/i915. I think one strength Daniel has here
is that he's good at coming up with plans that change the ecosystem.
I'd really like to see some concrete effort to work out how much work
fixing this across the ecosystem is and whether it is possible. I
expect Daniel's big huge monster commit message summary of the current
drivers is a great place to start for this. That is, if we can agree
that dma-buf is broken and on what dma-buf should look like tomorrow.

b) Daniel is coming from the side of let's bring amdgpu into the fold
first, then if the problem exists we can move everything forward
together. He intends to point out how alone amdgpu is here, and
wants to try and create a uapi that at least mitigates the biggest
problems with moving amdgpu to the common model first. I'd like to
know if this is at least a possibility as an alternate route. I
understand AMD have some goals to reach here but I think we've dug a
massive hole here and paying off the tech debt is going to have to
delay those goals if we are to keep upstream sane.

I'm slowly paging in all of the technical details as I go. I'd like to
see more thought around Daniel's idea of fixing the amdgpu oversync
with TLB flushing, as it really doesn't make much sense to me that TLB
flushing on process teardown is going to stall out other processes
using the shared buffer; it should only stall out moving the
pages. If that then allows aligning amdgpu for now and we can work out
how to fix (a), then that would rock.

Please correct me where I'm wrong here and definitely if I've
misrepresented anyone's positions.

Dave.


[1] https://lore.kernel.org/dri-devel/a1925038-5c3c-0193-1870-27488caa2577@gmail.com/T/#md800f00476ca1869a81b02a28cb2fabc1028c6be

