New subsystem for acceleration devices

Jason Gunthorpe jgg at nvidia.com
Thu Aug 4 14:50:29 UTC 2022


On Thu, Aug 04, 2022 at 10:43:42AM +0300, Oded Gabbay wrote:

> After all, memory management services, or common device chars handling
> I can get from other subsystems (e.g. rdma) as well. I'm sure I could
> model my uAPI to be rdma uAPI compliant (I can define proprietary uAPI
> there as well), but this doesn't mean I belong there, right ?

You sure can, but there is still an expectation, eg in RDMA, that your
device has a similarity to the established standards (like roce in
habana's case) that RDMA is geared to support.

I think the the most important thing to establish a new subsystem is
to actually identify what commonalities it is supposed to be
providing. Usually this is driven by some standards body, but the
AI/ML space hasn't gone in that direction at all yet.

We don't need a "subsystem" to have a bunch of drivers expose chardevs
with completely unique ioctls.

The flip is true of DRM - DRM is pretty general. I bet I could
implement an RDMA device under DRM - but that doesn't mean it should
be done.

My biggest concern is that this subsystem not turn into a back door
for a bunch of abuse of kernel APIs going forward. Though things are
better now, we still see this in DRM where expediency or performance
justifies hacky shortcuts instead of good in-kernel architecture. At
least DRM has reliable and experienced review these days.

Jason


More information about the dri-devel mailing list