New subsystem for acceleration devices

Oded Gabbay oded.gabbay at gmail.com
Thu Aug 4 17:53:06 UTC 2022


On Thu, Aug 4, 2022 at 6:04 PM Jeffrey Hugo <quic_jhugo at quicinc.com> wrote:
>
> On 8/4/2022 6:00 AM, Tvrtko Ursulin wrote:
> >
> > On 04/08/2022 00:54, Dave Airlie wrote:
> >> On Thu, 4 Aug 2022 at 06:21, Oded Gabbay <oded.gabbay at gmail.com> wrote:
> >>>
> >>> On Wed, Aug 3, 2022 at 10:04 PM Dave Airlie <airlied at gmail.com> wrote:
> >>>>
> >>>> On Sun, 31 Jul 2022 at 22:04, Oded Gabbay <oded.gabbay at gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Hi,
> >>>>> Greg and I talked a couple of months ago about preparing a new accel
> >>>>> subsystem for compute/acceleration devices that are not GPUs and I
> >>>>> think your drivers that you are now trying to upstream fit it as well.
> >>>>
> >>>> We've had some submissions for not-GPUs to the drm subsystem recently.
> >>>>
> >>>> Intel GNA, Intel VPU, NVDLA, rpmsg AI processor unit.
> >>>>
> >>>> why is creating a new subsystem at this time necessary?
> >>>>
> >>>> Are we just creating a subsystem to avoid the open source userspace
> >>>> consumer rules? Or do we have some concrete reasoning behind it?
> >>>>
> >>>> Dave.
> >>>
> >>> Hi Dave.
> >>> The reason it happened now is because I saw two drivers, which are
> >>> doing h/w acceleration for AI, trying to be accepted to the misc
> >>> subsystem.
> >>> Add to that the fact I talked with Greg a couple of months ago about
> >>> doing a subsystem for any compute accelerators, which he was positive
> >>> about, I thought it is a good opportunity to finally do it.
> >>>
> >>> I also honestly think that I can contribute much to these drivers from
> >>> my experience with the habana driver (which is now deployed en masse at
> >>> AWS) and contribute code from the habana driver to a common framework
> >>> for AI drivers.
> >>
> >> Why not port the habana driver to drm now instead? I don't get why it
> >> wouldn't make sense?
> >>
> >> Stepping up to create a new subsystem is great, but we need rules
> >> around what belongs where, we can't just spawn new subsystems when we
> >> have no clear guidelines on where drivers should land.
> >>
> >> What are the rules for a new accel subsystem? Do we have to now
> >> retarget the 3 drivers that are queued up to use drm for accelerators,
> >> because 2 drivers don't?
> >
> > Aren't there three on the "don't prefer drm" side as well? Habana,
> > Toshiba and Samsung? Just so the numbers argument is not misrepresented.
> > Perhaps a poll like a) prefer DRM, b) prefer a new subsystem, c) don't
> > care in principle; is in order?
>
> I'll chime in with my opinions.  Take them for what you will.
>
> I would say I fall into the C category, but I'm targeting DRM and will
> be the 5th(?) accel device to do so.
>
> I'll say that the ksummit (from what I see in the LWN article) made me
> very happy.  Finally, the community had clear rules for accel drivers.
> When I targeted misc in the past, it seemed like Greg moved the goal
> post just for me, which stalled our attempt.  It was even more
> frustrating to see that the high bar Greg set for us was not applied to
> other devices of the same "class" in following submissions.
>
> However, the past is the past, and based on ksummit, we've spent a
> number of months retargeting DRM.  In a week (or two), I plan to post
> something to start up the discussions again.
>
> As far as the DRM userspace requirements, unless we've misunderstood
> something, they've been easier to satisfy (pending review I suppose)
> than what misc has set.
I think it is quite the opposite. Originally, misc had very minimal
userspace requirements, but when my driver started to use dma-buf,
Dave asked for more.
For example, a driver that wants to be accepted into DRM and uses a
fork of LLVM must not only open-source its code, but also upstream
that fork to the mainline LLVM tree. In misc there is nothing that
comes close to that requirement, afaik.
>
> I would say that Dave Airlie's feedback on this discussion resonates
> with me.  From the perspective of a vendor wanting to be a part of the
> community, clear rules are important and ksummit seemed to set that.
> Oded's announcement has thrown all of that into the wind.  Without a
That wasn't my intention. I simply wanted to:
1. Take these types of drivers off Greg's plate.
2. Offer the new drivers standard char device handling.
3. Start a community of kernel hackers who write device drivers
for compute accelerators.

> proposal to evaluate (eg show me the code with clear guidelines), I
> cannot seriously consider Oded's idea, and I'm not sure I want to sit by
> another few years to see it settle out.
I thought of posting something quick (but not dirty), but this backlash
has made me rethink that.

>
> I expect to move forward with what we were planning prior to seeing this
> thread which is targeting DRM.  We'll see what the DRM folks say when
> they have something to look at.  If our device doesn't fit in DRM per an
> assessment of the DRM folks, then I sure hope they can suggest where we
> do fit because then we'll have tried misc and DRM, and not found a home.
>   Since "drivers/accel" doesn't exist, and realistically won't for a
> long time if ever, I don't see why we should consider it.
>
> Why DRM?  We consume dma_buf and might look to p2pdma in the future.
> ksummit appears clear - we are a DRM device.  Also, someone could
> probably run openCL on our device if they were so inclined to wire it
> up.  Over time, I've come to the thinking that we are a GPU, just
> without display.  Yes, it would have helped if DRM and/or drivers/gpu
> were renamed, but I think I'm past that point.  Once you have everything
> written, it doesn't seem like it matters if the uAPI device is called
> /dev/drmX, /dev/miscX, or /dev/magic.
>
> I will not opine on other devices as I am no expert on them.  Today, my
> opinion is that DRM is the best place for me.  We'll see where that goes.
>
> > More to the point, code sharing is a very compelling argument if it can
> > be demonstrated to be significant, aka not needing to reinvent the same
> > wheel.
> >
> > Perhaps one route forward could be a) to consider renaming DRM to
> > something more appropriate, removing rendering from the name and
> > replacing with accelerators, co-processors, I don't know... Although I
> > am not sure renaming the codebase, character device node names and
> > userspace headers is all that feasible. Thought to mention it
> > nevertheless, maybe it gives an idea to someone how it could be done.
> >
> > And b) allow the userspace rules to be considered per driver, or per
> > class (is it a gpu or not should be a question that can be answered).
> > Shouldn't be a blocker if it still matches the rules present elsewhere
> > in the kernel.
> >
> > Those two would remove the two most contentious points as far as I
> > understood the thread.
> >
> > Regards,
> >
> > Tvrtko
> >
>

