[PATCH v4 1/7] accel/ivpu: Introduce a new DRM driver for Intel VPU

Daniel Vetter daniel at ffwll.ch
Fri Jan 6 09:28:15 UTC 2023


On Thu, Jan 05, 2023 at 07:38:26PM +0200, Oded Gabbay wrote:
> On Thu, Jan 5, 2023 at 6:25 PM Jeffrey Hugo <quic_jhugo at quicinc.com> wrote:
> >
> > On 1/5/2023 5:57 AM, Daniel Vetter wrote:
> > > On Thu, Dec 08, 2022 at 12:07:27PM +0100, Jacek Lawrynowicz wrote:
> > >> +static const struct drm_driver driver = {
> > >> +    .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL,
> > >
> > > So I was wondering whether this is a bright idea, and whether we shouldn't
> > > just go ahead and infuse more meaning into accel vs render nodes.
> > >
> > > The uapi relevant part of render nodes is that they're multi-user safe, at
> > > least as much as feasible. Every new open() gives you a new private
> > > accelerator. This also has implications on how userspace drivers iterate
> > > them, they just open them all in turn and check whether it's the right
> > > one - because userspace apis allow applications to enumerate them all.
> > > Which also means that any devicie initialization at open() time is a
> > > really bad idea.
> > >
> > > A lot of the compute accelerators otoh (well habanalabs) are single user,
> > > init can be done at open() time because you only open this when you
> > > actually know you're going to use it.
> > >
> > > So given this, shouldn't multi-user inference engines be more like render
> > > drivers, and less like accel? So DRIVER_RENDER, but still under
> > > drivers/accel.
> > >
> > > This way that entire separate /dev node would actually become meaningful
> > > beyond just the basic bikeshed:
> > > - render nodes are multi user, safe to iterate and open() just for
> > >    iteration
> > > - accel nodes are single user, you really should not ever open them unless
> > >    you want to use them
> > >
> > > Of course would need a doc patch :-)
> > >
> > > Thoughts?
> > > -Daniel
> >
> > Hmm.
> >
> > I admit, I thought DRIVER_ACCEL was the same as DRIVER_RENDER, except
> > that DRIVER_ACCEL dropped the "legacy" dual node setup and also avoided
> > "legacy" userspace.
> >
> > qaic is multi-user.  I thought habana was the same, at-least for
> > inference.  Oded, am I wrong?
> Habana's devices support a single user at a time acquiring the device
> and working on it.
> Both for training and inference.
> >
> > So, if DRIVER_ACCEL is for single-user (training?), and multi-user ends
> > up in DRIVER_RENDER, that would seem to mean qaic ends up using
> > DRIVER_RENDER and not DRIVER_ACCEL.  Then qaic ends up over under
> > /dev/dri with both a card node (never used) and a render node.  That
> > would seem to mean that the "legacy" userspace would open qaic nodes by
> > default - something I understood Oded was trying to avoid.
> >
> > If there really a usecase for DRIVER_ACCEL to support single-user?  I
> > wonder why we can't default to multi-user, and if a particular
> > user/driver has a single-user usecase, it enforces that in a driver
> > specific manner?
> >
> > -Jeff
> 
> Honestly, Daniel, I don't like this suggestion. I don't understand why
> you make a connection between render/accel to single/multi user.
> 
> As Jeff has said, one of the goals was to expose accelerator devices
> to userspace with new device char nodes so we won't be bothered by
> legacy userspace graphics software. This is something we all agreed on
> and I don't see why we should change it now, even if you think it's
> bike-shedding (which I disagree with).
> 
> But in any case, creating a new device char nodes had nothing to do
> with whether the device supports single or multi user. I can
> definitely see in the future training devices that support multiple
> users.
> 
> The common drm/accel ioctls should of course not be limited to a
> single user, and I agree with Jeff here, if a specific driver has such
> a limitation (e.g. Habana), then that driver should handle it on its
> own.
> Maybe if there will be multiple drivers with such a limitation, we can
> make that "handling" to be common code.
> 
> Bottom line, I prefer we keep things as we all agreed upon in LPC.

The problem is going to happen as soon as you have cross-vendor userspace.
Which I'm kinda hoping is at least still the aspiration. Because with
cross-vendor userspace you generally iterate & open all devices before you
select the one you're going to use. And so we do kinda need a distinction,
or we need that the single-user drivers also guarantee that open() is
cheap.

With just habana and vpu there's no problem because you'll never see
these in the same machine.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list