[Mesa-dev] RFC - libglvnd and GLXVND vendor enumeration to facilitate GLX multi-vendor PRIME GPU offload

Mon Feb 11 21:51:58 UTC 2019

On Fri, Feb 08, 2019 at 03:43:25PM -0700, Kyle Brenneman wrote:
> On 2/8/19 2:33 PM, Andy Ritger wrote:
> > On Fri, Feb 08, 2019 at 03:01:33PM -0500, Adam Jackson wrote:
> > > On Fri, 2019-02-08 at 10:19 -0800, Andy Ritger wrote:
> > > 
> > > > (1) If configured for PRIME GPU offloading (environment variable or
> > > >      application profile), client-side libglvnd could load the possible
> > > >      libGLX_${vendor}.so libraries it finds, and call into each to
> > > >      find which vendor (and possibly which GPU) matches the specified
> > > >      string. Once a vendor is selected, the vendor library could optionally
> > > >      tell the X server which GLX vendor to use server-side for this
> > > >      client connection.
> > > I'm not a huge fan of the "dlopen everything" approach, if it can be
> > > avoided.
> > Yes, I agree.
> I'm pretty sure libglvnd could avoid unnecessarily loading vendor libraries
> without adding nearly so much complexity.
> 
> If libglvnd just has a list of additional vendor library names to try, then
> you could just have a flag to tell libglvnd to check some server string for
> that name before it loads the vendor. If a client-side vendor would need a
> server-side counterpart to work, then libglvnd can check for that. The
> server only needs to keep a list of names to send back, which would be a
> trivial (and backward-compatible) addition to the GLXVND interface.
> 
> Also, even without that, I don't think the extra dlopen calls would be a
> problem in practice. It would only ever happen in applications that are
> configured for offloading, which are (more-or-less by definition)
> heavy-weight programs, so an extra millisecond or so of startup time is
> probably fine.

But why incur that loading if we don't need to?

> > > I think I'd rather have a new enum for GLXQueryServerString
> > > that elaborates on GLX_VENDOR_NAMES_EXT (perhaps GLX_VENDOR_MAP_EXT),
> > > with the returned string a space-delimited list of <profile>:<vendor>.
> > > libGL could accept either a profile or a vendor name in the environment
> > > variable, and the profile can be either semantic like
> > > performance/battery, or a hardware selector, or whatever else.
> > > 
> > > This would probably be a layered extension, call it GLX_EXT_libglvnd2,
> > > which you'd check for in the (already per-screen) server extension
> > > string before trying to actually use.
> > That all sounds reasonable to me.
> > 
> > > > At the other extreme, the server could do nearly all the work of
> > > > generating the possible __GLX_VENDOR_LIBRARY_NAME strings (with the
> > > > practical downside of each server-side GLX vendor needing to enumerate
> > > > the GPUs it can drive, in order to generate the hardware-specific
> > > > identifiers).
> > > I don't think this downside is much of a burden? If you're registering
> > > a provider other than Xorg's you're already doing it from the DDX
> > > driver (I think? Are y'all doing that from your libglx instead?), and
> > > when that initializes it already knows which device it's driving.
> > Right.  It will be easy enough for the NVIDIA X driver + NVIDIA server-side GLX.
> > 
> > Kyle and I were chatting about this, and we weren't sure whether people
> > would object to doing that for the Xorg GLX provider: to create the
> > hardware names, Xorg's GLX would need to enumerate all the DRM devices
> > and list them all as possible <profile>:<vendor> pairs for the Xorg
> > GLX-driven screens.  But, now that I look at it more closely, it looks
> > like drmGetDevices2() would work well for that.
> > 
> > So, if you're not concerned with that burden, I'm not.  I'll try coding
> > up the Xorg GLX part of things and see how it falls into place.
> That actually is one of my big concerns: I'd like to come up with something
> that can give something equivalent to Mesa's existing DRI_PRIME setting, and
> requiring that logic to be in the server seems like a very poor match. You'd
> need to take all of the device selection and enumeration stuff from Mesa and
> transplant it into the Xorg GLX module, and then you'd need to define some
> sort of protocol to get that data back into Mesa where you actually need it.
> Or else you need to duplicate it between the client and server, which seems
> like the worst of both worlds.

Is this actually a lot of code?  I'll try to put together a prototype so
we can see how much it is, but if it is just calling drmGetDevices2() and
then building PCI BusID-based names, that doesn't seem unreasonable to me.
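
As a rough illustration of how small the name-building half could be, here
is a minimal sketch (not the prototype itself). Everything in it is an
assumption made for illustration: the struct pci_addr type stands in for the
PCI domain/bus/dev/func values that drmGetDevices2() reports for each device,
the build_vendor_map() helper is hypothetical, and the "pci-DDDD_BB_DD_F"
spelling is just one possible naming scheme, chosen here because it contains
no ':' and so cannot collide with the <profile>:<vendor> separator.

```c
#include <stdio.h>

/* Stand-in for the per-device PCI address (domain, bus, dev, func).
 * Defined locally so the sketch is self-contained; a real implementation
 * would presumably fill it from the devices drmGetDevices2() returns. */
struct pci_addr {
    unsigned domain;
    unsigned bus;
    unsigned dev;
    unsigned func;
};

/* Hypothetical helper: write a space-delimited list of
 * "pci-DDDD_BB_DD_F:<vendor>" entries, one per device, into out.
 * Returns 1 on success, 0 if the buffer is too small. */
static int build_vendor_map(const struct pci_addr *devs, size_t ndevs,
                            const char *vendor, char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < ndevs; i++) {
        /* Separate entries with a single space, but not before the first. */
        int n = snprintf(out + used, outlen - used,
                         "%spci-%04x_%02x_%02x_%x:%s",
                         i ? " " : "",
                         devs[i].domain, devs[i].bus,
                         devs[i].dev, devs[i].func, vendor);
        if (n < 0 || (size_t)n >= outlen - used)
            return 0; /* truncated or encoding error */
        used += (size_t)n;
    }
    return 1;
}
```

Under these assumptions, two devices at 0000:01:00.0 and 0000:00:02.0 with
vendor "mesa" would produce
"pci-0000_01_00_0:mesa pci-0000_00_02_0:mesa".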

> By comparison, if libglvnd just hands the problem off to the vendor
> libraries, then you could do either. A vendor library could do its device
> enumeration in the client like Mesa does, or it could send a request to
> query something from the server, using whatever protocol you want --
> whatever makes the most sense for that particular driver.
> 
> More generally, I worry that defining a (vendor+device+descriptor) list as
> an interface between libglvnd and the server means baking in a lot of
> unnecessary assumptions and requirements for drivers that we could otherwise
> avoid without losing any functionality.

Is the GLX_VENDOR_MAP_EXT string that ajax described really that constraining?
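
To make the question concrete, the client-side matching against such a
string might look roughly like the sketch below. It assumes the string is a
space-delimited list of <profile>:<vendor> entries, that the user's setting
may name either half, and that profile names contain no ':' of their own.
The vendor_map_lookup() name and its signature are hypothetical and are not
part of libglvnd.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search a space-delimited list of <profile>:<vendor>
 * entries for one whose profile or vendor equals key. On a match, copy the
 * vendor name into out and return 1; otherwise return 0.
 * Assumes profile names contain no ':' (the first ':' splits the entry). */
static int vendor_map_lookup(const char *map, const char *key,
                             char *out, size_t outlen)
{
    size_t keylen = strlen(key);
    const char *p = map;

    while (*p != '\0') {
        /* Skip separators, then measure this entry. */
        p += strspn(p, " ");
        size_t entrylen = strcspn(p, " ");
        if (entrylen == 0)
            break;

        const char *colon = memchr(p, ':', entrylen);
        if (colon != NULL) {
            size_t proflen = (size_t)(colon - p);
            const char *vendor = colon + 1;
            size_t vendlen = entrylen - proflen - 1;

            int match =
                (proflen == keylen && memcmp(p, key, keylen) == 0) ||
                (vendlen == keylen && memcmp(vendor, key, keylen) == 0);

            /* Only report a match if the vendor name fits in out. */
            if (match && vendlen > 0 && vendlen < outlen) {
                memcpy(out, vendor, vendlen);
                out[vendlen] = '\0';
                return 1;
            }
        }
        p += entrylen;
    }
    return 0;
}
```

With a map of "performance:nvidia battery:mesa", a key of "battery" resolves
to the vendor "mesa", and a key of "nvidia" resolves to "nvidia". Nothing in
the lookup depends on what a profile name means, so semantic names and
hardware selectors can share the same list.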

> Also, is Mesa the only client-side vendor library that works with the Xorg
> GLX module? I vaguely remember that there was at least one other driver that
> did, but I don't remember the details anymore.
> 
> 
> > 
> > Two follow-up questions:
> > 
> > (1) Even when direct-rendering, NVIDIA's OpenGL/GLX implementation sends
> >      GLX protocol (MakeCurrent, etc).  So, we'd like something client-side
> >      to be able to request that server-side GLXVND route GLX protocol for the
> >      calling client connection to a specific vendor (on a per-screen basis).
> >      Do you think it would be reasonable for GLX_EXT_libglvnd2 to define a
> >      new protocol request, that client-side libglvnd uses, and sends either
> >      the profile or vendor name from the selected '<profile>:<vendor>'?
> > 
> > (2) Who should decide which vendor/gpu gets the semantic name
> >      "performance" or "battery"?  They are relative, so I don't know that
> >      vendors can decide for themselves in isolation.  It kind of feels
> >      like it should be GLXVND's job, but I don't know that it has enough
> >      context to infer.  I'm curious if anyone else has ideas.
> Ultimately, I think this should be up to the user. If you've got a system
> with two high-power GPUs, then both of them would have a reasonable claim
> for "performance," but a user ought to be able to designate one or the
> other.

It should definitely be user configurable, but it would be nice if
we could make a reasonable inference in the case that the user hasn't
provided explicit configuration.

> > Thanks,
> > - Andy
> > 
> > 
> > > - ajax
>