[Mesa-dev] RFC - libglvnd and GLXVND vendor enumeration to facilitate GLX multi-vendor PRIME GPU offload

Wed Feb 13 18:20:58 UTC 2019

On 02/11/2019 02:51 PM, Andy Ritger wrote:
> On Fri, Feb 08, 2019 at 03:43:25PM -0700, Kyle Brenneman wrote:
>> On 2/8/19 2:33 PM, Andy Ritger wrote:
>>> On Fri, Feb 08, 2019 at 03:01:33PM -0500, Adam Jackson wrote:
>>>> On Fri, 2019-02-08 at 10:19 -0800, Andy Ritger wrote:
>>>>
>>>>> (1) If configured for PRIME GPU offloading (environment variable or
>>>>>       application profile), client-side libglvnd could load the possible
>>>>>       libGLX_${vendor}.so libraries it finds, and call into each to
>>>>>       find which vendor (and possibly which GPU) matches the specified
>>>>>       string. Once a vendor is selected, the vendor library could optionally
>>>>>       tell the X server which GLX vendor to use server-side for this
>>>>>       client connection.
>>>> I'm not a huge fan of the "dlopen everything" approach, if it can be
>>>> avoided.
>>> Yes, I agree.
>> I'm pretty sure libglvnd could avoid unnecessarily loading vendor libraries
>> without adding nearly so much complexity.
>>
>> If libglvnd just has a list of additional vendor library names to try, then
>> you could just have a flag to tell libglvnd to check some server string for
>> that name before it loads the vendor. If a client-side vendor would need a
>> server-side counterpart to work, then libglvnd can check for that. The
>> server only needs to keep a list of names to send back, which would be a
>> trivial (and backward-compatible) addition to the GLXVND interface.
>>
>> Also, even without that, I don't think the extra dlopen calls would be a
>> problem in practice. It would only ever happen in applications that are
>> configured for offloading, which are (more-or-less by definition)
>> heavy-weight programs, so an extra millisecond or so of startup time is
>> probably fine.
> But why incur that loading if we don't need to?
As I noted, we can still avoid the extra loads even with an (almost)
strictly client-based design. You don't need any sort of server-based
device enumeration; all you need is something in the server to add a
string to a list that the client can query.

But, there's no reason that query can't be optional, and there's no 
reason it has to be coupled with anything else.
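
To make that concrete, here is a minimal sketch of the client-side check,
assuming the server hands back a space-delimited list of vendor names. The
helper name and the list format are illustrative assumptions, not an existing
libglvnd or GLXVND interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole token in
 * `list`, a space-delimited string of vendor names reported by the server. */
static bool vendor_in_server_list(const char *list, const char *name)
{
    size_t name_len;

    if (list == NULL || name == NULL || name[0] == '\0') {
        return false;
    }
    name_len = strlen(name);

    while (*list != '\0') {
        size_t tok_len;

        list += strspn(list, " ");     /* skip separators */
        tok_len = strcspn(list, " ");  /* length of the next token */
        if (tok_len == name_len && strncmp(list, name, tok_len) == 0) {
            return true;
        }
        list += tok_len;
    }
    return false;
}
```

libglvnd would only dlopen libGLX_${vendor}.so for names that pass this
check, and could skip the check entirely when the server doesn't offer the
list, which keeps the query optional.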

>
>>>> I think I'd rather have a new enum for GLXQueryServerString
>>>> that elaborates on GLX_VENDOR_NAMES_EXT (perhaps GLX_VENDOR_MAP_EXT),
>>>> with the returned string a space-delimited list of <profile>:<vendor>.
>>>> libGL could accept either a profile or a vendor name in the environment
>>>> variable, and the profile can be either semantic like
>>>> performance/battery, or a hardware selector, or whatever else.
>>>>
>>>> This would probably be a layered extension, call it GLX_EXT_libglvnd2,
>>>> which you'd check for in the (already per-screen) server extension
>>>> string before trying to actually use.
>>> That all sounds reasonable to me.
>>>
>>>>> At the other extreme, the server could do nearly all the work of
>>>>> generating the possible __GLX_VENDOR_LIBRARY_NAME strings (with the
>>>>> practical downside of each server-side GLX vendor needing to enumerate
>>>>> the GPUs it can drive, in order to generate the hardware-specific
>>>>> identifiers).
>>>> I don't think this downside is much of a burden? If you're registering
>>>> a provider other than Xorg's you're already doing it from the DDX
>>>> driver (I think? Are y'all doing that from your libglx instead?), and
>>>> when that initializes it already knows which device it's driving.
>>> Right.  It will be easy enough for the NVIDIA X driver + NVIDIA server-side GLX.
>>>
>>> Kyle and I were chatting about this, and we weren't sure whether people
>>> would object to doing that for the Xorg GLX provider: to create the
>>> hardware names, Xorg's GLX would need to enumerate all the DRM devices
>>> and list them all as possible <profile>:<vendor> pairs for the Xorg
>>> GLX-driven screens.  But, now that I look at it more closely, it looks
>>> like drmGetDevices2() would work well for that.
>>>
>>> So, if you're not concerned with that burden, I'm not.  I'll try coding
>>> up the Xorg GLX part of things and see how it falls into place.
>> That actually is one of my big concerns: I'd like to come up with something
>> that can give something equivalent to Mesa's existing DRI_PRIME setting, and
>> requiring that logic to be in the server seems like a very poor match. You'd
>> need to take all of the device selection and enumeration stuff from Mesa and
>> transplant it into the Xorg GLX module, and then you'd need to define some
>> sort of protocol to get that data back into Mesa where you actually need it.
>> Or else you need to duplicate it between the client and server, which seems
>> like the worst of both worlds.
> Is this actually a lot of code?  I'll try to put together a prototype so
> we can see how much it is, but if it is just calling drmGetDevices2() and
> then building PCI BusID-based names, that doesn't seem unreasonable to me.
The fact that it's required *at all* tells you that a server-based 
design doesn't match the reality of existing drivers. I've also seen 
ideas for GLX implementations based on EGL or Vulkan, which probably 
wouldn't be able to work with server-side device enumeration.

And like I pointed out, adding that requirement doesn't give you 
anything that you can't do with a client-based interface.
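
For a sense of scale, the name-building step itself is small wherever it ends
up living. A minimal sketch, assuming a "pci-DDDD_BB_DD_F" style name similar
to the PCI tags Mesa accepts for DRI_PRIME; the helper name is hypothetical
and the exact name format is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a PCI location as "pci-DDDD_BB_DD_F".
 * With libdrm, the four fields would come from the PCI bus info of each
 * device returned by drmGetDevices2().
 * Returns 0 on success, -1 if `out` is too small. */
static int format_pci_name(uint16_t domain, uint8_t bus, uint8_t dev,
                           uint8_t func, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "pci-%04x_%02x_%02x_%1u",
                     (unsigned)domain, (unsigned)bus,
                     (unsigned)dev, (unsigned)func);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The formatting isn't the hard part. The selection logic around it is what
would have to be moved or duplicated.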

>
>> By comparison, if libglvnd just hands the problem off to the vendor
>> libraries, then you could do either. A vendor library could do its device
>> enumeration in the client like Mesa does, or it could send a request to
>> query something from the server, using whatever protocol you want --
>> whatever makes the most sense for that particular driver.
>>
>> More generally, I worry that defining a (vendor+device+descriptor) list as
>> an interface between libglvnd and the server means baking in a lot of
>> unnecessary assumptions and requirements for drivers that we could otherwise
>> avoid without losing any functionality.
> Is the GLX_VENDOR_MAP_EXT string ajax described that constraining?
In and of itself, maybe, maybe not. Requiring that it exist in the first 
place (and all of the requirements that it implies) is the problem.

But, something like that might work for a way to let libglvnd filter out 
client-side vendors if we wanted more granularity than just checking for 
the existence of a server-side vendor. In that case, though, the 
requirements are much looser. A list of strings is all you'd need, and a 
vendor only has to use it if it makes sense to do so.
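
As a rough illustration of how loose that can be, here is a sketch of a
client-side lookup over the space-delimited "<profile>:<vendor>" string ajax
described, accepting either a profile or a vendor name as the selector. The
function name and buffer handling are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: `map` is a space-delimited list of
 * "<profile>:<vendor>" entries, split at the first ':' in each entry.
 * `selector` may be a profile or a vendor name. On a match, copies the
 * vendor name into `out` and returns true. */
static bool lookup_vendor(const char *map, const char *selector,
                          char *out, size_t out_size)
{
    size_t sel_len;

    if (map == NULL || selector == NULL || out == NULL || out_size == 0) {
        return false;
    }
    sel_len = strlen(selector);
    if (sel_len == 0) {
        return false;
    }

    while (*map != '\0') {
        size_t entry_len;
        const char *colon;

        map += strspn(map, " ");        /* skip separators */
        entry_len = strcspn(map, " ");  /* length of this entry */
        colon = memchr(map, ':', entry_len);
        if (colon != NULL) {
            size_t profile_len = (size_t)(colon - map);
            const char *vendor = colon + 1;
            size_t vendor_len = entry_len - profile_len - 1;
            bool match =
                (profile_len == sel_len &&
                 strncmp(map, selector, sel_len) == 0) ||
                (vendor_len == sel_len &&
                 strncmp(vendor, selector, sel_len) == 0);

            if (match && vendor_len > 0 && vendor_len < out_size) {
                memcpy(out, vendor, vendor_len);
                out[vendor_len] = '\0';
                return true;
            }
        }
        map += entry_len;
    }
    return false;
}
```

Entries without a ':' are skipped rather than treated as errors, so a vendor
that has no use for profiles can simply leave them out.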

>
>> Also, is Mesa the only client-side vendor library that works with the Xorg
>> GLX module? I vaguely remember that there was at least one other driver that
>> did, but I don't remember the details anymore.
>>
>>
>>> Two follow-up questions:
>>>
>>> (1) Even when direct-rendering, NVIDIA's OpenGL/GLX implementation sends
>>>       GLX protocol (MakeCurrent, etc).  So, we'd like something client-side
>>>       to be able to request that server-side GLXVND route GLX protocol for the
>>>       calling client connection to a specific vendor (on a per-screen basis).
>>>       Do you think it would be reasonable for GLX_EXT_libglvnd2 to define a
>>>       new protocol request, that client-side libglvnd uses, and sends either
>>>       the profile or vendor name from the selected '<profile>:<vendor>'?
>>>
>>> (2) Who should decide which vendor/gpu gets the semantic name
>>>       "performance" or "battery"?  They are relative, so I don't know that
>>>       vendors can decide for themselves in isolation.  It kind of feels
>>>       like it should be GLXVND's job, but I don't know that it has enough
>>>       context to infer.  I'm curious if anyone else has ideas.
>> Ultimately, I think this should be up to the user. If you've got a system
>> with two high-power GPUs, then both of them would have a reasonable claim
>> for "performance," but a user ought to be able to designate one or the
>> other.
> It should definitely be user configurable, but it would be nice if
> we could make a reasonable inference in the case that the user hasn't
> provided explicit configuration.
>
>>> Thanks,
>>> - Andy
>>>
>>>
>>>> - ajax
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev