[Mesa-dev] [Piglit] [PATCH 1/2] egl: Add sanity test for EGL_EXT_device_query (v3)

Emil Velikov emil.l.velikov at gmail.com
Tue Sep 6 11:29:01 UTC 2016


[moving to mesa-dev, adding the EGL device spec authors for their input]

On 5 September 2016 at 08:48, Mathias Fröhlich
<Mathias.Froehlich at gmx.net> wrote:
> On Friday, 2 September 2016 14:02:07 CEST Emil Velikov wrote:
>
>> On 2 September 2016 at 07:15, Mathias Fröhlich
>
>> <Mathias.Froehlich at gmx.net> wrote:
>>
>> >
>> > Great!
>> >
>> > One question that I cannot answer from your branch:
>> >
>> > The EGL_EXT_device_enumeration spec says
>> >
>> >
>> >
>> > [...] All implementations must support
>> >
>> > at least one device.
>> >
>> > [...]
>> >
>> >
>> >
>> > Which means to me that once an application successfully asked for
>> > EGL_EXT_device_query, this calling application can rely on receiving at
>> > least one usable(?) EGL device. As a last resort, that single guaranteed
>> > device can be a software renderer, but the application gets at least
>> > something that lets it render pictures in some way.
>> >
>> Yes, we do need at least one device, which (modulo a few small changes)
>> is covered by the above branch. There is no need for the single
>> guaranteed device to be a software renderer.
>
> Well, how are you getting this single (drm) device when you are on a board
> with a pure framebuffer console on some simple VGA hardware just sufficient
> to bring up the boot screen?
>
>
> This situation is very common on certain modern systems. See below.
>

>> > Sure, the intent of the extension is to provide access to hw backed
>> > implementations.
>> >
>> Fully agree.
>>
>> >
>> >
>> > For us it means that we need to provide a software rendering context for
>> > the case that there is no drm-capable graphics driver.
>> I'm missing something here - barring the vendor-neutral EGL
>> requirement for EGL_EXT_device_base, how is the presence or absence of
>> the device extensions going to affect any of your work?
>> Afaict all of them are simply not applicable in the software renderer
>> case.
>
> Now I am confused - what do you mean by 'your (my) work'?
>
>
>
> What I mean here - putting together what I read in the branch:
>
> At compile time of mesa, libdrm is there and usable, so libEGL announces
> EGL_EXT_device_enumeration, so eglQueryDevicesEXT shall be there and return
> at least one device. Now put those mesa libraries onto a freshly installed
> cluster system (just by installing a linux distribution that contains the
> mentioned precompiled mesa package). The cluster node I mean has nothing
> drm-capable, as its head never faces a console user apart from the operator
> seeing the boot screen at most once, if not even that is automated away with
> a kickstart install via network.
>
> How are you going to handle this situation?
>
>
>
> Of course a typical installation out there has selected nodes installed with
> a/several GPU(s) each. This GPU is supposed to be used for producing
> visualization results of your simulation. No monitors attached; just to
> reduce the usually huge amount of simulation data (up to several terabytes
> or even more) to something that you can actually download to your computer,
> which is several thousand pictures (well, there are more similar use cases,
> but all share the property that you do not want to copy the simulation data
> but you can copy picture data in some sense). Sure, on this node you expect
> EGL_EXT_device_enumeration to deliver a gpu, and I would hope that we
> (mesa/oss graphics stack) also want to deliver EGL_EXT_platform_device, where
> you can make use of that single EGLDevice to grab an EGLDisplay.
>
>
>
> In reality today, such cluster nodes are equipped exclusively with nvidia
> cards and the binary blob. VirtualGL is installed, running a totally open X
> server that delivers the application local gl contexts via the binary blob
> through virtualgl. So having EGL_EXT_platform_device together with
> EGL_EXT_device_query is what those software vendors that have understood the
> security implications of VirtualGL will use in the future. Those vendors
> who do not (want to) think about security implications will probably
> continue to use the above virtualgl setup, as this does not require any
> investment in changes to their software.
>

>> > Or an even more
>> > nasty case, when the device node is just not accessible by the user. I
>> > have
>> > seen distros that restrict the permissions of the render node devices to
>> > the
>> > user logged into the running X server. So, even if there is hardware that
>> > you
>> > could potentially use, you may not be able to access it.
>> >
>> The libdrm helper provides a list of devices which have at least one
>> node available - be that card, control or render. For the purposes of
>> EGL_EXT_device_drm we could consider the card or render, although the
>> card one is exposed in pretty much all the open-source drivers and is
>> independent of the kernel age.
>>
>> That said, if distributions restrict permissions on all of those then
>> ... I'm inclined to go with Distribution/User Error. Then again, please
>> poke us if you see such cases.
>
> Fedora 24, which I am writing this mail on, is such an example. And yes,
> this is kind of a different topic, but one that we (mesa) have to cope
> with, I think. And yes, I know where Red Hat's bug tracker is.
>

>> > ... remember, the major intent of this set of extensions is to provide
>> > applications with an off screen rendering context for the case where you
>> > do
>> > not have a local X/wayland/whatnotdisplayserver running.
>> >
>> Note: it's not display server but platform ;-) One could use
>> EGL_KHR_platform_gbm if their graphics card vendor implements the
>> extension.
>
> Ok, I call it platform then.
>

>> > Alternatively to providing a cpu rasterizer as a fallback, we could
>> > suppress
>> > announcing the EXT_device_enumeration extension if there is no hw backed
>> > driver available. And in turn EGL_EXT_device_base which depends on
>> > EGL_EXT_device_query.
>> >
>> As above - modulo a few small changes this is what the current branch
>> does.
>>
>> > That would at least require some infrastructure to dynamically enable
>> > client
>> > extensions. It would still be unclear what to do then when the render
>> > nodes
>> > seem accessible when initializing/enumerating but the permissions change
>> > before the application creates the display using
>> > eglGetPlatformDisplayEXT later.
>> >
>> I'm not currently sold on whether the card or render node should be
>> exposed via the EGL_EXT_device_drm extension. In the case of the
>> latter, we could easily not advertise EGL_EXT_device_base if there are no
>> such devices available.
>> As a workaround, eglQueryDeviceStringEXT should return the
>> EGL_BAD_DEVICE_EXT or EGL_BAD_PARAMETER error in that case.
>>
>> Unless we add another extension to elaborate/handle things
>> better/differently.
>
> Render nodes, please. The point is to gain a hw context *without* a platform
> that is bound to any kind of monitor/console/virtual machine monitor....
>
> The really useful application is *non-interactive* usage.
>
>
>> > What are your plans?
>> >
>> So in general I'm leaning towards:
>> - parties interested in using EGL device without a hardware device -
>> we don't expose it, unless we have an extension for how it should work
>> - distributions explicitly restricting access to drm devices - we
>> don't expose it, they get what they are asking for.
>> - the vendor-neutral EGL requirement of EGL_EXT_device_base - I'm
>> leaning towards reworking that so that vendor implementations
>> supporting software rendering will continue to work.
>
>
>
> That sounds plausible to me.
>
> I just do not see the device query extension being suppressed when no
> usable render node or equivalent is available.
>

>> Thanks for bringing this up. I (un)fortunately forgot that software
>> rendering and EGL is a thing :-)
>
> Well, I was also picking up this work last autumn, until I realised that
> 'returns at least one device' part. And I never got around to either
> introducing runtime-disabled client extensions or a software-rasterizer-backed
> device.
>
>
>
> Shall we move this to mesa-devel?
>
Good idea.

Since this is evolving/has evolved into a mega-thread, I'd suggest
dividing things and tackling them independently.
Ideally moving each topic into a separate thread with the irrelevant text trimmed.

 * SW EGL implementations - do we have any vendors/implementations
apart from Mesa?

 * Interaction of ^^ with EGL device extension(s) - update existing
extensions/introduce new ones
 ** Should EGL_EXT_device_enumeration expose one/multiple SW devices
 - no: we need alternative glvnd EGL interface for such cases
 - yes: implementing EGL_EXT_output_drm on EGL implementations
supporting both HW and SW devices is close to impossible barring a spec
update

 ** EGL_EXT_output_drm
 *** Using/exposing the card or render node
 - Extension is designed with EGL streams in mind (using the
primary/card node) while people expect to use it to select the rendering
device.
 - Elaborate on the spec and/or introduce EGL_EXT_output{,_drm}_render?
 *** Exposing EGL_EXT_output{,_drm}{,_render} on EGL implementations
supporting both SW and HW devices
 - Elaborate on the spec(s), add new one for SW devices and/or error
type to distinguish between the current errors and SW devices

 * Systems with fb only, disabled render nodes and/or the like.
EGL implementations (in our case the libdrm API provides all the info
about available DRM devices) can effectively detect the presence of
HW/SW devices and expose relevant extensions.
Note: The presence does not and _cannot_ imply that one will always
succeed using each device.
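As a rough illustration (not Mesa code), the detection above boils down to looking for the DRM nodes a device exposes. A minimal filesystem-level sketch in Python, assuming the usual /dev/dri layout (a real implementation would use libdrm's drmGetDevices2() instead of scanning the directory):

```python
# Sketch: approximate the check an EGL implementation could do before
# advertising HW-backed device extensions. card* are primary nodes,
# renderD* are render nodes; a fb-only box has neither.
import os
import re

def available_drm_nodes(dri_dir="/dev/dri"):
    """Return (primary_nodes, render_nodes) found under dri_dir."""
    try:
        entries = os.listdir(dri_dir)
    except OSError:
        return [], []  # no DRM support at all (e.g. fb-only console)
    primary = sorted(e for e in entries if re.fullmatch(r"card\d+", e))
    render = sorted(e for e in entries if re.fullmatch(r"renderD\d+", e))
    return primary, render

if __name__ == "__main__":
    primary, render = available_drm_nodes()
    if not primary and not render:
        print("no DRM devices: expose no HW EGLDevice (SW fallback only)")
    else:
        print("primary:", primary, "render:", render)
```

As noted above, even a successful scan only shows presence; opening a node can still fail later if permissions change between enumeration and eglGetPlatformDisplayEXT.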

Thanks
Emil

