Introduction and updates from NVIDIA

James Jones jajones at nvidia.com
Tue May 3 16:07:12 UTC 2016


On 04/29/2016 03:07 PM, Daniel Stone wrote:
> Hi James,
>
> On 29 April 2016 at 22:16, James Jones <jajones at nvidia.com> wrote:
>> I was on leave when this discussion was started.  Now that I'm back, I'd
>> like to respond to a few points raised below:
>
> Welcome back!

Thanks!

>> On 03/29/2016 09:44 AM, Daniel Stone wrote:
>>> Right, atomic allows you separate pipe/CRTC configuration from
>>> plane/overlay configuration. So you'd have two options: one is to use
>>> atomic and require the CRTC be configured with planes off before using
>>> Streams to post flips, and the other is to add KMS configuration to
>>> the EGL output.
>>>
>>> Though, now I think of it, this effectively precludes one case, which
>>> is scaling a Streams-sourced buffer inside the display controller. In
>>> the GBM case, the compositor gets every buffer, so can configure the
>>> plane scaling in line with buffer display. I don't see how you'd do
>>> that with Streams.
>>>
>>> There's another hurdle to overcome too, which would currently preclude
>>> avoiding the intermediate dumb buffer at all. One of the invariants
>>> the atomic KMS API enforces is that (!!plane->crtc_id ==
>>> !!plane->fb_id), i.e. that a plane cannot be assigned to a CRTC
>>> without an active buffer. So again, we're left with either having the
>>> plane fully configured and active (assigned to a CRTC and displaying,
>>> I assume, a pre-allocated dumb buffer), or pushing more configuration
>>> into Streams - specifically, connecting an EGLOutputLayer to an
>>> EGLOutputPort.
>>
>> Not having a full mode-setting API within EGL did make this initial
>> configuration chicken-and-egg problem hard to solve.
>>
>> I agree that EGLStreams/EGLOutput should integrate with atomic better than
>> is shown in this initial patchset.
>>
>> Maybe a better way to achieve that would be to give EGL an opportunity to
>> amend an already-created atomic request before committing it?  E.g.,
>>
>>    eglStreamsAcquire(dpy, <listOfStreams>, <atomicRequest>);
>>
>> That would take a filled-out atomic request that does any necessary
>> reconfiguration and just add the new framebuffers to it from
>> <listOfStreams>.  Any planes that don't need a new frame wouldn't be
>> included in <listOfStreams> and would keep their current frame.  Planes
>> could also be turned off, moved, re-scaled, etc.  Whatever atomic can
>> express.
>>
>> Maybe we would need an eglStreamsCheckAcquire/eglStreamsCommitAcquire() to
>> fail and/or hint to the user that the suggested stream+atomic request
>> produces sub-optimal results and should be recreated with more optimal
>> buffers?
>>
>> In any case, the idea is that nothing would limit the atomic API usage
>> just because streams are involved.
>
> That is indeed a possibility, though I'm concerned that it leaks KMS
> atomic details through the Streams API.

The atomic/KMS usage, like the DRM integration itself, would be 
optional, though.  This wouldn't, for example, leak KMS details into an 
OpenWF-based EGLOutput+EGLStream application.  I don't see it as any 
worse than having EGL_KHR_platform_x11 and friends.
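
To make that concrete, here is a rough sketch of the flow from a 
DRM-based compositor's point of view.  The libdrm atomic calls are the 
real API; eglStreamsAcquire() is the hypothetical entry point proposed 
above (I've added an explicit stream count for illustration, and 
fd/dpy/plane_id/prop_* are assumed to have been looked up beforehand):

   #include <xf86drmMode.h>
   #include <EGL/egl.h>
   #include <EGL/eglext.h>

   /* The compositor expresses whatever reconfiguration it
    * wants using the normal atomic API... */
   drmModeAtomicReq *req = drmModeAtomicAlloc();
   drmModeAtomicAddProperty(req, plane_id, prop_crtc_x, x);
   drmModeAtomicAddProperty(req, plane_id, prop_crtc_w, w);

   /* ...then hands the request to EGL, which appends FB_IDs
    * for any streams with a new frame ready.  Hypothetical
    * API; the signature is illustrative only. */
   EGLStreamKHR streams[] = { overlay_stream };
   eglStreamsAcquire(dpy, streams, 1, req);

   /* Test first, then commit for real.  A failed test would
    * unwind the acquire rather than consuming the frames. */
   if (drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_TEST_ONLY,
                           NULL) == 0)
       drmModeAtomicCommit(fd, req, DRM_MODE_PAGE_FLIP_EVENT,
                           NULL);
   drmModeAtomicFree(req);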

> Certainly if the check failed,
> you'd need to rewind using the atomic cursor API to be useful. It
> would also complicate the Streams implementation, as you'd need the
> operation to be a 'peek' at the stream head, rather than popping the
> frame for a test, failing, and then blocking waiting for a new frame.
> You'd also need somewhere to store a reference to that frame, so you
> could reuse it later (say you turn the display off and later turn it
> back on).

I believe streams require all this already.  They need to maintain a 
reference to the current frame for re-use if a new one is not available, 
and they need to essentially "peek" at the beginning of an acquire and 
"commit" at the end, so exposing that via the API wouldn't be a large 
change.

> The alternative is, as you allude to, to push the modesetting into
> EGL, so that the application feeds EGL its desired outcome and lets
> EGL determine the optimal configuration, rather than driving the two
> APIs in lockstep.

Indeed.

>>> Well, nowhere. By current plane configuration, I assume you're (to the
>>> extent that you can discuss it) talking about asymmetric plane
>>> capabilities, e.g. support for disjoint colour formats, scaling units,
>>> etc? As Dan V says, I still see Streams as a rather incomplete fix to
>>> this, given that plane assignment is pre-determined: what do you do
>>> when your buffers are configured as optimally as possible, but the
>>> compositor has picked the 'wrong' plane? I really think you need
>>> something like HWC to rewrite your scene graph into the optimal setup.
>>
>> Streams could provide a way to express that the compositor picked the wrong
>> plane, but they don't solve the optimal configuration problem. Configuration
>> is a tricky mix of policy and capabilities that something like HWComposer or
>> a Wayland compositor with access to HW-specific knowledge needs to solve.  I
>> agree with other statements here that encapsulating direct HW knowledge
>> within individual Wayland compositors is probably not a great idea, but some
>> separate standard or shared library taking input from hardware-specific
>> modules and wrangling scene graphs is probably needed to get optimal
>> behavior.
>
> Yeah, I would lean towards HWC itself, but that's a separate discussion.
>
>>> Do you see any problem with doing that within GBM? It's not actually
>>> done yet, but then again, neither is direct scanout through Streams.
>>> ;)
>>
>> With new Wayland protocol, patches to all Wayland compositors to send proper
>> hints to clients using this protocol, improvements to GBM, and updates to
>> both of these when new GPU architectures introduced new requirements, what
>> you describe could do anything streams can do.  However, the problem will
>> then have been solved only in the context of top-of-tree Wayland and Weston.
>
> This doesn't require explicit/new compositor interaction at all.
> Extensions can be done within the gbm/EGL bundle itself (via
> EGL_WL_bind_wayland_display), so you're only changing one DSO (or DSO
> bundle), and the API usage there today does seem to stand up. Given
> that the protocol is private - I'm certainly not advocating for a
> DRI2-style all-things-to-all-hardware standard protocol to communicate
> this - and that it's localised in a vendor bundle, it seems widely
> applicable to me. As someone who's writing this from
> Mutter/Wayland/GBM, I'm certainly not interested in Weston-only
> solutions.

No, the necessary extensions cannot be contained within the binding. 
There is not enough information within the driver layer alone.  Something 
needs to tell the driver when the configuration changes (e.g., the 
consumer of a Wayland surface switches from a texture to a plane) and 
what the new configuration is.  This would trigger the protocol 
notifications and subsequent optimization within the driver.  By the 
nature of their API, streams would require the compositor to take action 
on such configuration changes, and streams can discover the new 
configuration.  Something equivalent would be required to make this work 
in the GBM+wl_drm/EGL case.
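
With streams, the consumer connection itself encodes that choice, so 
the driver necessarily sees the switch.  Both entry points below come 
from existing stream extensions (EGL_KHR_stream_consumer_gltexture and 
EGL_EXT_stream_consumer_egloutput); the surrounding compositor logic is 
only a sketch:

   /* Composite path: frames are consumed by the GL external
    * texture bound at connect time. */
   glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
   eglStreamConsumerGLTextureExternalKHR(dpy, stream);

   /* Direct-scanout path: frames are consumed by a KMS plane
    * wrapped in an EGLOutputLayer.  Since a stream's consumer
    * is fixed when it is connected, switching paths means
    * creating a new stream - which is exactly the explicit
    * notification the driver needs. */
   eglStreamConsumerOutputEXT(dpy, scanout_stream, layer);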

Further, as a driver vendor, the idea of requiring even in-driver 
platform-specific modifications for this sounds undesirable.  If it were 
something that could be contained entirely within GBM, that would be 
interesting.  However, distributing the architecture-specific code 
throughout the window-system-specific code in the driver means a lot 
more maintenance burden in a world with X, Chrome OS, Wayland, and 
several others.

>> There are far more use cases for streams or similar producer/consumer
>> constructs than Wayland.  Streams allow drivers to solve the problem in one
>> place.
>
> Certainly there are, but then again, there are far more usecases than
> EGL. Looking at media playback, Vulkan, etc., where you don't have EGL
> yet need to solve the same problems.

EGLStreams, Vulkan swapchains, and (for example) VDPAU presentation 
queues are all varying levels of abstraction on top of the same thing 
within the driver: a presentation engine or buffer queue, depending on 
whether the target is a physical output or a compositor.  These 
API-level components can be hooked up to each other as long as the 
lower-level details are fully contained within the driver abstraction. 
A Vulkan swapchain can be internally implemented as an EGLStream 
producer, for example.  In fact, Vulkan swapchains borrow many ideas 
directly and indirectly from EGLStream.
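
As a purely illustrative sketch of that layering (not how any shipping 
driver is structured; frame_queue_post() and the structs here are 
hypothetical), both public objects can wrap the same internal queue:

   #include <vulkan/vulkan.h>    /* VkResult */

   /* Hypothetical driver internals: one presentation queue
    * backs both the EGLStream producer and the swapchain. */
   struct swapchain {
       struct frame_queue *q;     /* shared internal queue */
       struct image *images[8];   /* swapchain images */
   };

   VkResult queue_present(struct swapchain *sc, uint32_t idx)
   {
       /* vkQueuePresentKHR() lands here; posting an image is
        * the same internal operation as an EGLStream producer
        * inserting a frame. */
       return frame_queue_post(sc->q, sc->images[idx]);
   }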

>> Streams also allow vendors to ship new drivers when new hardware
>> appears, enabling that new hardware to work (and work optimally,
>> scenegraph issues aside) with existing compositors and applications without
>> modification.  That second point is a guiding principle for what should be
>> encapsulated within a driver API vs. what should be on the application side.
>
> I agree, and I'm not arguing this to be on the application or
> compositor side either. I believe the GBM and HWC suggestions are
> entirely doable, and further that these problems will need to be
> solved outside EGL anyway, for the other usecases. My worry - quite
> aside from how vendors who struggle to produce a conformant EGL 1.4
> implementation today will ever implement the complexity of Streams,
> though this isn't your problem - is that EGL is really the wrong place
> to be solving this.

Could you elaborate on what the other usecases are?  If you mean the 
Vulkan/media playback cases mentioned above, then I don't see what is 
fundamentally wrong about using EGL as a backend within the window 
system for those.  If a Vulkan application needs to display on an 
EGL+GLES-based Wayland compositor, there will be some point where a 
transition is made from Vulkan -> EGL+GLES regardless.

>>> All of gbm.h is user-facing; how you implement that API is completely
>>> up to you, including arbitrary metadata. For instance, it's the driver
> that allocates its own struct gbm_surface/gbm_bo/etc (which is
>>> opaque), so it can do whatever it likes in terms of metadata. Is there
>>> anything in particular you're thinking of that you're not sure you'd
>>> be able to store portably?
>>>
>>> Might also be worth striking a common misconception here: the Mesa GBM
>>> implementation is _not_ canonical. gbm.h is the user-facing API you
>>> have to implement, but beyond that, you don't need to be implemented
>>> by Mesa's src/gbm/. As the gbm.h types are all opaque, I'm not sure
>>> what you couldn't express/hide/store - do you have any examples?
>>
>> If we could work out how to install vendor-specific GBM implementations, I
>> believe you're correct, the API is sufficiently high-level to represent our
>> allocation metadata.
>
> After some thought, I've come around to the view that we should
> declare the Mesa implementation canonical and allow others to install
> plugins.
> The EGLDisplay -> gbm_device bind happens too late to do it otherwise,
> I think.
>
>> Yes, streams introduce a slightly different way of doing things than GBM+the
>> wl_drm protocol.  However, the differences are minimal.  I don't think the
>> patchset Miguel has proposed is that invasive, and as we've said, there's
>> nothing preventing Mesa and others from implementing streams as well.
>
> I think it's large enough that it warrants a split of gl-renderer and
> compositor-drm, rather than trying to shoehorn them into the same
> file. There's going to be quite some complexity hiding between the
> synchronise-with-client-event-stream and direct-scanout boxes, that
> will push it over the limit of what's tractable. Those files are
> already pretty huge and complex.

Would it be better to wait until such complexities arise in future 
patches and split the files at that point, or would you prefer we split 
the backends now?  Perhaps I'm just more optimistic about the 
complexity, but it seems like it would be easier to evaluate once that 
currently-hypothetical portion of the code exists.

>> They're part of an open standard, and we'd certainly welcome collaboration
>> on the specifications.  I hope we can at least consider EGLStreams as a
>> potentially better solution, even if it wasn't the first solution.
>>
>> Further, another thing I'd like to get rid of is "implement the ~25 LoC of
>> libwayland-egl".  Streams let us do that.  I want Wayland support to be the
>> last windowing/compositing system for which driver vendors need to
>> explicitly maintain support in their code.  Once we clean up & standardize
>> the very minimal driver interfaces beyond current EGL that our
>> libwayland-egl code is using, anyone should be able to write a windowing
>> system and provide hooks to enable any EGL driver supporting the
>> standardized window system hook ABI to run as a client of it.  The same
>> should be done for Vulkan WSI platforms, where the per-platform driver API
>> is already even more self-contained.  In other words, my hope is that
>> Wayland EGL and Vulkan support will soon be something that ships with GLVND
>> and the Vulkan common loader, not with the drivers.
>
> I share the hope, and maybe with the WSI and Streams available, we can
> design future window systems and display control APIs towards
> something like that. But at the moment, the impedance mismatch between
> Streams and the (deliberately very different) Wayland and KMS APIs is
> already fairly glaring. The winsys support is absolutely trivial to
> write, but as winsys interactions get more featureful and complex,
> so will the common stream protocol have to be.
>
> If I was starting from the position of the EGL ideal: that everything
> is EGL, and the only external interactions are creating native types
> for it, then I would surely arrive at the same position as you. But
> everything we've seen so far - and again, ChromeOS have taken this to
> a much further extent - has been chipping away at EGL, rather than
> putting more into it, and this has been for the better.

The direction ChromeOS is taking is even more problematic, and I'd hate 
to see it held up as an example of proper design direction.  We spent a 
good deal of time working with Google to support ChromeOS, and we ended 
up essentially allowing them to punch through the driver abstraction 
via very opaque EGL extensions that no engineer besides the extension 
authors could be expected to use correctly, and to embed HW-specific 
knowledge within some component of ChromeOS.  As a result, it will 
likely only run optimally on a single generation of our hardware and 
will need to be revisited.  That's the type of problem we're trying to 
avoid here.  ChromeOS has made other design compromises that cost us 
(and, I suspect, other vendors) 10-20% performance across the board to 
optimize for a very specific use case (i.e., a browser) within very 
constrained schedules.  It is not the right direction for OS<->graphics 
driver interactions to evolve.

> I don't think that's a difference we'll ever resolve though.

I believe thus far we've all tried to focus objectively on specific 
issues, proposed solutions for them, and the merits of those solutions. 
Weston and the other Wayland compositors I'm aware of are based on EGL 
at the moment, so regardless of its merits as an API it doesn't seem 
problematic purely from a dependency standpoint to add EGLStream as an 
option next to the existing EGLImage and EGLDisplay+GBM paths.  I'm 
certainly willing to continue discussing the merits of EGL on a broader 
scale, but does that discussion need to block the patches proposed here?

Thanks,
-James

> Cheers,
> Daniel
>

