Introduction and updates from NVIDIA

Tue May 3 17:10:05 UTC 2016

On Tue, May 03, 2016 at 09:29:58AM -0700, James Jones wrote:
> On 05/03/2016 09:06 AM, Daniel Vetter wrote:
> >On Fri, Apr 29, 2016 at 02:16:28PM -0700, James Jones wrote:
> >>Streams could provide a way to express that the compositor picked the wrong
> >>plane, but they don't solve the optimal configuration problem. Configuration
> >>is a tricky mix of policy and capabilities that something like HWComposer or
> >>a wayland compositor with access to HW-specific knowledge needs to solve.  I
> >>agree with other statements here that encapsulating direct HW knowledge
> >>within individual Wayland compositors is probably not a great idea, but some
> >>separate standard or shared library taking input from hardware-specific
> >>modules and wrangling scene graphs is probably needed to get optimal
> >>behavior.
> >>
> >>What streams do is allow allocating the most optimal set of buffers and
> >>using the most optimal method to present them possible given a
> >>configuration.  So, streams would kick in after the scene graph thing
> >>generated a config.
> >
> >Daniel's reply cut out this crucial bit somehow, and he replied somewhere
> >else that he agrees that eglstreams solves at least the "optimal
> >allocation once scene graph is fixed" problem. I disagree since this
> >entire thing is highly dynamic - at least on SoC chips how you allocate
> >your buffers has big impacts on what the display engine can do, and the
> >other way round:
> >- depending upon tiling layout fifo space requirements change drastically,
> >   and going for the "optimal" tiling might push some other plane over the
> >   edge
> >- there's simpler stuff like some planes can only do some features like
> >   render compression, which is why even for a TEST_ONLY atomic commit you
> >   must supply all the buffers already
> >- other fun stuff happens around rotation/scaling/planar vs. single-plane
> >   yuv buffers. All these tend to need special hw resources, which means
> >   your choice in how to use it on the kms side has effects on what kind of
> >   buffer you need to allocate. And the other way round.
> >
> >I don't think there's any way at all, at least for a generic system that
> >wants to support embedded/mobile SoCs to solve the kms config and buffer
> >alloc problems as 2 separate steps. You absolutely need these two pieces
> >to talk to each other, and talk the same language. Either some vendor
> >horror show (what most of android bsp end up doing behind the back) or
> >something standardized (what we're trying to pull off around kms+gbm).
> >
> >Hiding half of the story behind eglstreams doesn't help anyone afaict. If
> >you do that, you also need to hide the other half. Which means proprietary
> >hw composer driver (or similar), which can understand/change the metadata
> >you internally attach to eglstreams/buffers. And once you've decided to go
> >the fully hidden route hw composer seems to be the best choice really.
> >SurfaceFlinger isn't really great for multi-screen, but the hwc interface
> >itself is already fixed and handles that properly. But even with hwc you
> >don't have eglstreams, because once both ends are proprietary there's
> >really no need for any standard any more ;-)
> >-Daniel
> 
> Thank you for the additional information.  If I follow this correctly:
> 
> 1) You believe HW composer is a reasonable solution to optimizing the
> scenegraph of a compositor.
> 
> 2) Your preferred solution would not be HW composer, but rather GBM+KMS
> (presumably with the addition of some yet-to-be-developed APIs)
> 
> 3) You believe the constraints of the system are sufficiently interdependent
> that to optimize the system, all allocations and display engine
> configuration must be done atomically, in a sense.
> 
> Is that correct?

No to 2): I want to drive kms/gbm towards hwc/gralloc so that at least for
90% of use-cases you can run a generic drm_hwcomposer on top of kms and
generic_gralloc on top of gbm. But there will always be cases that need
that last bit of efficiency for some very specific workload, and for
those hwc+gralloc seem perfectly suited. So no unconditional preference
from my side at all, just a desire to standardize things more, and share
more code across vendors and platforms.

Agreed on 1) & 3).

> If so, I have some questions:
> 
> -Do you believe (2) is reasonably achievable, or just the style of solution
> you would prefer in general?

See above. Aim for 90%.

> -Why is GBM+DRM-KMS better suited to meet the requirements presented by (3)
> than EGLStream+DRM-KMS?

It exists and is widely used in shipping open-source systems like CrOS, X,
wayland and whatever else. eglstreams lacks the adoption, both in
compositors and in open source drivers. You could try to fix that by just
writing the eglstreams support for everyone (including mesa drivers and
all the existing compositors people are building), but I don't see that
happening.

> Given Wayland is designed such that clients drive buffer allocation, and I
> tend to agree that the compositor (along with its access to drivers like
> KMS) is the component uniquely able to optimize the scene, I think the best
> that can be achieved is a system that gravitates toward the optimal solution
> in the steady state.  Therefore, it seems that KMS should optimize display
> engine resources assuming the Wayland compositor and its clients will adjust
> to meet KMS' suggestions over time, where "time" would hopefully be only a
> small number of additional frames. Streams will perform quite well in such a
> design.
> 
> There would of course be cases where multiple iterations are required to get
> from the current buffers and their display requirements to the optimal
> buffers and the optimal display settings, but I don't see a way around that.
> Hopefully over time display hardware will be optimized towards these
> use-cases, as they are becoming ubiquitous.

Yeah, that's the idea I have in mind too. Except that, in my opinion,
there's no reason to hide half of that iterative improvement pipeline
behind streams. At least not if you want to support a semi-generic
compositor, which seems to be the goal of the proposed
eglstreams/egloutput extensions.
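
To make the iterative idea concrete, here is a toy Python model. It is a
sketch only, not real KMS or GBM code: the layout names, the FIFO costs,
the scores, and the functions atomic_check(), negotiate() and best_joint()
are all made up for illustration. atomic_check() stands in for a TEST_ONLY
atomic commit, negotiate() changes one plane's buffer layout per frame
until nothing validates any more, and best_joint() searches buffer layouts
and plane config together.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical layouts with made-up FIFO costs and efficiency scores.
LAYOUTS = {
    "linear":  {"fifo": 1, "score": 1},
    "x_tiled": {"fifo": 2, "score": 3},
    "y_tiled": {"fifo": 4, "score": 4},
}


@dataclass(frozen=True)
class Plane:
    name: str
    supported: tuple  # layouts this plane can scan out


def atomic_check(planes, config, fifo_budget):
    """Toy stand-in for a TEST_ONLY commit: validate only, change nothing."""
    for plane in planes:
        if config[plane.name] not in plane.supported:
            return False
    total = sum(LAYOUTS[config[p.name]]["fifo"] for p in planes)
    return total <= fifo_budget


def score(config):
    return sum(LAYOUTS[layout]["score"] for layout in config.values())


def negotiate(planes, config, fifo_budget, max_frames=16):
    """Per frame, apply the single layout change that validates and helps most."""
    history = [dict(config)]
    for _ in range(max_frames):
        best = None
        for plane in planes:
            for layout in plane.supported:
                candidate = dict(config, **{plane.name: layout})
                if score(candidate) <= score(config):
                    continue  # not an improvement
                if not atomic_check(planes, candidate, fifo_budget):
                    continue  # display engine would reject it
                if best is None or score(candidate) > score(best):
                    best = candidate
        if best is None:
            break  # steady state: no single change validates and improves
        config = best
        history.append(dict(config))
    return config, history


def best_joint(planes, fifo_budget):
    """Search buffer layouts and plane config together (exhaustive, toy-sized)."""
    best = None
    for combo in product(*(p.supported for p in planes)):
        candidate = {p.name: layout for p, layout in zip(planes, combo)}
        if not atomic_check(planes, candidate, fifo_budget):
            continue
        if best is None or score(candidate) > score(best):
            best = candidate
    return best


PLANES = (
    Plane("primary", ("linear", "x_tiled", "y_tiled")),
    Plane("overlay", ("linear", "x_tiled")),
)
START = {"primary": "linear", "overlay": "linear"}

final, history = negotiate(PLANES, START, fifo_budget=5)
joint = best_joint(PLANES, fifo_budget=5)
print("per-plane steps:", final, score(final))
print("joint search:   ", joint, score(joint))
```

With these made-up numbers the per-plane loop settles on y_tiled for the
primary plane and leaves the overlay linear (score 5), because the
"optimal" tiling on one plane uses up the FIFO budget. The joint search
picks x_tiled on both planes (score 6). That only works when the allocator
and the display config can see each other's constraints.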

If your goal is simply to eke out the last bit of performance your hw
affords, then we already have hwc+gralloc. It works, and for the last 1-2
years Google engineers have been very open about extending it and fixing
corner cases to make it more widely suitable.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch