Introduction and updates from NVIDIA

Fri May 13 05:07:10 UTC 2016

On Wed, May 11, 2016 at 4:08 PM, James Jones <jajones at nvidia.com> wrote:
> On 05/11/2016 02:31 PM, Daniel Stone wrote:
>>
>> Hi James,
>>
>> On 11 May 2016 at 21:43, James Jones <jajones at nvidia.com> wrote:
>>>
>>> On 05/04/2016 08:56 AM, Daniel Stone wrote:
>>>>
>>>> Right - but as with the point I was making below, GBM _right now_ is
>>>> more capable than Streams _right now_. GBM right now would require API
>>>> additions to match EGLStreams + EGLSwitch + Streams/KMS-interop, but
>>>> the last two aren't written either, so. (More below.)
>>>
>>>
>>> The current behavior that enables this, where basically all Wayland
>>> buffers must be allocated as scanout-capable, isn't reasonable on
>>> NVIDIA hardware. The requirements for scanout are too onerous.
>>
>>
>> I think we're talking past each other, so I'd like to pare the
>> discussion down to these two sentences, and my two resultant points,
>> for now:
>>
>> I posit that the Streams proposal you (plural) have put forward is, at
>> best, no better at meeting these criteria:
>>    - there is currently no support for direct scanout from client
>> buffers in Streams, so it must always pessimise towards GPU
>> composition
>>    - GBM stacks can obviously do the same: implement a no-op
>> gbm_bo_import, and have your client always allocate non-scanout
>> buffers - presto, you've matched Streams
>>
>> I posit that GBM _can_ match the capability of a hypothetical
>> EGLStreams/EGLSwitch implementation. Current _implementations_ of GBM
>> cannot, but I posit that it is not a limitation of the API it exposes,
>> and unlike Streams, the capability can be plumbed in with no new
>> external API required.
>>
>> These seem pretty fundamental, so ... am I missing something? :\ If
>> so, can you please outline fairly specifically how you think
>> non-Streams implementations are not capable of meeting the criteria in
>> your two sentences?
>
>
> I respect the need to rein in the discussion, but I think several
> substantive aspects have been lost here.  I typed up a much longer response
> below, but I'll try to summarize in 4 sentences:
>
> GBM could match the allocation aspects of streams used in Miguel's first
> round of patches.  However, I disagree that its core API is sufficient to
> match the allocation capabilities of EGLStream+EGLSwitch where all producing
> and consuming devices+engines are known at allocation time. Further, streams
> have additional equally valuable functionality beyond allocation that GBM
> does not seem intended to address.  Absent agreement, I believe co-existence
> of EGLStreams and GBM+wl_drm in Wayland/Weston is a reasonable path forward
> in the short term.
>
> The longer version:
>
> GBM alone cannot perform as well as EGLStreams unless it is extended into
> something more or less the same as EGLStreams, where it knows exactly what
> engines are being used to produce the buffer content (along with their
> current configuration), and exactly what engines/configuration are being
> used to consume it.  This implies allocating against multiple specific
> objects, rather than a device and a set of allocation modifier flags, and/or
> importing an external allocation and hoping it meets the current
> requirements.  From what I can see, GBM fundamentally understands at most
> the consumer side of the equation.
>
> Suppose, however, that GBM were taught everything streams know implicitly
> about all users of the buffers at allocation time.  After allocation, GBM is
> done with its job, but streams & drivers aren't.
>
> The act of transitioning a buffer from optimal "producer mode" to optimal
> "consumer mode" relies on all the device & config information as well,
> meaning it would need to be fed into the graphics driver (EGL or whatever
> window system binding is used) by each window system the graphics driver was
> running on to achieve equivalent capabilities to EGLStream.
>
> Fundamentally, the API-level view of individual graphics buffers as raw
> globally coherent & accessible stores of pixels with static layout is
> flawed.  Images on a GPU are more of a mutating spill space for a collection
> of state describing the side effects of various commands than a 2D array of
> pixels.  Forcing GPUs to resolve an image to a 2D array of pixels in any
> particular layout can be very inefficient.  The GL+GLX/EGL/etc. driver model
> hides this well, but it breaks down in a few cases like EGLImage and
> GLX_EXT_texture_from_pixmap, the former not really living up to its implied
> potential because of this, and the latter mostly working only because it has
> a very limited domain where things can be shared, but still requires a lot
> of platform-specific code to support properly.  Vulkan brings a lot more of
> this out into the open with its very explicit image state transitions and
> limitations on which engines can access an image in any given state, but
> that's just within the Vulkan API itself (i.e., strictly on a single GPU and
> optionally an associated display engine within the same driver & process) so
> far.
>
> The EGLStream encapsulation takes into consideration the new use cases
> EGLImage, GBM, etc. were intended to address, and restores what I believe to
> be the minimal amount of the traditional GL+GLX/EGL/etc. model, while still
> allowing as much of the flexibility of the "a bunch of buffers" mental model
> as possible.  We can re-invent that with GBM API adjustments, a set of
> restrictions on how the buffers it allocates can be used, and another layer
> of metadata being pumped into drivers on top of that, but I suspect we'd
> wind up with something that looks very similar to streams.

I think this is where the disconnect is. I (and others) don't see
reinventing some of the EGLStream functionality in gbm + wl_drm (or a
similar EGL-implementation-private protocol) as a problem, nor do we
think the result will be worse than EGLStreams. Compositors use gbm
today and would much rather grow one code path incrementally and in a
backwards-compatible way. I know that's already been done for various
non-Mesa stacks, like SoCs where scanout memory is a scarce resource.
If we end up with something similar to what EGLStream will be one day,
that doesn't mean we should've used EGLStreams. It just means they're
different solutions to the same problem.

Kristian

> We're both delving into future developments and hypotheticals to some degree
> here.  If we can't agree now on which direction is best, I believe the right
> solution is to allow the two to co-exist and compete collegially until the
> benefits of one or the other become more apparent.  The Wayland protocol and
> Weston compositor were designed in a manner that makes this as painless as
> possible.  It's not like we're going to get a ton of Wayland clients that
> suddenly rely on EGLStream.  At worst, streams lose out and some dead code
> needs to be deleted from any compositors that adopted them.  As we
> discussed, there is some maintenance cost to having two paths, but I believe
> it is reasonably contained.
>
> Thanks,
> -James
>
>
>> Cheers,
>> Daniel
>>