[Mesa-dev] GBM and the Device Memory Allocator Proposals

Thu Nov 30 06:28:56 UTC 2017

On 11/29/2017 01:10 PM, Rob Clark wrote:
> On Wed, Nov 29, 2017 at 12:33 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
>> On Sat, Nov 25, 2017 at 1:20 PM, Rob Clark <robdclark at gmail.com> wrote:
>>>
>>> On Sat, Nov 25, 2017 at 12:46 PM, Jason Ekstrand <jason at jlekstrand.net>
>>> wrote:
>>>> On November 24, 2017 09:29:43 Rob Clark <robdclark at gmail.com> wrote:
>>>>>
>>>>>
>>>>> On Mon, Nov 20, 2017 at 8:11 PM, James Jones <jajones at nvidia.com>
>>>>> wrote:
>>>>>>
>>>>>> As many here know at this point, I've been working on solving issues
>>>>>> related
>>>>>> to DMA-capable memory allocation for various devices for some time
>>>>>> now.
>>>>>> I'd
>>>>>> like to take this opportunity to apologize for the way I handled the
>>>>>> EGL
>>>>>> stream proposals.  I understand now that the development process
>>>>>> followed
>>>>>> there was unacceptable to the community and likely offended many great
>>>>>> engineers.
>>>>>>
>>>>>> Moving forward, I attempted to reboot talks in a more constructive
>>>>>> manner
>>>>>> with the generic allocator library proposals & discussion forum at XDC
>>>>>> 2016.
>>>>>> Some great design ideas came out of that, and I've since been
>>>>>> prototyping
>>>>>> some code to prove them out before bringing them back as official
>>>>>> proposals.
>>>>>> Again, I understand some people are growing concerned that I've been
>>>>>> doing
>>>>>> this off on the side in a github project that has primarily NVIDIA
>>>>>> contributors.  My goal was only to avoid wasting everyone's time with
>>>>>> unproven ideas.  The intent was never to dump the prototype code as-is
>>>>>> on
>>>>>> the community and presume acceptance. It's just a public research
>>>>>> project.
>>>>>>
>>>>>> Now the prototyping is nearing completion, and I'd like to renew
>>>>>> discussion
>>>>>> on whether and how the new mechanisms can be integrated with the Linux
>>>>>> graphics stack.
>>>>>>
>>>>>> I'd be interested to know if more work is needed to demonstrate the
>>>>>> usefulness of the new mechanisms, or whether people think they have
>>>>>> value
>>>>>> at
>>>>>> this point.
>>>>>>
>>>>>> After talking with people on the hallway track at XDC this year, I've
>>>>>> heard
>>>>>> several proposals for incorporating the new mechanisms:
>>>>>>
>>>>>> -Include ideas from the generic allocator design into GBM.  This could
>>>>>> take
>>>>>> the form of designing a "GBM 2.0" API, or incrementally adding to the
>>>>>> existing GBM API.
>>>>>>
>>>>>> -Develop a library to replace GBM.  The allocator prototype code could
>>>>>> be
>>>>>> massaged into something production worthy to jump start this process.
>>>>>>
>>>>>> -Develop a library that sits beside or on top of GBM, using GBM for
>>>>>> low-level graphics buffer allocation, while supporting non-graphics
>>>>>> kernel
>>>>>> APIs directly.  The additional cross-device negotiation and sorting of
>>>>>> capabilities would be handled in this slightly higher-level API before
>>>>>> handing off to GBM and other APIs for actual allocation somehow.
>>>>>
>>>>>
>>>>> tbh, I kinda see GBM and $new_thing sitting side by side.. GBM is
>>>>> still the "winsys" for running on "bare metal" (ie. kms).  And we
>>>>> don't want to saddle $new_thing with aspects of that, but rather have
>>>>> it focus on being the thing that in multiple-"device"[1] scenarios
>>>>> figures out what sort of buffer can be allocated by who for sharing.
>>>>> Ie $new_thing should really not care about winsys level things like
>>>>> cursors or surfaces.. only buffers.
>>>>>
>>>>> The mesa implementation of $new_thing could sit on top of GBM,
>>>>> although it could also just sit on top of the same internal APIs that
>>>>> GBM sits on top of.  That is an implementation detail.  It could be
>>>>> that GBM grows an API to return an instance of $new_thing for
>>>>> use-cases that involve sharing a buffer with the GPU.  Or perhaps that
>>>>> is exposed via some sort of EGL extension.  (We probably also need a
>>>>> way to get an instance from libdrm (?) for display-only KMS drivers,
>>>>> to cover cases like etnaviv sharing a buffer with a separate display
>>>>> driver.)
>>>>>
>>>>> [1] where "devices" could be multiple GPUs or multiple APIs for one or
>>>>> more GPUs, but also includes non-GPU devices like camera, video
>>>>> decoder, "image processor" (which may or may not be part of camera),
>>>>> etc, etc
>>>>
>>>>
>>>> I'm not quite sure what I think about this.  I think I would like
>>>> to
>>>> see $new_thing at least replace the guts of GBM. Whether GBM becomes a
>>>> wrapper around $new_thing or $new_thing implements the GBM API, I'm not
>>>> sure.  What I don't think I want is to see GBM development continuing on
>>>> its own so we have two competing solutions.
>>>
>>> I don't really view them as competing.. there is *some* overlap, ie.
>>> allocating a buffer.. but even if you are using GBM w/out $new_thing
>>> you could allocate a buffer externally and import it.  I don't see
>>> $new_thing as that much different from GBM PoV.
>>>
>>> But things like surfaces (aka swap chains) seem a bit out of place
>>> when you are thinking about implementing $new_thing for non-gpu
>>> devices.  Plus EGL<->GBM tie-ins that seem out of place when talking
>>> about a (for ex.) camera.  I kinda don't want to throw out the baby
>>> with the bathwater here.
>>
>>
>> Agreed.  GBM is very EGLish and we don't want the new allocator to be that.
>>
>>>
>>> *maybe* GBM could be partially implemented on top of $new_thing.  I
>>> don't quite see how that would work.  Possibly we could deprecate
>>> parts of GBM that are no longer needed?  idk..  Either way, I fully
>>> expect that GBM and mesa's implementation of $new_thing could perhaps
>>> sit on top of some of the same set of internal APIs.  The public
>>> interface can be decoupled from the internal implementation.
>>
>>
>> Maybe I should restate things a bit.  My real point was that modifiers +
>> $new_thing + Kernel blob should be a complete and more powerful replacement
>> for GBM.  I don't know that we really can implement GBM on top of it because
>> GBM has lots of wishy-washy concepts such as "cursor plane" which may not
>> map well at least not without querying the kernel about specific display
>> planes.  In particular, I don't want someone to feel like they need to use
>> $new_thing and GBM at the same time or together.  Ideally, I'd like them to
>> never do that unless we decide gbm_bo is a useful abstraction for
>> $new_thing.
>>
> 
> (just to repeat what I mentioned on irc)
> 
> I think main thing is how do you create a swapchain/surface and know
> which is current front buffer after SwapBuffers()..  that is the only
> bits of GBM that seem like there would still be useful.  idk, maybe
> there is some other idea.

I don't view this as terribly useful except for legacy apps that need an 
EGL window surface and can't be updated to use new methods.  Wayland 
compositors certainly don't fall in that category.  I don't know that 
any GBM apps do.

Rather, I think the way forward for the classes of apps that need 
something like GBM or the generic allocator is more or less the path 
ChromeOS took with their graphics architecture: Render to individual 
buffers (using FBOs bound to imported buffers in GL) and manage buffer 
exchanges/blits manually.

The useful abstraction surfaces provide isn't so much deciding which 
buffer is currently "front" and "back", but rather handling the 
transition/hand-off to the window system/display device/etc. in 
SwapBuffers(), and the whole idea of the allocator proposals is to make 
that something the application or at least some non-driver utility 
library handles explicitly based on where exactly the buffer is being 
handed off to.
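
To make that concrete, here's a rough sketch of what app-managed buffer 
exchange could look like.  It's plain Python with no GL or allocator 
calls; ManualSwapchain, acquire(), hand_off() and flip() are 
hypothetical names made up for illustration, not part of GBM, EGL, or 
the allocator prototype.  The point is just that the app, not a surface 
object, tracks which buffer is free, being rendered, queued, or on 
screen:

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    FREE = auto()       # available for the app to render into
    RENDERING = auto()  # currently bound as the app's render target
    QUEUED = auto()     # handed off, waiting for the consumer to latch it
    SCANOUT = auto()    # owned by the consumer (e.g. the display)


@dataclass
class Buffer:
    name: str
    state: State = State.FREE


class ManualSwapchain:
    """Hypothetical app-managed buffer set; no surface object involved."""

    def __init__(self, count=3):
        self.buffers = [Buffer(f"buf{i}") for i in range(count)]
        self.queued = deque()
        self.front = None

    def acquire(self):
        """Return a free buffer to render into, or None if all are busy."""
        for buf in self.buffers:
            if buf.state is State.FREE:
                buf.state = State.RENDERING
                return buf
        return None

    def hand_off(self, buf, consumer):
        """Explicitly hand a rendered buffer to a named consumer.

        A real implementation would perform whatever transition the
        consumer requires here; this sketch only records the hand-off.
        """
        if buf.state is not State.RENDERING:
            raise ValueError(f"{buf.name} is not being rendered")
        buf.state = State.QUEUED
        self.queued.append((buf, consumer))

    def flip(self):
        """Consumer latches the oldest queued buffer; old front is freed."""
        if not self.queued:
            return self.front
        buf, _consumer = self.queued.popleft()
        if self.front is not None:
            self.front.state = State.FREE
        buf.state = State.SCANOUT
        self.front = buf
        return buf
```

In a real app, acquire() would be followed by binding an FBO to the 
imported buffer, and hand_off() is where the destination-specific work 
would go.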

The one other useful piece of information provided by EGL surfaces, 
which I suspect only our hardware cares about, is whether the app is 
potentially going to bind a depth buffer along with the color buffers 
from the surface.  AFAICT, the GBM notion of surfaces doesn't provide 
enough information for our driver to determine that at surface creation 
time, so the GBM surface mechanism doesn't fit quite right with NVIDIA 
hardware anyway.

That's all for the compositors, embedded apps, demos, and whatnot that 
are using GBM directly though.  Every existing GL Wayland client needs 
to be able to get an EGLSurface and call eglSwapBuffers() on it.  As I 
mentioned in my XDC 2017 slides, I think that's best handled by a 
generic EGL window system implementation that all drivers could share, 
and which uses allocator mechanisms behind the scenes to build up an 
EGLSurface from individual buffers.  It would all have to be transparent 
to apps, but we already had that working with our EGLStreams Wayland 
implementation, and the Mesa Wayland EGL client does roughly the same 
thing with DRM or GBM buffers IIRC, but without a driver-external 
interface.  It should be possible with generic allocator buffers too. 
Jason's Vulkan WSI improvements that were sent out recently move Vulkan 
in that direction already as well, and that was always one of the goals 
of the Vulkan external objects extensions.

This is all a really long-winded way of saying yeah I think it would be 
technically feasible to implement GBM on top of the generic allocator 
mechanisms, but I don't think that's a very interesting undertaking. 
It'd just be an ABI-compatibility thing for a bunch of open-source apps, 
which seems unnecessary in the long run since the apps can just be 
patched instead.  Maybe it's useful as a transition mechanism though.

However, if the generic allocator is going to be something separate from 
GBM, I think modernizing & adapting the existing GBM backend 
infrastructure in Mesa to serve as a backend for the allocator is a good 
idea.  Maybe it's easier to just let GBM sit on that same updated 
backend beside the allocator API.  For GBM, all the interesting stuff 
happens in the backend anyway.

Thanks,
-James

> BR,
> -R
> 
> 
>>>
>>>> I *think* I like the idea of having $new_thing implement GBM as a
>>>> deprecated
>>>> legacy API.  Whether that means we start by pulling GBM out into its
>>>> own
>>>> project or we start over, I don't know.  My feeling is that the current
>>>> dri_interface is *not* what we want which is why starting with GBM makes
>>>> me
>>>> nervous.
>>>
>>> /me expects if we pull GBM out of mesa, the interface between GBM and
>>> mesa (or other GL drivers) is 'struct gbm_device'.. so "GBM the
>>> project" is just a thin shim plus some 'struct gbm_device' versioning.
>>>
>>> BR,
>>> -R
>>>
>>>> I need to go read through your code before I can provide a stronger or
>>>> more
>>>> nuanced opinion.  That's not going to happen before the end of the year.
>>>>
>>>>>> -I have also heard some general comments that regardless of the
>>>>>> relationship
>>>>>> between GBM and the new allocator mechanisms, it might be time to move
>>>>>> GBM
>>>>>> out of Mesa so it can be developed as a stand-alone project.  I'd be
>>>>>> interested what others think about that, as it would be something
>>>>>> worth
>>>>>> coordinating with any other new development based on or inside of GBM.
>>>>>
>>>>>
>>>>> +1
>>>>>
>>>>> We already have at least a couple different non-mesa implementations
>>>>> of GBM (which afaict tend to lag behind mesa's GBM and cause
>>>>> headaches).
>>>>>
>>>>> The extracted part probably isn't much more than a header and shim.
>>>>> But probably does need to grow some versioning for the backend to know
>>>>> if, for example, gbm->bo_map() is supported.. at least it could
>>>>> provide stubs that return an error, rather than having link-time fail
>>>>> if building something w/ $vendor's old gbm implementation.
>>>>>
>>>>>> And of course I'm open to any other ideas for integration.  Beyond
>>>>>> just
>>>>>> where this code would live, there is much to debate about the
>>>>>> mechanisms
>>>>>> themselves and all the implementation details.  I was just hoping to
>>>>>> kick
>>>>>> things off with something high level to start.
>>>>>
>>>>>
>>>>> My $0.02, is that the place where devel happens and place to go for
>>>>> releases could be different.  Either way, I would like to see git tree
>>>>> for tagged release versions live on fd.o and use the common release
>>>>> process[2] for generating/uploading release tarballs that distros can
>>>>> use.
>>>>
>>>>
>>>> Agreed.  I think fd.o is the right place for such a project to live.  We
>>>> can
>>>> have mirrors on GitHub and other places but fd.o is where Linux graphics
>>>> stack development currently happens.
>>>>
>>>>> [2] https://cgit.freedesktop.org/xorg/util/modular/tree/release.sh
>>>>>
>>>>>> For reference, the code Miguel and I have been developing for the
>>>>>> prototype
>>>>>> is here:
>>>>>>
>>>>>>     https://github.com/cubanismo/allocator
>>>>>>
>>>>>> And we've posted a port of kmscube that uses the new interfaces as a
>>>>>> demonstration here:
>>>>>>
>>>>>>     https://github.com/cubanismo/kmscube
>>>>>>
>>>>>> There are still some proposed mechanisms (usage transitions mainly)
>>>>>> that
>>>>>> aren't prototyped, but I think it makes sense to start discussing
>>>>>> integration while prototyping continues.
>>>>>
>>>>>
>>>>> btw, I think a nice end goal would be a gralloc implementation using
>>>>> this new API for sharing buffers in various use-cases.  That could
>>>>> mean converting gbm-gralloc, or perhaps it means something new.
>>>>>
>>>>> AOSP has support for mesa + upstream kernel for some devices which
>>>>> also have upstream camera and/or video decoder in addition to just
>>>>> GPU.. and this is where you start hitting the limits of a GBM based
>>>>> gralloc.  In a lot of ways, I view $new_thing as what gralloc *should*
>>>>> have been, but at least it provides a way to implement a generic
>>>>> gralloc.
>>>>
>>>>
>>>> +100
>>>>
>>>>
>>>>> Maybe that is getting a step ahead, there is a lot we can prototype
>>>>> with kmscube.  But gralloc gets us into interesting real-world
>>>>> use-cases that involve more than just GPUs.  Possibly this would be
>>>>> something that linaro might be interested in getting involved with?
>>>>>
>>>>> BR,
>>>>> -R
>>>>>
>>>>>> In addition, I'd like to note that NVIDIA is committed to providing
>>>>>> open
>>>>>> source driver implementations of these mechanisms for our hardware, in
>>>>>> addition to support in our proprietary drivers.  In other words,
>>>>>> wherever
>>>>>> modifications to the nouveau kernel & userspace drivers are needed to
>>>>>> implement the improved allocator mechanisms, we'll be contributing
>>>>>> patches
>>>>>> if no one beats us to it.
>>>>>>
>>>>>> Thanks in advance for any feedback!
>>>>>>
>>>>>> -James Jones
>>>>>> _______________________________________________
>>>>>> mesa-dev mailing list
>>>>>> mesa-dev at lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>>>
>>>>
>>>>
>>>>
>>
>>