Unix Device Memory Allocation project

Daniel Stone daniel at fooishbar.org
Wed Jan 4 12:03:59 UTC 2017


Hi Marek,

On 3 January 2017 at 23:38, Marek Olšák <maraeo at gmail.com> wrote:
> I've been thinking about it, and it looks like we're gonna continue
> using immutable per-BO metadata (buffer layout, tiling description,
> compression flags). The reasons are that everything else is less
> economical, and the current "modifier" work done in EGL/GBM is
> insufficient for our hardware - we need approx. 96 bytes of metadata
> for proper buffer sharing (not just for display, but also 3D interop -
> MSAA, mipmapping, compression), while EGL modifiers only support 8
> bytes of metadata. However, that doesn't matter, because:

You're right that no-one attempts to describe MSAA/full-miptree layout
within modifiers, and that's good because that's not what they're
supposed to do. The various consumers (DRM framebuffers, EGLImages,
wl_buffers, V4L2 buffers) all just work with flat, fully-resolved, 2D
image blocks. If you were adding miptree detail to KMS modifiers, then
I'd expect to see matching patches to, e.g., select LOD for each of
the above.
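
For reference, a modifier is nothing more than an opaque,
vendor-namespaced 64-bit token naming one fully-resolved 2D layout.
A minimal sketch using the drm_fourcc.h macros from libdrm; the 0x17
value is invented for illustration and is not a real AMD modifier:

#include <drm_fourcc.h>  /* from libdrm */
#include <stdint.h>

/* Hypothetical vendor-specific tiled layout; the 0x17 is made up
 * purely for illustration. */
static const uint64_t EXAMPLE_TILED_MOD = fourcc_mod_code(AMD, 0x17);

/* Plain linear layout, defined by the core header. */
static const uint64_t LINEAR_MOD = DRM_FORMAT_MOD_NONE;

That token is the whole interchange format: no miptrees, no MSAA, just
a name for one flat layout.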

So I don't see how the above is relevant to the problems that the
allocator solves, unless we really can scan out, and exchange between
different-vendor GPUs, far more exotic buffer formats these days than
I thought.

> These are the components that need to work with the BO metadata:
> - Mesa driver backend
> - AMDGPU kernel driver

You've pretty correctly identified this though, and I'm happy to run
you through how Wayland works wrt EGL and buffer interchange on IRC,
if you'd like. But as DanV says, the client<->compositor protocol is
entirely contained within Mesa, so you can change it arbitrarily
without worrying about version desync.

> These are the components that should never know about the BO metadata:
> - Any Mesa shared code
> - EGL
> - GBM
> - Window system protocols
> - Display servers
> - DDXs

Again, most of these don't seem overly relevant, since the types of
allocations you're talking about are not going to transit these
components in the first place.

> The more components you need to change when the requirements change,
> the less economical the whole thing is, and the more painful the
> deployment is.

I don't think anyone disagrees; the point was to design this such that
no changes would be required to any of those components. As a trivial
example, between the GETPLANE2 ioctl and being able to pass modifiers
into GBM, Weston can now instruct Mesa to render buffers with
compression or exotic tiling formats, without ever needing specific
knowledge of what those formats mean. Adding more formats doesn't mean
changing Weston, because it doesn't know or care about the details.
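
To make that concrete, here's a minimal sketch of the compositor side,
assuming the with-modifiers GBM entry points that are landing in Mesa:
the compositor forwards whatever modifier list the kernel advertised
for the plane, and never interprets the values itself.

#include <gbm.h>
#include <stdint.h>

/* Sketch: allocate a scanout buffer from an opaque modifier list.
 * Mesa picks whichever layout it likes best from the list; the
 * compositor never needs to know what the tokens mean. */
static struct gbm_bo *
alloc_scanout_bo(struct gbm_device *gbm, uint32_t w, uint32_t h,
                 const uint64_t *mods, unsigned int n_mods)
{
    return gbm_bo_create_with_modifiers(gbm, w, h,
                                        GBM_FORMAT_XRGB8888,
                                        mods, n_mods);
}

gbm_bo_get_modifier() then reports which layout was actually chosen,
so the compositor can hand the same token straight back to KMS when it
creates the framebuffer.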

> Interop with other vendors would be trivial - the kernel drivers can
> exchange buffer layouts, and DRM can have an interface for it.

Describing them might not be the most difficult thing in the world,
though the regret starts to pile up as, thanks to the wonders of ARM
systems with PCIe slots, almost all of AMD / Intel / NVIDIA / ARM /
Qualcomm have to be mutually aware of each other's buffer-descriptor
layouts, and every version thereof (are you sure you'll never need
more than 96 bytes?). But how does that solve allocation? How does my
amdgpu kernel driver 'know' whether its buffers are going to be
scanned out or run through intermediate GPU composition, and
furthermore whether that will happen on an AMD, Intel, or NVIDIA GPU?
How does my Intel GPU know that its output will be consumed by a media
encode engine as well as scanned out, so it can't use exotic tiling
modes?

Putting this kind of negotiation in the kernel was roundly rejected a
long time ago, not least as the display pipeline arrangement is a
policy decision made by userspace, frame by frame.
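
For contrast, this is roughly what the userspace policy looks like:
gather each consumer's advertised modifier list and intersect them
before allocating. A purely hypothetical helper, but it shows why no
single kernel driver can make the call on its own:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: userspace intersects the modifier lists
 * advertised by every consumer (GPU, display, media engine) and
 * allocates with the survivors.  No one kernel driver has this
 * cross-device view. */
static size_t
intersect_modifiers(const uint64_t *a, size_t na,
                    const uint64_t *b, size_t nb,
                    uint64_t *out /* holds at least na entries */)
{
    size_t n = 0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i] == b[j]) {
                out[n++] = a[i];
                break;
            }
    return n; /* feed the surviving list to, e.g., GBM */
}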

> Userspace doesn't have to know about any of that. (It also seems kinda
> dangerous to use userspace as a middle man for passing the
> metadata/modifiers around)

Why dangerous? If it can be dangerous, i.e. a malicious userspace
driver can compromise your system, then I'd really be taking a hard
look at the validation in your kernel driver ...

> Speaking of compression for display, especially the separate
> compression buffer: That should be fully contained in the main DMABUF
> and described by the per-BO metadata. Some other drivers want to use a
> separate DMABUF for the compression buffer - while that may sound good
> in theory, it's not economical for the reason described above.

'Some other drivers want to use a separate DMABUF', or 'some other
hardware demands the data be separate'. Same with luma/chroma plane
separation. Anyway, it doesn't really matter unless you're sharing
render-compression formats across vendors, and AFBC is the only case
of that I know of currently.
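
For what it's worth, KMS can already express either choice: the
auxiliary compression data is just another plane in the framebuffer,
whether it sits at an offset inside the same BO or arrives as a
separate dmabuf. A sketch with libdrm, where 'compressed_mod' stands
in for a real vendor modifier:

#include <stdint.h>
#include <drm_fourcc.h>
#include <xf86drmMode.h>

/* Sketch: a two-plane framebuffer where plane 1 is the compression
 * metadata.  'aux_offset' places it inside the same BO; putting a
 * different handle in handles[1] would place it in a separate
 * dmabuf instead. */
static uint32_t
add_compressed_fb(int fd, uint32_t w, uint32_t h, uint32_t bo_handle,
                  uint32_t pitch, uint32_t aux_offset, uint32_t aux_pitch,
                  uint64_t compressed_mod)
{
    uint32_t fb_id = 0;
    uint32_t handles[4]   = { bo_handle, bo_handle };
    uint32_t pitches[4]   = { pitch, aux_pitch };
    uint32_t offsets[4]   = { 0, aux_offset };
    uint64_t modifiers[4] = { compressed_mod, compressed_mod };

    if (drmModeAddFB2WithModifiers(fd, w, h, DRM_FORMAT_XRGB8888,
                                   handles, pitches, offsets, modifiers,
                                   &fb_id, DRM_MODE_FB_MODIFIERS))
        return 0; /* creation failed */
    return fb_id;
}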

Cheers,
Daniel

