Unix Device Memory Allocation project

Marek Olšák maraeo at gmail.com
Tue Oct 18 23:40:41 UTC 2016


The text below describes how open source AMDGPU buffer sharing works.
I hope you'll find some useful bits in it.

Producer = allocates a buffer (or texture), exports its handle
(DMABUF, etc.), and can use the buffer in various ways

Consumer = imports the handle, and can use the buffer in various ways

*** Producer-consumer interaction. ***

1) On handle export, the producer receives these flags:

- READ, WRITE, READ+WRITE: Describe the expected usage in the consumer.
  * The producer decides if it needs to disable compression based on
those flags.

- EXPLICIT_FLUSH flag: The producer will explicitly receive a
"flush_resource" call before the consumer starts using the buffer.
This is a hint that the producer doesn't have to keep track of
"when to do decompression" when sharing the buffer with the consumer.

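The flag handling above can be sketched roughly as follows. This is
only an illustration: the flag values, the export_policy() helper and
the exact policy are assumptions, not the actual driver code. Only the
flag names and "flush_resource" come from the description above.

```python
from enum import IntFlag


class HandleUsage(IntFlag):
    # Hypothetical numeric values; only the names come from the text.
    READ = 1
    WRITE = 2
    EXPLICIT_FLUSH = 4


def export_policy(usage: HandleUsage) -> dict:
    """Return what a producer might decide at export time (assumed policy)."""
    explicit_flush = bool(usage & HandleUsage.EXPLICIT_FLUSH)
    consumer_writes = bool(usage & HandleUsage.WRITE)
    return {
        # Assumption: compression is disabled if the consumer may write,
        # or if no flush_resource call will tell us when to decompress.
        "disable_compression": consumer_writes or not explicit_flush,
        # Assumption: otherwise the buffer stays compressed and is
        # decompressed each time flush_resource is called.
        "decompress_on_flush_resource": explicit_flush and not consumer_writes,
    }
```

With this assumed policy, READ + EXPLICIT_FLUSH keeps the buffer
compressed, while plain READ or any WRITE usage disables compression
at export time.
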
2) Passing metadata (tiling, pixel ordering, format, layout) info
between the producer and consumer:

- All AMDGPU buffer/texture allocations have 256 bytes (64 dwords) of
internal per-allocation metadata storage that lives in the kernel
space. There are amdgpu-specific ioctls that can "set" and "get" the
metadata. Any process that has a buffer handle can do that.
  * The producer writes the metadata, the consumer reads it.

- The producer-consumer interop API doesn't know about the metadata.
All you need to pass around is a buffer handle. (KMS, DMABUF, etc.)
  * There was a note during the talk that DMABUF doesn't have any
metadata. Well, I just told you that it does, but it's private to
amdgpu and possibly accessible to other kernel drivers too.
  * We can build upon this idea. I think the worst thing to do would
be to add metadata handling to driver-agnostic userspace APIs. Really,
driver-agnostic APIs shouldn't know about that, because they can't
understand all the hw-specific information encoded in the metadata.
Also, when you want to change the metadata format, you only have to
update the affected drivers, not userspace APIs.
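
To make the division of labor concrete, here is a toy model of the
flow in section 2. ToyKernel and interop_pass_handle() are made-up
names standing in for the kernel-side storage and for a
driver-agnostic interop API. They are not the real amdgpu ioctls,
which this sketch does not attempt to reproduce.

```python
import itertools

METADATA_BYTES = 256  # 64 dwords, as stated above


class ToyKernel:
    """Stand-in for kernel-side per-allocation metadata storage."""

    def __init__(self):
        self._metadata = {}  # buffer handle -> metadata bytes
        self._next_handle = itertools.count(1)

    def alloc_buffer(self) -> int:
        handle = next(self._next_handle)
        self._metadata[handle] = bytes(METADATA_BYTES)
        return handle

    def set_metadata(self, handle: int, blob: bytes) -> None:
        if len(blob) > METADATA_BYTES:
            raise ValueError("metadata does not fit in 256 bytes")
        # Pad with zeros so the stored blob is always 256 bytes long.
        self._metadata[handle] = blob.ljust(METADATA_BYTES, b"\0")

    def get_metadata(self, handle: int) -> bytes:
        return self._metadata[handle]


def interop_pass_handle(handle: int) -> int:
    """Driver-agnostic API: forwards the handle and nothing else."""
    return handle
```

The producer calls set_metadata() on its handle, the handle alone goes
through interop_pass_handle(), and the consumer calls get_metadata()
on what it received. The interop layer never sees the metadata.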

3) Internal AMDGPU metadata storage format
- The header contains: Vendor ID, PCI ID, and version number.
- The header is followed by PCI-ID-specific data. The PCI ID and the
version number define the format.
- If the consumer runs on a different device, it must read the header
and parse the metadata based on that. It implies that the
driver-specific consumer code needs to know about all potential
producer devices.
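
A consumer-side parser for such a header could look roughly like the
sketch below. The field widths and ordering (dword 0 = version,
dword 1 = vendor ID in the high 16 bits and PCI ID in the low 16 bits)
are assumptions made for illustration, as are the pack_metadata() and
parse_metadata() helpers and the parser table. Only the header
contents and the 256-byte size come from the description above.

```python
import struct

# Assumed layout (little-endian): dword 0 = version,
# dword 1 = (vendor ID << 16) | PCI ID.
HEADER = struct.Struct("<II")
METADATA_BYTES = 256
AMD_VENDOR_ID = 0x1002  # PCI vendor ID assigned to AMD/ATI


def pack_metadata(version: int, vendor_id: int, pci_id: int,
                  payload: bytes) -> bytes:
    """Producer side: header followed by PCI-ID-specific data."""
    blob = HEADER.pack(version, (vendor_id << 16) | pci_id) + payload
    if len(blob) > METADATA_BYTES:
        raise ValueError("metadata does not fit in 256 bytes")
    return blob.ljust(METADATA_BYTES, b"\0")


def parse_metadata(blob: bytes, parsers: dict):
    """Consumer side: read the header, then dispatch on (PCI ID, version).

    `parsers` maps (pci_id, version) to a function that decodes the
    device-specific payload; it must cover every producer device the
    consumer is expected to understand.
    """
    version, ids = HEADER.unpack_from(blob)
    vendor_id, pci_id = ids >> 16, ids & 0xFFFF
    if vendor_id != AMD_VENDOR_ID:
        raise ValueError("unknown vendor")
    try:
        parser = parsers[(pci_id, version)]
    except KeyError:
        raise ValueError("unknown producer device or version") from None
    return parser(blob[HEADER.size:])
```

The lookup table is where the cost mentioned above shows up: every new
producer device or format version needs an entry in each consumer that
wants to import its buffers.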

Bottom line: DMABUF handles alone are fully sufficient for sharing
buffers/textures between devices and processes from the AMDGPU point
of view.

HW driver implementation: The driver doesn't know anything about the
users of exported or imported buffers. It only acts based on the few
flags described in section 1. So far that's all we've needed.

*** Use cases ***

1) DRI (producer: application; consumer: X server)
- The producer receives these flags: READ, EXPLICIT_FLUSH. The X
server will treat the shared "texture" as read-only. EXPLICIT_FLUSH
ensures the texture can be compressed, and "flush_resource" will be
called as part of SwapBuffers and on glFlush when rendering to GL_FRONT.
- The X server can run on a different device. In that case, the window
system API passes the "LINEAR" flag to the driver during allocation.
That's suboptimal and fixable.

2) OpenGL-OpenCL interop (OpenGL always exports handles, OpenCL always
imports handles)
- Possible flags: READ, WRITE, READ+WRITE
- OpenCL doesn't give us any other flags, so we are stuck with those.
- Inter-device sharing is possible if the consumer understands the
producer's metadata and tiling layouts.

(amdgpu actually stores 2 different metadata blocks per allocation,
but the simpler one is too limited and has only 8 bytes)


On Wed, Oct 5, 2016 at 1:47 AM, James Jones <jajones at nvidia.com> wrote:
> Hello everyone,
> As many are aware, we took up the issue of surface/memory allocation at XDC
> this year.  The outcome of that discussion was the beginnings of a design
> proposal for a library that would serve as a cross-device, cross-process
> surface allocator.  In the past week I've started to condense some of my
> notes from that discussion down to code & a design document.  I've posted
> the first pieces to a github repository here:
>   https://github.com/cubanismo/allocator
> This isn't anything close to usable code yet.  Just headers and docs, and
> incomplete ones at that.  However, feel free to check it out if you're
> interested in discussing the design.
> Thanks,
> -James
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
