dma-buf non-coherent mmap

Thomas Hellstrom thellstrom at vmware.com
Thu Oct 31 21:40:53 CET 2013


On 10/31/2013 06:52 PM, Rob Clark wrote:
> On Thu, Oct 31, 2013 at 1:00 PM, Thomas Hellstrom <thellstrom at vmware.com> wrote:
>> Hi!
>>
>> I'm just looking over what's needed to implement drm Prime / dma-buf exports
>> + imports in the vmwgfx driver. It seems like most dma-bufs ops are quite
>> straightforward to implement except user-space mmap().
>>
>> The reason being that vmwgfx dma-bufs will be using completely non-coherent
>> memory whenever CPU access is needed.
>>
>> The accelerated contents reside in an opaque structure on the device, which
>> we can DMA to and from, so for mmap to work we need to zap ptes and
>> DMA to the device when doing something accelerated, and on the first
>> page-fault DMA data back and wait for idle if the device did a write to the
>> dma-buf.
>>
>> Now this shouldn't really be a problem if dma-bufs were only used for
>> cross-device sharing, but since people apparently want to use dma-buf file
>> handles to share CPU data between processes it really becomes a serious
>> problem.
>>
>> Needless to say we'd want to limit the size of the DMAs, and have mmap users
>> request regions for read, and mark regions dirty for write, something
>> similar to gallium's texture transfer objects.
>>
>> Any ideas?
> well, I think vmwgfx is part of the reason we decided mmap would be
> optional for dmabuf.  So perhaps it is an option to simply ignore
> mmap?
>
> BR,
> -R

Well, I'd be happy to avoid mmap, but then what does optional mean in 
this context?
That all generic user-space apps *must* implement a workaround if mmap 
isn't implemented?
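
To make the concern concrete, here is a minimal sketch of what every generic client would have to do if mmap is optional. The helper name try_map_buffer is made up for illustration; mmap() itself is the standard POSIX call. A NULL return means the client needs some other transfer path, and the worry is that nobody will bother to write one.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: try to map a buffer fd for CPU access.
 * Returns the mapping, or NULL if the fd cannot be mapped (for example
 * because the exporter does not implement mmap). On NULL the caller
 * must fall back to some other way of moving the data.
 */
void *try_map_buffer(int fd, size_t len)
{
    assert(len > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

A client would call try_map_buffer(fd, size) and, on NULL, switch to its copy-based fallback, if it has one.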

Unfortunately, it's a bit like the implicit synchronization mentioned in 
section 3) of "Direct Userspace Access/mmap Support"
in the kernel dma-buf doc: it should be avoided, because otherwise 
userspace might come to rely on it, and exporters
that don't implement it will suffer.

In reality, people will start using mmap() and won't care to implement 
workarounds if it's not supported, and drivers like
vmwgfx and non-coherent architectures will suffer.

I haven't looked closely at how DRI3 or Wayland/weston use or will use 
dma-buf, but if they rely on mmap, we're sort
of lost. MIR uses the following scheme:
1) Create a GBM buffer
2) Get a Prime handle, export to client
3) Client imports the prime handle and casts it into a GBM buffer, which 
is then typecast to a dumb buffer
4) Client uses DRM dumb mmap().   :(.

Now, step 3) and step 4) are clearly API violations, but they work by 
coincidence on major drivers.
This usage pattern could easily (and IMHO should) be blocked in DRM 
by refusing to map dumb buffers
that haven't been created as dumb buffers, but if dma-bufs are to be 
shared by user-space apps to transfer
contents in this manner, we need a reliable replacement that also works 
in non-coherent situations.
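
As a rough sketch of the kind of check meant here: the struct and function below are hypothetical stand-ins, not real DRM types or entry points. The only assumption is that the buffer object records whether it came from the dumb-create path, and that the dumb map path refuses anything else.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver buffer object; not a real DRM type. */
struct sketch_bo {
    size_t size;
    bool created_as_dumb; /* set only by the dumb-create path */
};

/*
 * Hypothetical check for the dumb mmap path.
 * Returns 0 if the buffer may be mapped as a dumb buffer,
 * -EINVAL if it was imported or created some other way.
 */
int sketch_dumb_map_check(const struct sketch_bo *bo)
{
    if (bo == NULL || !bo->created_as_dumb)
        return -EINVAL;

    return 0;
}
```

With a check like this, the cast in step 3) would fail at step 4) instead of working by accident.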

Ideally I'd like to see tex[sub]image-like operations, but with dma-buf 
mmap in place, I'm afraid that's what will
be used...
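
To illustrate what tex[sub]image-like operations could look like from user-space, here is a small model. Everything in it is hypothetical: the names are made up, and memcpy() stands in for the DMA to and from the device. The point is only that the client names an explicit region for each read or write, so the exporter knows exactly what to transfer and when.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_STORE_SIZE 4096

/* Hypothetical stand-in for device-side contents the CPU cannot map. */
struct sketch_buf {
    unsigned char store[SKETCH_STORE_SIZE];
};

/* Range check written so that offset + len cannot overflow. */
static int sketch_range_ok(size_t offset, size_t len)
{
    return offset <= SKETCH_STORE_SIZE && len <= SKETCH_STORE_SIZE - offset;
}

/* Upload a region: copy len bytes from src into the buffer at offset. */
int sketch_buf_write(struct sketch_buf *b, size_t offset,
                     const void *src, size_t len)
{
    if (!sketch_range_ok(offset, len))
        return -1;
    memcpy(b->store + offset, src, len); /* stands in for DMA to the device */
    return 0;
}

/* Read back a region: copy len bytes at offset from the buffer into dst. */
int sketch_buf_read(const struct sketch_buf *b, size_t offset,
                    void *dst, size_t len)
{
    if (!sketch_range_ok(offset, len))
        return -1;
    memcpy(dst, b->store + offset, len); /* stands in for DMA from the device */
    return 0;
}
```

Because every access goes through an explicit call with a bounded region, there is no need to zap ptes or to guess from page faults which part of the buffer the CPU touched.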

Regards,
Thomas

