dma-buf non-coherent mmap
Thomas Hellstrom
thellstrom at vmware.com
Mon Nov 4 11:48:38 CET 2013
On 11/04/2013 11:22 AM, Lucas Stach wrote:
> Hi Thomas,
>
> I inlined the patch, to discuss more easily.
>
> On Monday, 2013-11-04 at 08:53 +0100, Thomas Hellstrom wrote:
>> enum dma_buf_data_direction {
>> 	DMA_BUF_BIDIRECTIONAL,
>> 	DMA_BUF_TO_DEVICE,
>> 	DMA_BUF_FROM_DEVICE
>> };
> I don't think the DMA API semantics make much sense here. Doing a sync
> from device is at least a bit complicated, as we would have to track
> down the device in whose domain the buffer currently resides for
> write. This may be doable by bouncing the sync op from the exporter to
> the right attachment.
>
> The other way around is even messier: a DMA-BUF is by definition
> shared between multiple devices, so which of them should you sync to?
> All of them?
>
> I think the API used for userspace access should only be concerned with
> making sure the CPU gets a coherent view of the resource at the point of
> sharing, i.e. sysmem. All DMA-BUF users should make sure that they sync
> their writes to the point of sharing strictly before signaling that they
> are done with the buffer.
> Maybe this is just a naming problem, but IMHO the API should make it
> really clear that the sync only makes sure that the data reaches the
> point of sharing.
Good point. I think the options need to remain, but we could rename
DEVICE to STORAGE, or something similar. In our case,
sync_for_cpu(DMA_BUF_TO_STORAGE) would be a no-op, whereas
sync_for_cpu(DMA_BUF_BIDIRECTIONAL) would trigger a DMA read.
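
Roughly something like this (just a sketch of the renamed direction
values; final naming is of course up for discussion):

enum dma_buf_data_direction {
	DMA_BUF_BIDIRECTIONAL,
	DMA_BUF_TO_STORAGE,
	DMA_BUF_FROM_STORAGE
};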
>
>> /**
>> * struct dma_buf_sync_region_arg - Argument to the region sync ioctls
>> *
>> * @start_x: Upper left X coordinate of area to be synced.
>> * Unit is format-dependent.
>> * @start_y: Upper left Y coordinate of area to be synced.
>> * Unit is format-dependent.
>> * @width: Width of area to be synced. Grows to the right.
>> * Unit is format-dependent.
>> * @height: Height of area to be synced. Grows downwards.
>> * Unit is format-dependent.
> While I see why you would need this, I don't really like the concept of a
> two-dimensional resource pushed into the API. You already need the
> linear flag to make this work with 1D resources, and I don't see what
> happens when you encounter 3D or array resources.
>
> Don't take this as a strong rejection, as I see why this is useful, but I
> don't like how the interface looks right now.
I understand. I don't think a 1D-only syncing option would suffice
performance-wise, as the use-case I'm most afraid of is people trying to
share 2D content between user-space processes.

What do you think about supplying descriptor objects that could describe
either a linear area, a 2D area or whatever else is needed? Whenever (if)
a new descriptor type is added to the interface, we could have a legacy
implementation that implements it in terms of 1D and 2D syncs.
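
Just to sketch what I have in mind (names and layout are purely
illustrative, not a proposal):

/* Hypothetical sync descriptor; @type selects which union member is valid. */
struct dma_buf_sync_desc {
	__u32 type;	/* e.g. DMA_BUF_SYNC_DESC_LINEAR or DMA_BUF_SYNC_DESC_2D */
	__u32 pad;
	union {
		struct {
			__u64 offset;	/* Byte offset into the buffer. */
			__u64 length;	/* Number of bytes to sync. */
		} linear;
		struct {
			__u32 start_x;	/* Upper left X, format-dependent units. */
			__u32 start_y;	/* Upper left Y, format-dependent units. */
			__u32 width;
			__u32 height;
		} area_2d;
	};
};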
>
>> * @direction: Intended transfer direction of data.
>> * @flags: Flags to tune the synchronizing behaviour.
>> */
>>
>> struct dma_buf_sync_region_arg {
>> 	__u32 buf_identifier;
>> 	__u32 start_x;
>> 	__u32 start_y;
>> 	__u32 width;
>> 	__u32 height;
>> 	enum dma_buf_data_direction direction;
>> 	__u32 flags;
>> 	__u32 pad;
>> };
>>
>> /**
>> * Force sync any outstanding rendering of the exporter before returning.
>> * This is strictly a performance hint. The user may omit this flag to
>> * speed up execution of the call, if it is known that previous rendering
>> * affecting (by read or write) the synchronized area has already finished.
>> */
>> #define DMA_BUF_SYNC_FLAG_SYNC (1 << 0)
>>
>> /**
>> * Treat @start_x as a byte offset into the buffer and @width as the byte
>> * synchronization length. The values of @start_y and @height are ignored.
>> * (Separate ioctl?)
>> */
>> #define DMA_BUF_SYNC_FLAG_LINEAR (1 << 1)
>>
>> /**
>> * Allow the implementation to coalesce sync_for_device calls, until either
>> * a) An explicit flush
>> * b) A sync for cpu call with DMA_BUF_BIDIRECTIONAL or DMA_BUF_TO_DEVICE
>> *
>> * Note: An implementation may choose to ignore this flag.
>> */
>> #define DMA_BUF_SYNC_FLAG_COALESCE (1 << 2)
>>
>> /**
>> * IOCTLs
>> *
>> * Kernel waits should, if possible, be performed interruptibly, and the
>> * ioctl may set errno to EINTR if the ioctl needs to be restarted.
>> * To be discussed: A sync operation must not affect areas outside the
>> * region indicated. (Good for vmwgfx, but plays ill with cache line
>> * alignment.)
>> */
>>
>> /**
>> * Sync for CPU.
>> */
>> #define DMA_BUF_SYNC_REGION_FOR_CPU \
>> _IOW(DMA_BUF_BASE, 0, struct dma_buf_sync_region_arg)
>>
>> /**
>> * Sync for device. This is the default state of a dma-buf.
>> */
>> #define DMA_BUF_SYNC_REGION_FOR_DEVICE \
>> _IOW(DMA_BUF_BASE, 1, struct dma_buf_sync_region_arg)
>>
>> /**
>> * Flush any coalesced SYNC_REGION_FOR_DEVICE
>> */
>> #define DMA_BUF_SYNC_REGION_FLUSH \
>> _IOW(DMA_BUF_BASE, 2, __u32)
>>
> What is the use-case for this? Why don't you think that userspace could
> be smart enough to coalesce the flush on its own?
Right. A thinko. We could clearly leave that out.
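
For the record, a rough sketch of how user-space might drive the two
remaining ioctls around a CPU access (assuming the definitions above end
up in a uapi header; the function and variable names are only illustrative):

#include <sys/ioctl.h>
#include <linux/types.h>

/* Assumes the proposed dma-buf sync region definitions are available,
 * e.g. from a future uapi header. */
static int cpu_access_region(int dmabuf_fd, __u32 x, __u32 y,
			     __u32 w, __u32 h)
{
	struct dma_buf_sync_region_arg arg = {
		.start_x = x,
		.start_y = y,
		.width = w,
		.height = h,
		.direction = DMA_BUF_BIDIRECTIONAL,
		.flags = DMA_BUF_SYNC_FLAG_SYNC,
	};

	/* Make the region CPU-coherent before touching it through the mmap. */
	if (ioctl(dmabuf_fd, DMA_BUF_SYNC_REGION_FOR_CPU, &arg))
		return -1;

	/* ... CPU reads/writes of the mapped region go here ... */

	/* Hand the region back to the device before further device access. */
	return ioctl(dmabuf_fd, DMA_BUF_SYNC_REGION_FOR_DEVICE, &arg);
}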
/Thomas