[RFC v4 13/18] vb2: Don't sync cache for a buffer if so requested

Tomasz Figa tfiga at chromium.org
Wed May 10 11:00:10 UTC 2017


Hi Sakari,

A few comments inline.

On Mon, May 8, 2017 at 11:03 PM, Sakari Ailus
<sakari.ailus at linux.intel.com> wrote:
> From: Samu Onkalo <samu.onkalo at intel.com>
>
> The user may request to the driver (vb2) to skip the cache maintenance
> operations in case the buffer does not need cache synchronisation, e.g. in
> cases where the buffer is passed between hardware blocks without it being
> touched by the CPU.
[snip]
> @@ -1199,6 +1236,11 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
>                         dprintk(1, "buffer initialization failed\n");
>                         goto err;
>                 }
> +
> +               /* This is new buffer memory --- always synchronise cache. */
> +               __mem_prepare_planes(vb);
> +       } else if (!no_cache_sync) {
> +               __mem_prepare_planes(vb);

Do we actually need this at all for DMABUF, given that the respective
callbacks in both vb2_dc and vb2_sg already bail out for DMABUF buffers?

>         }
>
>         ret = call_vb_qop(vb, buf_prepare, vb);
[snip]
> @@ -568,7 +571,11 @@ int vb2_qbuf(struct vb2_queue *q, struct v4l2_buffer *b)
>         }
>
>         ret = vb2_queue_or_prepare_buf(q, b, "qbuf");
> -       return ret ? ret : vb2_core_qbuf(q, b->index, b);
> +       if (ret)
> +               return ret;
> +
> +       return vb2_core_qbuf(q, b->index, b,
> +                            b->flags & V4L2_BUF_FLAG_NO_CACHE_SYNC);

Can we really let userspace alone control this? I believe there
are drivers that need to do some fixup in OUTPUT buffers before
sending them to the hardware, or in CAPTURE buffers after getting them
back from the hardware, in buf_prepare or buf_finish respectively. I
believe such drivers need to be able to override this behavior.
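
To illustrate the kind of override I have in mind, here is a minimal
standalone sketch. It is not vb2 code: struct sketch_queue, its
driver_touches_buffers field and SKETCH_FLAG_NO_CACHE_SYNC (with its
placeholder value) are all made up for the example. The only point is
that the userspace hint is honoured only if the driver has not declared
that it accesses buffer contents with the CPU.

```c
#include <stdbool.h>

/* Placeholder standing in for the proposed userspace flag; value is arbitrary. */
#define SKETCH_FLAG_NO_CACHE_SYNC 0x1u

/* Hypothetical per-queue state; vb2 has no field with this name. */
struct sketch_queue {
	/* Set by a driver that fixes up buffers in buf_prepare/buf_finish. */
	bool driver_touches_buffers;
};

/* Return true if cache maintenance may be skipped for this queue operation. */
bool sketch_skip_cache_sync(const struct sketch_queue *q, unsigned int flags)
{
	/* The driver's CPU access always wins over the userspace hint. */
	if (q->driver_touches_buffers)
		return false;

	return (flags & SKETCH_FLAG_NO_CACHE_SYNC) != 0;
}
```

With something like this, a driver doing fixups would still get its
caches synchronised even if userspace passes the flag.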

Actually, I'm wondering if we really need this buffer flag at all.
Wouldn't the following work for the typical use cases whose
performance we actually care about?

buffer_needs_cache_sync =
        (buffer_type_is_MMAP && buffer_is_non_coherent &&
         (buffer_is_mmapped || buffer_has_kernel_mapping)) ||
        buffer_is_USERPTR

The above should cover all the fast paths that are used only to
exchange data between devices, without the CPU involved, assuming that
drivers that don't need the fixups I mentioned before are properly
updated to request no kernel mapping for allocated buffers.

I've added (buffer_is_USERPTR) to the equation as it's really hard to
imagine a use case where there is no CPU access to the buffer, but
USERPTR needs to be used (instead of DMABUF). I might be missing
something, though.
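
For clarity, the condition above could be written out roughly as the
standalone C sketch below. All names here (enum sketch_memory, struct
sketch_buf and its fields) are hypothetical and only mirror the terms
of the expression; they are not existing vb2 structures, and how vb2
would actually track "mmapped" or "has kernel mapping" is left open.

```c
#include <stdbool.h>

/* Hypothetical memory types, mirroring the MMAP/USERPTR/DMABUF split. */
enum sketch_memory {
	SKETCH_MMAP,
	SKETCH_USERPTR,
	SKETCH_DMABUF,
};

/* Hypothetical per-buffer state; one field per term of the expression. */
struct sketch_buf {
	enum sketch_memory memory;
	bool non_coherent;	/* allocated from non-coherent memory */
	bool mmapped;		/* currently mapped into userspace */
	bool kernel_mapping;	/* the driver requested a kernel mapping */
};

/* Return true if the buffer needs cache maintenance under the proposal. */
bool sketch_needs_cache_sync(const struct sketch_buf *b)
{
	/* USERPTR: assume the CPU touches the buffer. */
	if (b->memory == SKETCH_USERPTR)
		return true;

	/* MMAP: only non-coherent buffers that have some CPU mapping. */
	if (b->memory == SKETCH_MMAP)
		return b->non_coherent &&
		       (b->mmapped || b->kernel_mapping);

	/* Everything else (DMABUF): the expression evaluates to false. */
	return false;
}
```

A non-coherent MMAP buffer that is neither mmapped nor kernel-mapped,
i.e. the device-to-device fast path, would then skip the sync without
any flag from userspace.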

Best regards,
Tomasz

