[PATCH] media: videobuf2: sync caches for dmabuf memory

Hans Verkuil hverkuil-cisco at xs4all.nl
Thu Jun 20 06:51:56 UTC 2024


On 19/06/2024 06:19, Tomasz Figa wrote:
> On Wed, Jun 19, 2024 at 1:24 AM Nicolas Dufresne <nicolas at ndufresne.ca> wrote:
>>
>> Le mardi 18 juin 2024 à 16:47 +0900, Tomasz Figa a écrit :
>>> Hi TaoJiang,
>>>
>>> On Tue, Jun 18, 2024 at 4:30 PM TaoJiang <tao.jiang_2 at nxp.com> wrote:
>>>>
>>>> From: Ming Qian <ming.qian at nxp.com>
>>>>
>>>> When the memory type is VB2_MEMORY_DMABUF, the v4l2 device can't know
>>>> whether the dma buffer is coherent or synchronized.
>>>>
>>>> The videobuf2-core will skip cache syncs as it think the DMA exporter
>>>> should take care of cache syncs
>>>>
>>>> But in fact it's likely that the client doesn't
>>>> synchronize the dma buf before qbuf() or after dqbuf(). and it's
>>>> difficult to find this type of error directly.
>>>>
>>>> I think it's helpful that videobuf2-core can call
>>>> dma_buf_end_cpu_access() and dma_buf_begin_cpu_access() to handle the
>>>> cache syncs.
>>>>
>>>> Signed-off-by: Ming Qian <ming.qian at nxp.com>
>>>> Signed-off-by: TaoJiang <tao.jiang_2 at nxp.com>
>>>> ---
>>>>  .../media/common/videobuf2/videobuf2-core.c   | 22 +++++++++++++++++++
>>>>  1 file changed, 22 insertions(+)
>>>>
>>>
>>> Sorry, that patch is incorrect. I believe you're misunderstanding the
>>> way DMA-buf buffers should be managed in the userspace. It's the
>>> userspace responsibility to call the DMA_BUF_IOCTL_SYNC ioctl [1] to
>>> signal start and end of CPU access to the kernel and imply necessary
>>> cache synchronization.
>>>
>>> [1] https://docs.kernel.org/driver-api/dma-buf.html#dma-buffer-ioctls
>>>
>>> So, really sorry, but it's a NAK.
>>
>>
>>
>> This patch *could* make sense if it was inside UVC Driver as an example, as this
>> driver can import dmabuf, to CPU memcpy, and does omits the required sync calls
>> (unless that got added recently, I can easily have missed it).
> 
> Yeah, currently V4L2 drivers don't call the in-kernel
> dma_buf_{begin,end}_cpu_access() when they need to access the buffers
> from the CPU, while my quick grep [1] reveals that we have 68 files
> retrieving plane vaddr by calling vb2_plane_vaddr() (not necessarily a
> 100% guarantee of CPU access being done, but rather likely so).
> 
> I also repeated the same thing with VB2_DMABUF [2] and tried to
> attribute both lists to specific drivers (by retaining the path until
> the first - or _ [3]; which seemed to be relatively accurate), leading
> to the following drivers that claim support for DMABUF while also
> retrieving plane vaddr (without proper synchronization - no drivers
> currently call any begin/end CPU access):
> 
>  i2c/video
>  pci/bt8xx/bttv
>  pci/cobalt/cobalt
>  pci/cx18/cx18
>  pci/tw5864/tw5864
>  pci/tw686x/tw686x
>  platform/allegro
>  platform/amphion/vpu
>  platform/chips
>  platform/intel/pxa
>  platform/marvell/mcam
>  platform/mediatek/jpeg/mtk
>  platform/mediatek/vcodec/decoder/mtk
>  platform/mediatek/vcodec/encoder/mtk
>  platform/nuvoton/npcm
>  platform/nvidia/tegra
>  platform/nxp/imx
>  platform/renesas/rcar
>  platform/renesas/vsp1/vsp1
>  platform/rockchip/rkisp1/rkisp1
>  platform/samsung/exynos4
>  platform/samsung/s5p
>  platform/st/sti/delta/delta
>  platform/st/sti/hva/hva
>  platform/verisilicon/hantro
>  usb/au0828/au0828
>  usb/cx231xx/cx231xx
>  usb/dvb
>  usb/em28xx/em28xx
>  usb/gspca/gspca.c
>  usb/hackrf/hackrf.c
>  usb/stk1160/stk1160
>  usb/uvc/uvc
> 
> which means we potentially have ~30 drivers which likely don't handle
> imported DMABUFs correctly (there is still a chance that DMABUF is
> advertised for one queue, while vaddr is used for another).
> 
> I think we have two options:
> 1) add vb2_{begin/end}_cpu_access() helpers, carefully audit each
> driver and add calls to those

I actually started on that 9 (!) years ago:

https://git.linuxtv.org/hverkuil/media_tree.git/log/?h=vb2-cpu-access

If memory serves, the main problem was that there were some drivers where
it wasn't clear what should be done. In the end I never continued this
work since nobody complained about it.

This patch series adds vb2_plane_begin/end_cpu_access() functions,
replaces all calls to vb2_plane_vaddr() in drivers to the new functions,
and at the end removes vb2_plane_vaddr() altogether.

> 2) take a heavy gun approach and just call vb2_begin_cpu_access()
> whenever vb2_plane_vaddr() is called and then vb2_end_cpu_access()
> whenever vb2_buffer_done() is called (if begin was called before).
> 
> The latter has the disadvantage of drivers not having control over the
> timing of the cache sync, so could end up with less than optimal
> performance. Also there could be some more complex cases, where the
> driver needs to mix DMA and CPU accesses to the buffer, so the fixed
> sequence just wouldn't work for them. (But then they just wouldn't
> work today either.)
> 
> Hans, Marek, do you have any thoughts? (I'd personally just go with 2
> and if any driver in the future needs something else, they could call
> begin/end CPU access manually.)

I prefer 1. If nothing else, that makes it easy to identify drivers
that do such things.

But perhaps a mix is possible: if a VB2 flag is set by the driver, then
approach 2 is used. That might help with the drivers where it isn't clear
what they should do. Although perhaps this can all be done in the driver
itself: instead of vb2_plane_vaddr they call vb2_begin_cpu_access for the
whole buffer, and at buffer_done time they call vb2_end_cpu_access. Should
work just as well for the very few drivers that need this.

Regards,

	Hans

> 
> [1] git grep vb2_plane_vaddr | cut -d":" -f 1 | sort | uniq
> [2] git grep VB2_DMABUF | cut -d":" -f 1 | sort | uniq
> [3] by running [1] and [2] through | cut -d"-" -f 1 | cut -d"_" -f 1 | uniq
> 
> Best,
> Tomasz
> 
>>
>> But generally speaking, bracketing all driver with CPU access synchronization
>> does not make sense indeed, so I second the rejection.
>>
>> Nicolas
>>
>>>
>>> Best regards,
>>> Tomasz
>>>
>>>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>>>> index 358f1fe42975..4734ff9cf3ce 100644
>>>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>>>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>>>> @@ -340,6 +340,17 @@ static void __vb2_buf_mem_prepare(struct vb2_buffer *vb)
>>>>         vb->synced = 1;
>>>>         for (plane = 0; plane < vb->num_planes; ++plane)
>>>>                 call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
>>>> +
>>>> +       if (vb->memory != VB2_MEMORY_DMABUF)
>>>> +               return;
>>>> +       for (plane = 0; plane < vb->num_planes; ++plane) {
>>>> +               struct dma_buf *dbuf = vb->planes[plane].dbuf;
>>>> +
>>>> +               if (!dbuf)
>>>> +                       continue;
>>>> +
>>>> +               dma_buf_end_cpu_access(dbuf, vb->vb2_queue->dma_dir);
>>>> +       }
>>>>  }
>>>>
>>>>  /*
>>>> @@ -356,6 +367,17 @@ static void __vb2_buf_mem_finish(struct vb2_buffer *vb)
>>>>         vb->synced = 0;
>>>>         for (plane = 0; plane < vb->num_planes; ++plane)
>>>>                 call_void_memop(vb, finish, vb->planes[plane].mem_priv);
>>>> +
>>>> +       if (vb->memory != VB2_MEMORY_DMABUF)
>>>> +               return;
>>>> +       for (plane = 0; plane < vb->num_planes; ++plane) {
>>>> +               struct dma_buf *dbuf = vb->planes[plane].dbuf;
>>>> +
>>>> +               if (!dbuf)
>>>> +                       continue;
>>>> +
>>>> +               dma_buf_begin_cpu_access(dbuf, vb->vb2_queue->dma_dir);
>>>> +       }
>>>>  }
>>>>
>>>>  /*
>>>> --
>>>> 2.43.0-rc1
>>>>
>>
> 



More information about the dri-devel mailing list