[PATCH 2/2] [WIP]: media: Add Synaptics compressed tiled format

Sat Aug 20 00:10:38 UTC 2022

On 8/20/22 03:17, Nicolas Dufresne wrote:
> On Friday, 19 August 2022 at 23:44 +0800, Hsia-Jun Li wrote:
>>
>> On 8/19/22 23:28, Nicolas Dufresne wrote:
>>> On Friday, 19 August 2022 at 02:13 +0300, Laurent Pinchart wrote:
>>>> On Thu, Aug 18, 2022 at 02:33:42PM +0800, Hsia-Jun Li wrote:
>>>>> On 8/18/22 14:06, Tomasz Figa wrote:
>>>>>> On Tue, Aug 9, 2022 at 1:28 AM Hsia-Jun Li <randy.li at synaptics.com> wrote:
>>>>>>>
>>>>>>> From: "Hsia-Jun(Randy) Li" <randy.li at synaptics.com>
>>>>>>>
>>>>>>> Most of the details have already been written on the DRM side.
>>>>
>>>> This patch still needs a description of the format, which should go to
>>>> Documentation/userspace-api/media/v4l/.
>>>>
>>>>>>> Please note that the tiled formats here require
>>>>>>> one more plane for storing the motion vector metadata.
>>>>>>> This buffer won't be compressed, so it can't be appended
>>>>>>> to the luma or chroma plane.
>>>>>>
>>>>>> Does the motion vector buffer need to be exposed to userspace? Is the
>>>>>> decoder stateless (requires userspace to specify the reference frames)
>>>>>> or stateful (manages the entire decoding process internally)?
>>>>>
>>>>> No, users don't need to access them at all. They just need a different
>>>>> dma-heap.
>>>>>
>>>>> You would only get the stateful version of both encoder and decoder.
>>>>
>>>> Shouldn't the motion vectors be stored in a separate V4L2 buffer,
>>>> submitted through a different queue then ?
>>>
>>> Imho, these should be invisible to users and pooled separately to
>>> reduce the overhead. The number of references is usually lower than the number
>>> of allocated display buffers.
>>>
>> You can't. The motion vector buffer can't be shared with the luma and chroma
>> data planes, nor with the data plane for the compression metadata.
>>
>> You could consider this a security requirement (the memory region for
>> the MV can only be accessed by the decoder) or a hardware limitation.
>>
>> It is also not easy to manage such a large buffer, whose size changes
>> whenever the resolution changes.
> 
> Your arguments just point toward the fact that you should not let the user
> allocate these in the first place. They should not be bound to the v4l2 buffer.
> Allocate these in your driver, and leave the pixel buffer (and compression
> metadata) allocation work to your user.
> 
What I want to say is that userspace could allocate the buffers and then
have the v4l2 decoder import them, but each plane should come from the
right DMA-heap. Userspace usually knows the memory usage better, so this
would bring some flexibility.

Currently, there is another thing that bothers me: I need to allocate a
small piece of memory (less than 128KiB) for the compression metadata
buffers I mentioned here. These pieces of memory should be located in a
small region, or performance could be badly hurt. Besides, we don't
support IOMMU for this kind of data.

Any idea how to assign a small piece of memory from pre-allocated
memory or a selected region (I don't think I could reserve them in a
DMA-heap) to a plane in an MMAP type buffer?

Besides, I am not very satisfied with the dynamic resolution change
steps, if I understand them correctly. Buffer reallocation should happen
when we receive the event, rather than waiting until the drain is done.
A resolution increase is very common when playing a network stream. It
would be better if the decoder decided how many buffers it needs for the
previous sequence while userspace reallocates the rest of the buffers in
the CAPTURE queue.
> Other drivers handle this just fine. If your v4l2 driver implements the v4l2
> resolution change mechanism, it should be very simple to manage.
> 
>>>>
>>>>>>> Signed-off-by: Hsia-Jun(Randy) Li <randy.li at synaptics.com>
>>>>>>> ---
>>>>>>>     drivers/media/v4l2-core/v4l2-common.c | 1 +
>>>>>>>     drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
>>>>>>>     include/uapi/linux/videodev2.h        | 2 ++
>>>>>>>     3 files changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> index e0fbe6ba4b6c..f645278b3055 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-common.c
>>>>>>> @@ -314,6 +314,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
>>>>>>>                    { .format = V4L2_PIX_FMT_SGBRG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SGRBG12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>>                    { .format = V4L2_PIX_FMT_SRGGB12,       .pixel_enc = V4L2_PIXEL_ENC_BAYER, .mem_planes = 1, .comp_planes = 1, .bpp = { 2, 0, 0, 0 }, .hdiv = 1, .vdiv = 1 },
>>>>>>> +               { .format = V4L2_PIX_FMT_NV12M_V4H1C, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 5, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .hdiv = 2, .vdiv = 2, .block_w = { 128, 128 }, .block_h = { 128, 128 } },
>>>>>>>            };
>>>>>>>            unsigned int i;
>>>>>>>
>>>>>>> diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> index e6fd355a2e92..8f65964aff08 100644
>>>>>>> --- a/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> +++ b/drivers/media/v4l2-core/v4l2-ioctl.c
>>>>>>> @@ -1497,6 +1497,8 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
>>>>>>>                    case V4L2_PIX_FMT_MT21C:        descr = "Mediatek Compressed Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC08C:        descr = "QCOM Compressed 8-bit Format"; break;
>>>>>>>                    case V4L2_PIX_FMT_QC10C:        descr = "QCOM Compressed 10-bit Format"; break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_V4H1C:  descr = "Synaptics Compressed 8-bit tiled Format"; break;
>>>>>>> +               case V4L2_PIX_FMT_NV12M_10_V4H3P8C:     descr = "Synaptics Compressed 10-bit tiled Format"; break;
>>>>>>>                    default:
>>>>>>>                            if (fmt->description[0])
>>>>>>>                                    return;
>>>>>>> diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
>>>>>>> index 01e630f2ec78..7e928cb69e7c 100644
>>>>>>> --- a/include/uapi/linux/videodev2.h
>>>>>>> +++ b/include/uapi/linux/videodev2.h
>>>>>>> @@ -661,6 +661,8 @@ struct v4l2_pix_format {
>>>>>>>     #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_8L128      v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
>>>>>>>     #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_V4H1C v4l2_fourcc('S', 'Y', '1', '2')   /* 12  Y/CbCr 4:2:0 tiles */
>>>>>>> +#define V4L2_PIX_FMT_NV12M_10_V4H3P8C v4l2_fourcc('S', 'Y', '1', '0')   /* 12  Y/CbCr 4:2:0 10-bits tiles */
>>>>>>>
>>>>>>>     /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
>>>>>>>     #define V4L2_PIX_FMT_SBGGR8  v4l2_fourcc('B', 'A', '8', '1') /*  8  BGBG.. GRGR.. */
>>>>
>>>
>>
> 
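As a rough illustration of the numbers in the quoted v4l2_format_info entry
(bpp = { 1, 2 }, hdiv = vdiv = 2, 128x128 blocks), the sketch below computes
the two fourcc values and the uncompressed luma and chroma payload sizes. It
assumes that the 128x128 block size simply means aligning width and height to
128; the real compressed plane sizes, and the sizes of the compression
metadata and motion vector planes, are not specified in this thread and are
not modelled. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same byte packing as the v4l2_fourcc() macro in videodev2.h. */
uint32_t fourcc(char a, char b, char c, char d)
{
	return (uint32_t)a | ((uint32_t)b << 8) |
	       ((uint32_t)c << 16) | ((uint32_t)d << 24);
}

/* Round v up to the next multiple of a. */
uint32_t align_up(uint32_t v, uint32_t a)
{
	return (v + a - 1) / a * a;
}

/* Luma: 1 byte per pixel over the 128-aligned frame (assumption). */
uint32_t luma_size(uint32_t width, uint32_t height)
{
	return align_up(width, 128) * align_up(height, 128);
}

/*
 * Chroma: interleaved CbCr, 2 bytes per sample pair, subsampled by
 * hdiv = 2 and vdiv = 2 over the same 128-aligned frame (assumption).
 */
uint32_t chroma_size(uint32_t width, uint32_t height)
{
	return (align_up(width, 128) / 2) * (align_up(height, 128) / 2) * 2;
}
```

For a 1920x1080 stream this gives an aligned frame of 1920x1152, so 2211840
bytes of luma and 1105920 bytes of chroma before compression.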

-- 
Hsia-Jun(Randy) Li