<div dir="ltr">Hi,<div>the problem with the drm_fourcc.h header is that it is complicated, still needs interpretation, and lacks some commonly used formats (e.g. YUVA4444p).</div><div>Also, it doesn't address the gamma value (linear, sRGB, bt709), the yuv subspace (e.g. Y'CbCr vs bt709), the yuv range (16 - 235 for Y and 16 - 240 for U / V = clamped / mpeg range; 0 - 255 = unclamped / full / jpeg range) or the uv sampling position (e.g. center, top_left).</div><div><br></div><div>I can see that having some common definitions would be useful for exchanging data between applications. E.g. if my app gets a frame buffer with the metadata XDG_VIDEO_PALETTE_RGB24, XDG_VIDEO_GAMMA_LINEAR,</div><div>then I know unambiguously that this is packed RGB 8:8:8 (so forget little / big endian) and that the values are encoded with linear (not sRGB) gamma.</div><div><br></div><div>If you want to be more specific with palettes, then you could do so, but it might require defining metadata structs.</div><div><br></div><div>For example, for my own standard (Weed effects) I have:</div><div><br></div><div>// max number of channels in a palette<br>#ifndef WEED_MAXPCHANS<br>#define WEED_MAXPCHANS 8<br>#endif<br><br>// max number of planes in a palette<br>#ifndef WEED_MAXPPLANES<br>#define WEED_MAXPPLANES 4<br>#endif<br><br>#define WEED_VCHAN_end          0<br><br>#define WEED_VCHAN_red          1<br>#define WEED_VCHAN_green        2<br>#define WEED_VCHAN_blue         3<br><br>#define WEED_VCHAN_Y                    512<br>#define WEED_VCHAN_U                    
513<br>#define WEED_VCHAN_V                    514<br><br>#define WEED_VCHAN_alpha                1024<br><br>#define WEED_VCHAN_FIRST_CUSTOM         8192<br><br>#define WEED_VCHAN_DESC_PLANAR          (1 << 0) ///< planar type<br>#define WEED_VCHAN_DESC_FP              (1 << 1) ///< floating point type<br>#define WEED_VCHAN_DESC_BE              (1 << 2) ///< pixel data is big endian (within each component)<br><br>#define WEED_VCHAN_DESC_FIRST_CUSTOM    (1 << 16)<br><br>typedef struct {<br>  uint16_t ext_ref;  ///< link to an enumerated type<br>  uint16_t chantype[WEED_MAXPCHANS]; ///< e.g. {WEED_VCHAN_U, WEED_VCHAN_Y, WEED_VCHAN_V, WEED_VCHAN_Y}<br>  uint32_t flags; ///< bitmap of flags, e.g. 
WEED_VCHAN_DESC_FP | WEED_VCHAN_DESC_PLANAR<br>  uint8_t  hsub[WEED_MAXPCHANS];  ///< horiz. subsampling, 0 or 1 means no subsampling, 2 means halved etc. (planar only)</div><div>  uint8_t  vsub[WEED_MAXPCHANS];  ///< vert subsampling<br></div>  uint8_t npixels; ///< npixels per macropixel: {0, 1} == 1<br>  uint8_t bitsize[WEED_MAXPCHANS]; ///< 8 if not specified<br>  void *extended; ///< pointer to app defined data<br>} weed_macropixel_t;<br><div><br></div><div>Then I can describe all my palettes like:</div><div>advp[0] = (weed_macropixel_t) {<br>    WEED_PALETTE_RGB24,<br>    {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue}<br>  };<br><br></div><div> advp[6] = (weed_macropixel_t) {<br>    WEED_PALETTE_RGBAFLOAT,<br>    {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue, WEED_VCHAN_alpha},<br>    WEED_VCHAN_DESC_FP, {0}, {0}, 1, {32, 32, 32, 32}<br>  };<br><br></div><div> advp[7] = (weed_macropixel_t) {<br>    WEED_PALETTE_YUV420P,<br>    {WEED_VCHAN_Y, WEED_VCHAN_U, WEED_VCHAN_V},<br>    WEED_VCHAN_DESC_PLANAR, {1, 2, 2}, {1, 2, 2}<br>  };<br></div><div><br></div><div>IMO this is way superior to 
fourcc and if you were to supplement this with gamma, interlace, yuv subspace, yuv clamping and yuv sampling, then you would have a very comprehensive definition for any type of video frame.</div><div><br></div><div>G.</div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 4 Apr 2024 at 08:52, Pekka Paalanen <<a href="mailto:pekka.paalanen@haloniitty.fi">pekka.paalanen@haloniitty.fi</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Wed, 3 Apr 2024 21:51:39 -0300<br>
salsaman <<a href="mailto:salsaman@gmail.com" target="_blank">salsaman@gmail.com</a>> wrote:<br>
<br>
> Regarding my expertise, I was one of the developers most involved in<br>
> developing the "livido" standard which was one of the main topics of the<br>
> Piksel Festivals held in Bergen, Norway.<br>
> In the early days (2004 - 2006) the focus of the annual event was precisely<br>
> the formulation of free / open standards, in this case for video effects.<br>
> Other contributors included:<br>
>  Niels Elburg, Denis "Jaromil" Rojo, Tom Schouten, Andraz Tori, Kentaro<br>
> Fukuchi and Carlo Prelz.<br>
> I've also been involved with and put forward proposals for common command /<br>
> query / reply actions (Open Media Control). To the extent that these<br>
> proposals have not gained traction, I don't ascribe this to a failing in<br>
> the proposals, but rather to a lack of developer awareness.<br>
> <br>
> Now regarding specific areas, I went back and reviewed some of the<br>
> available material at  <a href="https://www.freedesktop.org/wiki/Specifications/" rel="noreferrer" target="_blank">https://www.freedesktop.org/wiki/Specifications/</a><br>
> <br>
> free media player specifications<br>
> <a href="https://www.freedesktop.org/wiki/Specifications/free-media-player-specs/" rel="noreferrer" target="_blank">https://www.freedesktop.org/wiki/Specifications/free-media-player-specs/</a><br>
> metadata standards for things like comments and ratings - talks mainly<br>
> about audio but describes video files also<br>
> <br>
> I am not a big fan of dbus, but this looks fine, it could be used for video<br>
> players. I'd be happier if it were a bit more abstracted and not tied to a<br>
> specific implementation (dbus). I could suggest some enhancements but I<br>
> guess this is a dbus thing and not an xdg thing.<br>
<br>
Thanks, these sound like they do not need to involve Wayland in any<br>
way, so they are not on my plate.<br>
<br>
> IMO what would be useful would be to define a common set of constants, most<br>
> specifically related to frame pixel formats<br>
> The 2 most common in use are fourCC and avformat<br>
<br>
Wayland protocol extensions and I suppose also Wayland compositors<br>
internally standardise on drm_fourcc.h formats. Their authoritative<br>
definitions are in<br>
<a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/drm/drm_fourcc.h" rel="noreferrer" target="_blank">https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/drm/drm_fourcc.h</a><br>
and they are not intentionally mirroring any other fourcc coding.<br>
<br>
These are strictly pixel formats, and do not define anything about<br>
colorimetry, interlacing, field order, frame rate, quantization range,<br>
or anything else.<br>
<br>
> Consider a frame in UYVY format<br>
> <br>
> fourCC values:<br>
> <br>
>  #define MK_FOURCC(a, b, c, d) (((uint32_t)(a)) | (((uint32_t)(b)) << 8) \<br>
>                                | (((uint32_t)(c)) << 16) | (((uint32_t)(d)) << 24))<br>
> <br>
> MK_FOURCC('U', 'Y', 'V', 'Y')<br>
> but also<br>
> MK_FOURCC('I', 'U', 'Y', 'V')<br>
> the same but with interlacing<br>
> MK_FOURCC('H', 'D', 'Y', 'C')<br>
> same but bt709 (hdtv) encoding<br>
> <br>
> so this requires interpretation by sender / receiver - a simpler way could<br>
> be with constants<br>
> <br>
> - probably the nearest we have are ffmpeg / libav definitions, but this is<br>
> the wrong way around, a lib shouldn't define a global standard, the<br>
> standard should come first and the lib should align to that.<br>
> <br>
> We have AV_PIX_FMT_UYVY422 which was formerly PIX_FMT_UYVY422<br>
> and AVCOL_TRC_BT709, which is actually the gamma transfer function. There<br>
> is no equivalent bt709 constant for bt709 yuv / rgb, instead this exists as<br>
> a matrix.<br>
> <br>
> Now consider how much easier it would be to share data if we had the<br>
> following constants enumerated:<br>
> <br>
> *XDG_VIDEO_PALETTE_UYVY*<br>
> *XDG_VIDEO_INTERLACE_TOP_FIRST*<br>
> *XDG_VIDEO_YUV_SUBSPACE_BT709*<br>
> *XDG_VIDEO_GAMMA_SRGB*<br>
> <br>
> (this is an invented example, not intended to be a real example).<br>
> <br>
> There is a bit more to it but that should be enough to give a general idea.<br>
<br>
Where should this be used?<br>
<br>
<br>
Thanks,<br>
pq<br>
</blockquote></div>
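<div>A minimal sketch of how the supplementary constants mentioned in the reply above (gamma, interlace, yuv subspace, yuv clamping, yuv sampling) might sit next to a palette constant. Every XDG_VIDEO_* name, the xdg_video_frame_desc_t struct and both helper functions are hypothetical, invented here for illustration only; they are not part of Weed or of any existing specification.</div>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All identifiers below are hypothetical illustrations, not a real API. */

typedef enum {
  XDG_VIDEO_PALETTE_NONE = 0,
  XDG_VIDEO_PALETTE_RGB24,
  XDG_VIDEO_PALETTE_UYVY,
  XDG_VIDEO_PALETTE_YUV420P
} xdg_video_palette_t;

typedef enum {
  XDG_VIDEO_GAMMA_UNKNOWN = 0,
  XDG_VIDEO_GAMMA_LINEAR,
  XDG_VIDEO_GAMMA_SRGB,
  XDG_VIDEO_GAMMA_BT709
} xdg_video_gamma_t;

typedef enum {
  XDG_VIDEO_INTERLACE_NONE = 0,
  XDG_VIDEO_INTERLACE_TOP_FIRST,
  XDG_VIDEO_INTERLACE_BOTTOM_FIRST
} xdg_video_interlace_t;

typedef enum {
  XDG_VIDEO_YUV_SUBSPACE_NONE = 0,
  XDG_VIDEO_YUV_SUBSPACE_YCBCR,
  XDG_VIDEO_YUV_SUBSPACE_BT709
} xdg_video_yuv_subspace_t;

typedef enum {
  XDG_VIDEO_YUV_CLAMPING_NONE = 0,
  XDG_VIDEO_YUV_CLAMPING_CLAMPED,   /* Y 16 - 235, U / V 16 - 240 */
  XDG_VIDEO_YUV_CLAMPING_UNCLAMPED  /* 0 - 255 */
} xdg_video_yuv_clamping_t;

typedef enum {
  XDG_VIDEO_YUV_SAMPLING_NONE = 0,
  XDG_VIDEO_YUV_SAMPLING_CENTER,
  XDG_VIDEO_YUV_SAMPLING_TOP_LEFT
} xdg_video_yuv_sampling_t;

/* One record carrying everything a receiver needs besides the pixel data. */
typedef struct {
  xdg_video_palette_t      palette;
  xdg_video_gamma_t        gamma;
  xdg_video_interlace_t    interlace;
  xdg_video_yuv_subspace_t subspace;
  xdg_video_yuv_clamping_t clamping;
  xdg_video_yuv_sampling_t sampling;
} xdg_video_frame_desc_t;

/* True for the yuv palettes in this sketch. */
static inline bool xdg_video_palette_is_yuv(xdg_video_palette_t pal) {
  return pal == XDG_VIDEO_PALETTE_UYVY || pal == XDG_VIDEO_PALETTE_YUV420P;
}

/* A description is consistent when palette and gamma are set, and the
 * yuv-only fields are all set for yuv palettes and all unset for RGB. */
static inline bool
xdg_video_frame_desc_is_consistent(const xdg_video_frame_desc_t *desc) {
  assert(desc != NULL);
  if (desc->palette == XDG_VIDEO_PALETTE_NONE ||
      desc->gamma == XDG_VIDEO_GAMMA_UNKNOWN)
    return false;
  bool all_set = desc->subspace != XDG_VIDEO_YUV_SUBSPACE_NONE &&
                 desc->clamping != XDG_VIDEO_YUV_CLAMPING_NONE &&
                 desc->sampling != XDG_VIDEO_YUV_SAMPLING_NONE;
  bool all_unset = desc->subspace == XDG_VIDEO_YUV_SUBSPACE_NONE &&
                   desc->clamping == XDG_VIDEO_YUV_CLAMPING_NONE &&
                   desc->sampling == XDG_VIDEO_YUV_SAMPLING_NONE;
  return xdg_video_palette_is_yuv(desc->palette) ? all_set : all_unset;
}
```

<div>With something along these lines, a sender would fill in one xdg_video_frame_desc_t per frame buffer and the receiver could reject an incomplete description up front instead of guessing at the missing fields.</div>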