Video standards

Thu Apr 4 20:13:40 UTC 2024

Hi,
the problem with the drm.h header is that it is complicated, still needs
interpretation, and lacks some commonly used formats (e.g. YUVA4444p).
It also doesn't address the gamma value (linear, sRGB, bt709), the yuv
subspace (e.g. Y'CbCr vs bt709), the yuv range (16 - 235 for Y and 16 - 240
for UV = clamped / mpeg; 0 - 255 = unclamped / full / jpeg range) or the uv
sampling position (e.g. center, top_left).
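
To illustrate why the range has to travel with the frame, here is a rough
sketch of expanding 8 bit clamped samples to full range. The function names
are made up for this mail, and the rounding is just one reasonable choice;
a receiver that guesses the range wrong would apply this when it shouldn't
(or skip it when it should).

```c
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8 bit sample. */
uint8_t sketch_clamp_u8(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Expand luma from clamped (16..235) to full (0..255) range.
 * 219 = 235 - 16; adding 109 (about half of 219) rounds to nearest. */
uint8_t sketch_luma_clamped_to_full(uint8_t y) {
  int v = (((int)y - 16) * 255 + 109) / 219;
  return sketch_clamp_u8(v);
}

/* Expand chroma from clamped (16..240) to full (0..255) range.
 * 224 = 240 - 16; adding 112 (half of 224) rounds to nearest,
 * which keeps the neutral value 128 at 128. */
uint8_t sketch_chroma_clamped_to_full(uint8_t c) {
  int v = (((int)c - 16) * 255 + 112) / 224;
  return sketch_clamp_u8(v);
}
```

Values outside the nominal clamped range (below 16, above 235 / 240) are
simply saturated to 0 or 255 here.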

I can see that having some common definitions would be useful for
exchanging data between applications. E.g. if my app gets a frame buffer
with the metadata XDG_VIDEO_PALETTE_RGB24, XDG_VIDEO_GAMMA_LINEAR,
then I know unambiguously that this is packed RGB 8:8:8 (so forget little /
big endian) and that the values are encoded with linear (not sRGB) gamma.
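
As a rough sketch of what the receiving side could look like: the
XDG_VIDEO_* names are the ones floated in this thread, but the numeric
values, the struct and both helper functions are invented here purely for
illustration. It also assumes RGB24 means three interleaved bytes per pixel
and UYVY means a 4 byte macropixel covering 2 pixels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Names follow the proposal; the values are placeholders for this sketch. */
typedef enum {
  XDG_VIDEO_PALETTE_NONE = 0,
  XDG_VIDEO_PALETTE_RGB24,
  XDG_VIDEO_PALETTE_UYVY
} xdg_video_palette_t;

typedef enum {
  XDG_VIDEO_GAMMA_UNKNOWN = 0,
  XDG_VIDEO_GAMMA_LINEAR,
  XDG_VIDEO_GAMMA_SRGB
} xdg_video_gamma_t;

/* Hypothetical header passed alongside the pixel buffer. */
typedef struct {
  xdg_video_palette_t palette;
  xdg_video_gamma_t gamma;
  uint32_t width, height;
} xdg_video_frame_info_t;

/* Minimum bytes per row for the palettes above, 0 if the palette is unknown. */
size_t xdg_video_min_rowstride(const xdg_video_frame_info_t *info) {
  assert(info != NULL);
  switch (info->palette) {
  case XDG_VIDEO_PALETTE_RGB24:
    return (size_t)info->width * 3;           /* 3 bytes per pixel */
  case XDG_VIDEO_PALETTE_UYVY:
    return ((size_t)info->width + 1) / 2 * 4; /* 4 bytes per 2 pixels */
  default:
    return 0;
  }
}

/* True if the samples must be linearised before linear-light processing. */
bool xdg_video_needs_linearising(const xdg_video_frame_info_t *info) {
  assert(info != NULL);
  return info->gamma != XDG_VIDEO_GAMMA_LINEAR;
}
```

The point is that the receiver never has to guess: everything it needs to
interpret the buffer is an enumerated value.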

If you want to be more specific with palettes, then you could do so, but it
might require defining metadata structs.

For example for my own standard (Weed effects) I have:

// max number of channels in a palette

#ifndef WEED_MAXPCHANS
#define WEED_MAXPCHANS 8
#endif

// max number of planes in a palette

#ifndef WEED_MAXPPLANES
#define WEED_MAXPPLANES 4
#endif

#define WEED_VCHAN_end          0

#define WEED_VCHAN_red          1
#define WEED_VCHAN_green        2
#define WEED_VCHAN_blue         3

#define WEED_VCHAN_Y                    512
#define WEED_VCHAN_U                    513
#define WEED_VCHAN_V                    514

#define WEED_VCHAN_alpha                1024

#define WEED_VCHAN_FIRST_CUSTOM         8192

#define WEED_VCHAN_DESC_PLANAR          (1 << 0) ///< planar type

#define WEED_VCHAN_DESC_FP              (1 << 1) ///< floating point type

#define WEED_VCHAN_DESC_BE              (1 << 2) ///< pixel data is big endian (within each component)

#define WEED_VCHAN_DESC_FIRST_CUSTOM    (1 << 16)

typedef struct {
  uint16_t ext_ref;  ///< link to an enumerated type

  /// e.g. {WEED_VCHAN_U, WEED_VCHAN_Y, WEED_VCHAN_V, WEED_VCHAN_Y}
  uint16_t chantype[WEED_MAXPCHANS];

  /// bitmap of flags, e.g. WEED_VCHAN_DESC_FP | WEED_VCHAN_DESC_PLANAR
  uint32_t flags;

  /// horiz. subsampling: 0 or 1 means no subsampling, 2 means halved etc.
  /// (planar only)
  uint8_t  hsub[WEED_MAXPCHANS];
  uint8_t  vsub[WEED_MAXPCHANS];  ///< vert. subsampling

  uint8_t npixels; ///< npixels per macropixel: {0, 1} == 1

  uint8_t bitsize[WEED_MAXPCHANS]; ///< 8 if not specified
  void *extended; ///< pointer to app defined data

} weed_macropixel_t;

Then I can describe all my palettes like:
advp[0] = (weed_macropixel_t) {
    WEED_PALETTE_RGB24,
    {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue}
};

advp[6] = (weed_macropixel_t) {
    WEED_PALETTE_RGBAFLOAT,
    {WEED_VCHAN_red, WEED_VCHAN_green, WEED_VCHAN_blue, WEED_VCHAN_alpha},
    WEED_VCHAN_DESC_FP, {0}, {0}, 1, {32, 32, 32, 32}
};

advp[7] = (weed_macropixel_t) {
    WEED_PALETTE_YUV420P,
    {WEED_VCHAN_Y, WEED_VCHAN_U, WEED_VCHAN_V},
    WEED_VCHAN_DESC_PLANAR, {1, 2, 2}, {1, 2, 2}
};
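
To give an idea of how such a descriptor gets consumed, here is a small
sketch that derives per-plane dimensions from the subsampling arrays. It
uses a trimmed-down stand-in for the struct (only hsub / vsub) so that it
compiles on its own; the sketch_* names are made up, and rounding odd sizes
up is my assumption here rather than something the descriptor itself
dictates.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAXPCHANS 8

/* Stand-in for weed_macropixel_t, keeping only the subsampling fields. */
typedef struct {
  uint8_t hsub[SKETCH_MAXPCHANS];
  uint8_t vsub[SKETCH_MAXPCHANS];
} sketch_subsampling_t;

/* Divide a full-size dimension by a subsampling factor.
 * 0 and 1 both mean "no subsampling"; odd sizes are rounded up. */
unsigned sketch_plane_dim(unsigned full, uint8_t sub) {
  unsigned div = sub ? sub : 1;
  return (full + div - 1) / div;
}

/* Compute the width and height of one plane of a planar frame. */
void sketch_plane_size(const sketch_subsampling_t *desc, int plane,
                       unsigned width, unsigned height,
                       unsigned *pwidth, unsigned *pheight) {
  assert(plane >= 0 && plane < SKETCH_MAXPCHANS);
  *pwidth = sketch_plane_dim(width, desc->hsub[plane]);
  *pheight = sketch_plane_dim(height, desc->vsub[plane]);
}
```

With the YUV420P values above ({1, 2, 2} for both arrays), a 641 x 480
frame gives a 641 x 480 Y plane and 321 x 240 U and V planes, with no
per-palette special casing in the code.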

IMO this is way superior to fourcc and if you were to supplement this with
gamma, interlace, yuv subspace, yuv clamping and yuv sampling, then you
would have a very comprehensive definition for any type of video frame.

G.

On Thu, 4 Apr 2024 at 08:52, Pekka Paalanen <pekka.paalanen at haloniitty.fi>
wrote:

> On Wed, 3 Apr 2024 21:51:39 -0300
> salsaman <salsaman at gmail.com> wrote:
>
> > Regarding my expertise, I was one of the developers most involved in
> > developing the "livido" standard which was one of the main topics of the
> > Piksel Festivals held in Bergen, Norway.
> > In the early days (2004 - 2006) the focus of the annual event was precisely
> > the formulation of free / open standards, in this case for video effects.
> > Other contributors included:
> >  Niels Elburg, Denis "Jaromil" Rojo, Tom Schouten, Andraz Tori, Kentaro
> > Fukuchi and Carlo Prelz.
> > I've also been involved with and put forward proposals for common command /
> > query / reply actions (Open Media Control). To the extent that these
> > proposals have not gained traction, I don't ascribe this to a failing in
> > the proposals, but rather to a lack of developer awareness.
> >
> > Now regarding specific areas, I went back and reviewed some of the
> > available material at  https://www.freedesktop.org/wiki/Specifications/
> >
> > free media player specifications
> > https://www.freedesktop.org/wiki/Specifications/free-media-player-specs/
> > metadata standards for things like comments and ratings - talks mainly
> > about audio but describes video files also
> >
> > I am not a big fan of dbus, but this looks fine, it could be used for video
> > players. I'd be happier if it were a bit more abstracted and not tied to a
> > specific implementation (dbus). I could suggest some enhancements but I
> > guess this is a dbus thing and not an xdg thing.
>
> Thanks, these sound like they do not need to involve Wayland in any
> way, so they are not on my plate.
>
> > IMO what would be useful would be to define a common set of constants, most
> > specifically related to frame pixel formats.
> > The 2 most common in use are fourCC and avformat
>
> Wayland protocol extensions and I suppose also Wayland compositors
> internally standardise on drm_fourcc.h formats. Their authoritative
> definitions are in
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/drm/drm_fourcc.h
> and they are not intentionally mirroring any other fourcc coding.
>
> These are strictly pixel formats, and do not define anything about
> colorimetry, interlacing, field order, frame rate, quantization range,
> or anything else.
>
> > Consider a frame in UYVY format
> >
> > fourCC values:
> >
> >  #define MK_FOURCC(a, b, c, d) (((uint32_t)a) | (((uint32_t)b) << 8) \
> >                                | (((uint32_t)c) << 16) | (((uint32_t)d) << 24))
> >
> > MK_FOURCC('U', 'Y', 'V', 'Y')
> > but also
> > MK_FOURCC('I', 'U', 'Y', 'B')
> > the same but with interlacing
> > MK_FOURCC('H', 'D', 'Y', 'C')
> > same but bt709 (hdtv) encoding
> >
> > so this requires interpretation by sender / receiver - a simpler way could
> > be with constants
> >
> > - probably the nearest we have are ffmpeg / libav definitions, but this is
> > the wrong way around, a lib shouldn't define a global standard, the
> > standard should come first and the lib should align to that.
> >
> > We have AV_PIX_FMT_UYVY422 which was formerly PIX_FMT_UYVY422
> > and AVCOL_TRC_BT709, which is actually the gamma transfer function. There
> > is no equivalent bt709 constant for bt709 yuv / rgb, instead this exists as
> > a matrix.
> >
> > Now consider how much easier it would be to share data if we had the
> > following constants enumerated:
> >
> > *XDG_VIDEO_PALETTE_UYVY*
> > *XDG_VIDEO_INTERLACE_TOP_FIRST*
> > *XDG_VIDEO_YUV_SUBSPACE_BT709*
> > *XDG_VIDEO_GAMMA_SRGB*
> >
> > (this is an invented example, not intended to be a real example).
> >
> > There is a bit more to it but that should be enough to give a general
> > idea.
>
> Where should this be used?
>
>
> Thanks,
> pq
>