Understanding my pipeline - unknown avdec_h264 output format

Nicolas Dufresne nicolas at ndufresne.ca
Thu Dec 19 01:58:21 UTC 2019


On Wednesday, 18 December 2019 at 10:40 -0600, andis wrote:
> Hi all, 
> I have set up a GStreamer Pipeline in ROS, C++ as follows: 
> Sending side: 
> - appsrc (is a PTGrey Camera that emits YUV422packed raw images that I push
> to the pipeline)
> - capsfilter
> - videoconvert
> - x264enc
> - rtph264pay
> - udpsink
> 
> Receiving side: 
> - udpsrc
> - rtph264depay
> - avdec_h264
> - videoconvert
> - xvimagesink
> 
> This pipeline works well, and I can have my video feed being displayed by
> the xvimagesink. 
> 
> However, I actually want have the raw/decoded image frames at the end of the
> receiving side again. My camera is publishing at a resolution of 1936x1464
> px. In the YUV422packed color space (that is the GStreamer UYVY video
> format) I have 2 Byte/px. Thus, I am pushing a buffer of size 5,668,608
> Bytes into the pipeline. 
> 
> Now I attached identity modules to my receiving side, and I get buffers of
> size 6,172,672 Bytes (larger than the raw image) after the avdec_h264 and of
> size 4,251,456 Bytes (smaller than the raw image) after the videoconvert
> before the xvimagesink. I assume there is some metadata in the buffers. How
> can i receive the raw, decoded image frames (of size 5,668,608 in the UYVY
> video format) from these buffers again? 

H.264 codecs have the particularity that they can only encode in
macroblocks of 16x16 pixels; anything beyond the visible size is
padded out and cropped/ignored at display time. So 1936x1464 will be
encoded as 1936x1472. While the extra lines are ignored at rendering,
the buffer size is larger. On top of that, some formats require extra
padding, and GStreamer may add its own sauce to that.
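In other words, the coded height is the display height rounded up to
the next multiple of 16. A minimal sketch of that arithmetic (plain C,
the helper name is mine):

  /* Round a dimension up to the next 16x16 macroblock boundary. */
  static int
  mb_align (int dim)
  {
    return (dim + 15) & ~15;
  }

  /* mb_align (1936) == 1936, mb_align (1464) == 1472 */

Only the height changes here, since 1936 is already a multiple of 16.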

So, your camera produces UYVY, a 4:2:2 subsampled YUV format. That
means you have 16 bits per pixel, so 1936 × 1464 × 2 = 5,668,608
bytes.
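Spelled out as throwaway C:

  /* UYVY is packed 4:2:2, i.e. 2 bytes per pixel. */
  gsize uyvy_size = (gsize) 1936 * 1464 * 2;  /* = 5668608 bytes */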

This gets encoded in 4:2:2, transferred, and finally decoded in
4:2:2. But avdec_h264 decides to decode into a format called Y42B,
which has three separate planes for Y, U and V. Looking into the
GstVideoMeta after the decoder, we can see:

(gdb) p *gst_buffer_get_video_meta(buffer)
. . . 
  format = GST_VIDEO_FORMAT_Y42B,
. . . 
  width = 1936,
  height = 1464,
  n_planes = 3,
  offset = {0, 3020800, 4531200, 0},
  stride = {2048, 1024, 1024, 0},
. . . 
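Note that offset[1] = 2048 × 1475 = 3,020,800 and offset[2] =
3,020,800 + 1024 × 1475 = 4,531,200, which is where the padded height
of 1475 below comes from. If you want to read this from code rather
than gdb, attach a buffer probe (or reuse your identity elements) and
query the meta. A rough sketch against the GStreamer 1.x video
library; the callback name is mine:

  #include <gst/gst.h>
  #include <gst/video/video.h>

  static GstPadProbeReturn
  print_video_meta (GstPad *pad, GstPadProbeInfo *info, gpointer user_data)
  {
    GstBuffer *buf = GST_PAD_PROBE_INFO_BUFFER (info);
    GstVideoMeta *meta = gst_buffer_get_video_meta (buf);
    guint i;

    if (meta != NULL) {
      g_print ("format=%s %ux%u n_planes=%u\n",
          gst_video_format_to_string (meta->format),
          meta->width, meta->height, meta->n_planes);
      for (i = 0; i < meta->n_planes; i++)
        g_print ("  plane %u: offset=%" G_GSIZE_FORMAT " stride=%d\n",
            i, meta->offset[i], meta->stride[i]);
    }
    return GST_PAD_PROBE_OK;
  }

Attach it to the decoder's source pad with gst_pad_add_probe (pad,
GST_PAD_PROBE_TYPE_BUFFER, print_video_meta, NULL, NULL).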

So basically your video has been padded by ffmpeg or GStreamer (I
haven't checked) to 2048x1475. I'm not sure why it does that; it might
be a bit bogus, but it's not broken.

Finally, you send this data to xvimagesink, which seems to only
support a 4:2:0 format (probably I420). That is a 12 bits per pixel
format, so it requires at least 1936 × 1464 × 1.5 = 4,251,456 bytes,
which matches your observation.
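If what you want at the end is the tightly packed UYVY you fed in, one
way (a sketch, untested against your setup) is to finish the receiving
pipeline with videoconvert ! video/x-raw,format=UYVY ! appsink, then
map each sample's buffer with gst_video_frame_map() and copy it out
row by row, which drops any stride padding:

  #include <string.h>
  #include <gst/video/video.h>

  /* Copy a UYVY buffer into a tightly packed array of width*height*2
   * bytes, regardless of how the decoder/converter padded it. */
  static gboolean
  copy_uyvy_tight (GstBuffer *buffer, GstCaps *caps, guint8 *dest)
  {
    GstVideoInfo vinfo;
    GstVideoFrame frame;
    gint y, width, height, stride;
    const guint8 *src;

    if (!gst_video_info_from_caps (&vinfo, caps))
      return FALSE;
    if (!gst_video_frame_map (&frame, &vinfo, buffer, GST_MAP_READ))
      return FALSE;

    width = GST_VIDEO_FRAME_WIDTH (&frame);
    height = GST_VIDEO_FRAME_HEIGHT (&frame);
    stride = GST_VIDEO_FRAME_PLANE_STRIDE (&frame, 0);
    src = GST_VIDEO_FRAME_PLANE_DATA (&frame, 0);

    for (y = 0; y < height; y++)
      memcpy (dest + y * width * 2, src + y * stride, width * 2);

    gst_video_frame_unmap (&frame);
    return TRUE;
  }

For 1936x1464 that hands you back exactly 1936 × 1464 × 2 = 5,668,608
bytes.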

> 
> Thanks, 
> Andreas


