[gst-devel] Question on codec_data for H.264 when ES has AnnexB package
felipe.contreras at gmail.com
Thu Dec 24 18:25:31 CET 2009
On Wed, Dec 16, 2009 at 9:38 AM, Chen, Weian <weian.chen at intel.com> wrote:
> I can get the H.264 ES stream (AnnexB Package) from HW encoder in my encoder
> element, and I also can get sps/pps NAL unit from the ES, but when I fill
> the codec_data (decoder configuration record) according to spec, then write
> the stream into mp4 file format, but the dump *.mp4 is not playable.
> And the difference between my HW encoder and x264enc is the ES output from
> HW is AnnexB package and x264enc isn’t.
> Could anybody here tell me the reason? Can AnnexB stream have codec_data?
No. AnnexB is also named "bytestream" format, and it doesn't contain
codec_data. The codec_data part is defined in MPEG-4 part 15, and the
stream format is different from Annex B. AFAIK if you are going to
save to an MP4 container you need to specify bytestream=false (x264enc
has that option), however, this should have been introduced in the
caps so that it's negotiated automatically. Perhaps for GStreamer
I wrote this document to clarify the different H.264 formats. I hope
you find it useful.
This document tries to summarize high-level formats of H.264 to define the
interfaces between encoders, decoders, muxers, demuxers, payloaders, and
The H.264 standard by ITU-T defines only the byte stream format (Annex B),
which can be used standalone, or in dummy containers (AVI) in order to avoid
The MPEG-4 part 15 specification (ISO/IEC 14496-15), in section 5.2.4, defines
the codec-data format (decoder configuration information). This is mostly used
by smart containers (MPEG-4, Matroska).
The codec-data includes information such as profile, level, SPS, and PPS.
It also specifies the size of the NAL unit length field, which is prefixed to
each NAL unit. For example, if the size is 4 bytes, then a 100-byte NAL unit
would be prefixed with 0x00000064.
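
To make the layout concrete, here is a minimal Python sketch of how a
codec-data blob and a length-prefixed NAL unit could be assembled. The
function names are hypothetical (not GStreamer API), it assumes exactly one
SPS and one PPS that still carry their one-byte NAL header, and it does not
write the extra fields that later editions of the specification add for the
High profiles.

```python
import struct


def build_codec_data(sps: bytes, pps: bytes, length_size: int = 4) -> bytes:
    """Assemble a minimal decoder configuration record (one SPS, one PPS)."""
    if length_size not in (1, 2, 4):
        raise ValueError("NAL length field must be 1, 2 or 4 bytes")
    if len(sps) < 4:
        raise ValueError("SPS too short to carry profile and level")
    return b"".join([
        bytes([1]),                         # configurationVersion
        sps[1:4],                           # profile, compatibility, level
        bytes([0xFC | (length_size - 1)]),  # reserved bits + lengthSizeMinusOne
        bytes([0xE0 | 1]),                  # reserved bits + number of SPS
        struct.pack(">H", len(sps)), sps,   # 16-bit SPS length, then SPS
        bytes([1]),                         # number of PPS
        struct.pack(">H", len(pps)), pps,   # 16-bit PPS length, then PPS
    ])


def length_prefix(nal: bytes, length_size: int = 4) -> bytes:
    """Prefix one NAL unit with its big-endian length."""
    return len(nal).to_bytes(length_size, "big") + nal
```

With a 4-byte length field, length_prefix() on a 100-byte NAL unit yields the
0x00000064 prefix mentioned above.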
The RTP Payload Format for H.264 Video specification (RFC3984) in section 1.1
makes perfectly clear that neither the byte-stream nor the file-format is
relevant for RTP; a payloader would only care about NAL units.
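
As an illustration of "only NAL units", the following Python sketch splits an
Annex B byte stream into bare NAL units by scanning for start codes. The
helper names are hypothetical. It relies on two properties of the byte-stream
format: a start code cannot appear inside a NAL unit (emulation prevention),
and a NAL unit never ends in a zero byte, so trailing zeros can be dropped.

```python
START_CODE = b"\x00\x00\x01"


def split_annexb(stream: bytes) -> list:
    """Return the NAL units of an Annex B stream, without start codes."""
    nals = []
    pos = stream.find(START_CODE)
    while pos != -1:
        start = pos + len(START_CODE)
        nxt = stream.find(START_CODE, start)
        end = len(stream) if nxt == -1 else nxt
        # Zeros before the next start code are padding or the leading byte
        # of a four-byte start code, never part of the NAL unit itself.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        pos = nxt
    return nals


def nal_type(nal: bytes) -> int:
    """Low five bits of the NAL header, e.g. 7 for SPS and 8 for PPS."""
    return nal[0] & 0x1F
```

A payloader built this way would look at nal_type() to pick out the SPS (7)
and PPS (8) and would never need to know which framing the encoder used.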
Now that all the formats have been defined, it's only sensible to use these
for interfacing between different components. Let's keep in mind that each
component should be logically independent: do one job, and one job only.
A demuxer should be codec-agnostic; therefore, it should output whatever is
stored, exactly as it is.
A decoder should be able to receive whatever the demuxer outputs: both
byte-stream and AVC file-format. If in file-format, then receive the
codec-data in a separate, appropriately identified buffer.
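
One way a decoder front end could accept both formats is to normalize AVC
file-format buffers to byte-stream before decoding. The Python sketch below
is only an assumption about how that might look, with hypothetical function
names: it reads the NAL length field size from the codec-data (the two low
bits of the fifth byte, plus one) and rewrites each length prefix as a start
code.

```python
ANNEXB_START = b"\x00\x00\x00\x01"


def nal_length_size(codec_data: bytes) -> int:
    """Size in bytes of the NAL length field declared by the codec-data."""
    if len(codec_data) < 5:
        raise ValueError("codec-data too short")
    return (codec_data[4] & 0x03) + 1


def avc_to_annexb(buffer: bytes, length_size: int) -> bytes:
    """Replace each length prefix in an AVC buffer with a start code."""
    out = bytearray()
    pos = 0
    while pos < len(buffer):
        if pos + length_size > len(buffer):
            raise ValueError("truncated NAL length field")
        size = int.from_bytes(buffer[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(buffer):
            raise ValueError("truncated NAL unit")
        out += ANNEXB_START + buffer[pos:pos + size]
        pos += size
    return bytes(out)
```

The SPS and PPS stored in the codec-data would still have to be sent to the
decoder ahead of the first frame; that step is left out here.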
A muxer should mirror the demuxer; store whatever it receives.
An encoder should mirror the decoder; provide decodable frames either in
byte-stream, or AVC file-format.
A payloader should be able to receive whatever the encoder produces. Supporting
one of the two formats might be enough, but ideally both should be supported.
A depayloader should produce decodable frames, either format should work.
codec-specific configuration data, also known as 'extra-data' and
'CODECCONFIG' in OpenMAX IL.