Questions regarding a new parser element based on GstBaseParse

Wed Jul 6 06:02:30 UTC 2016

On Tue, 5 Jul 2016, at 11:10 PM, Carlos Rafael Giani wrote:
> Hello,
> 
> I want to add a parser for a format and use GstBaseParse as base class. 
> However, this format has some peculiarities:
> 
> 1) Audio data comes in 2048-byte blocks. No metadata between blocks, but 
> one block equals the data for one channel. So, if this is stereo 
> content, then I have to read 2*2048 bytes in order to output properly 
> interleaved data. When I do TIME->BYTES conversions in the convert() 
> vfunc, I plan on first doing the time->bytes conversion as usual (bytes 
> = time * (sample_rate * bytes_per_sample * num_channels / GST_SECOND)), 
> and then round down the result to an aligned value. So, with the stereo 
> example from earlier, if for example the conversion yields a byte offset 
> value of 6511, I would round it down to 4096, to ensure seeking does not 
> end in the middle of a block. Does this make sense, or is there a better 
> way?

The convert vfunc is the wrong place to encode this logic.

I'm not sure how we handle seeking into the middle of a frame. I guess
you could (based on the offset or byte position) return DROP and the
appropriate skip length.
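
Wherever it ends up living, the arithmetic from your question (convert,
then round down to a whole group of per-channel blocks) could look
roughly like the sketch below. It is plain C with no GStreamer
dependency: NSEC_PER_SEC stands in for GST_SECOND, the helper names are
made up for illustration, and I'm assuming the byte offset is relative
to the start of the audio data (i.e. after the headers).

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per second; same value as GStreamer's GST_SECOND. */
#define NSEC_PER_SEC UINT64_C(1000000000)

/* Size of one single-channel block, as described in the question. */
#define BLOCK_SIZE UINT64_C(2048)

/* Convert a timestamp in nanoseconds to a raw (unaligned) byte offset.
 * Plain 64-bit arithmetic can overflow for very long durations; inside
 * an element, gst_util_uint64_scale() would be the safer choice. */
static uint64_t
time_to_bytes (uint64_t time_ns, uint64_t sample_rate,
    uint64_t bytes_per_sample, uint64_t num_channels)
{
  uint64_t byte_rate = sample_rate * bytes_per_sample * num_channels;

  return time_ns * byte_rate / NSEC_PER_SEC;
}

/* Round a byte offset down to the start of the group of num_channels
 * blocks that contains it (hypothetical helper). */
static uint64_t
align_to_block_group (uint64_t offset, uint64_t num_channels)
{
  uint64_t group_size = BLOCK_SIZE * num_channels;

  assert (num_channels > 0);
  return offset - (offset % group_size);
}
```

With the stereo example from the question, align_to_block_group (6511, 2)
gives 4096.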

> 2) Each block consists of 16-bit words, which are essentially the 
> samples. I guess this means that when doing TIME->DEFAULT conversion, I 
> should still do the conversion like this: default = time * (sample_rate 
> / GST_SECOND) , correct? DEFAULT is supposed to mean "sample" or "frame" 
> with audio data, right?

Yes. From the baseparse documentation:

"""
This base class uses GST_FORMAT_DEFAULT as a meaning of frames. So,
subclass conversion routine needs to know that conversion from
GST_FORMAT_TIME to GST_FORMAT_DEFAULT must return the frame number that
can be found from the given byte position.
"""
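
For reference, a minimal sketch of the TIME <-> DEFAULT arithmetic from
the question, taking "frame" to mean one sample per channel as the
question does. NSEC_PER_SEC stands in for GST_SECOND and the function
names are hypothetical; in an element you would probably reach for
gst_util_uint64_scale() to avoid overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per second; same value as GStreamer's GST_SECOND. */
#define NSEC_PER_SEC UINT64_C(1000000000)

/* TIME -> DEFAULT: number of whole sample frames before time_ns. */
static uint64_t
time_to_frames (uint64_t time_ns, uint64_t sample_rate)
{
  return time_ns * sample_rate / NSEC_PER_SEC;
}

/* DEFAULT -> TIME: timestamp in nanoseconds of the given sample frame. */
static uint64_t
frames_to_time (uint64_t frames, uint64_t sample_rate)
{
  assert (sample_rate > 0);
  return frames * NSEC_PER_SEC / sample_rate;
}
```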

> 3) At first, I have to read the headers. I plan on setting the 
> min_frame_size to the size of the first header, read its contents, add 
> the DROP flag to the GstBaseParseFrame, and return 
> GST_BASE_PARSE_FLOW_DROPPED in handle_frame(). Is this the 
> correct/recommended way?

You don't need to set the DROP flag (that's for when you decide to drop
the frame in finish_frame()). Returning GST_BASE_PARSE_FLOW_DROPPED is
sufficient.

If I understand your format right, it should be okay for you to first
set the minimum frame size (gst_base_parse_set_min_frame_size()) to the
header size, and then to 4096 (2 * 2048 for stereo) so you subsequently
get enough data in each call to generate the interleaved data.
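
Once you have one block per channel in hand, the interleaving itself
could be sketched as below. This is plain C under assumptions that are
mine rather than anything the format guarantees: the blocks arrive
back to back in channel order, and every sample is one 16-bit word.
Samples are copied bytewise, so endianness is left untouched. The
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 2048         /* bytes per single-channel block */
#define BYTES_PER_SAMPLE 2      /* one 16-bit word per sample */
#define SAMPLES_PER_BLOCK (BLOCK_SIZE / BYTES_PER_SAMPLE)

/* src holds num_channels consecutive blocks (channel 0 first).
 * dst must have room for num_channels * BLOCK_SIZE bytes and receives
 * the same samples in interleaved order. */
static void
interleave_blocks (const uint8_t *src, uint8_t *dst, size_t num_channels)
{
  size_t i, ch;

  assert (src != NULL && dst != NULL && num_channels > 0);

  for (i = 0; i < SAMPLES_PER_BLOCK; i++) {
    for (ch = 0; ch < num_channels; ch++) {
      const uint8_t *in = src + ch * BLOCK_SIZE + i * BYTES_PER_SAMPLE;
      uint8_t *out = dst + (i * num_channels + ch) * BYTES_PER_SAMPLE;

      memcpy (out, in, BYTES_PER_SAMPLE);
    }
  }
}
```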

> 4) Is it possible that a seek query comes in while I am still scanning 
> the headers, that is, before I even finished a frame without dropping 
> it? If so, what happens then?

You could call gst_base_parse_set_syncable() with FALSE until you have
sufficient information to identify headers.

> 5) There might be trailing padding data. I therefore need to know what 
> the current position is (in BYTES) to ensure that this trailing data is 
> excluded and the EOS is sent when the end of the valid data is reached. 
> What is the proper way of doing this? gst_base_parse_set_duration() does 
> not seem appropriate for this (and also I anyway want to use it to let 
> the application know about the duration in nanoseconds).

Would you have this information on the buffer via GST_BUFFER_OFFSET()?
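
If the offset is available, trimming the trailing padding comes down to
clamping each buffer against the end of the valid data. A rough sketch
in plain C, assuming you know the end position (data_end, a name I made
up) from the headers. OFFSET_NONE stands in for GST_BUFFER_OFFSET_NONE,
which is what a buffer carries when no offset was set.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as GST_BUFFER_OFFSET_NONE: "no offset set on the buffer". */
#define OFFSET_NONE UINT64_MAX

/* Return how many bytes of a buffer lie before data_end, the byte
 * position where the valid audio data stops (assumed known from the
 * headers). A result of 0 means the buffer is entirely padding, so the
 * caller can go to EOS. */
static uint64_t
valid_bytes_in_buffer (uint64_t buf_offset, uint64_t buf_size,
    uint64_t data_end)
{
  /* Without a real offset there is nothing to clamp against. */
  assert (buf_offset != OFFSET_NONE);

  if (buf_offset >= data_end)
    return 0;
  if (buf_size > data_end - buf_offset)
    return data_end - buf_offset;
  return buf_size;
}
```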

> 6) In addition to this trailing padding data, this format allows for 
> id3v2 content ... at the end of the file. id3demux does not seem to 
> support this - but ID3v2 does (at least in version 2.4), although 
> admittedly, support for ID3v2 tags at the end of files is uncommon. I 
> guess a patch for id3demux would be the best approach here?

Indeed.

> This format never allows for streaming, that is, the media always is of 
> a known and finite length.
> 
> Is it still a good idea to use baseparse? Or should I actually use 
> GstElement in this case?

It looks like baseparse should work.

-- Arun