Wrong PTS in AppSink and duplicate frames

Fri Apr 10 22:33:17 UTC 2020

Hello everyone,

I've been struggling with this for months now. I've looked in all the 
GStreamer docs, read the whole Internet, but couldn't find a cure. I 
would really be grateful for some help. Thanks in advance.

Here is what I'm trying to accomplish (simplified):

- seek to an exact decoded raw audio frame position
- read a number of frames
- (do some processing)
- seek exactly past the previously read frames
- read some more frames
- (and so on)

What I'm expecting to see:
- A contiguous buffer seamlessly memcpy'd together from AppSink's 
buffers, that is basically a perfect copy of the decoded raw audio data. 
This works OK for WAV files.
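
To make the expectation concrete, here is a minimal sketch of the access pattern in plain C, with no GStreamer involved. `MockSource`, `mock_read` and `read_in_chunks` are hypothetical names, and the in-memory array is only a stand-in for the decoded stream (one int16_t per frame, mono). The point is the invariant: each read starts exactly where the previous one ended, so the concatenated chunks equal the reference data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the decoder: an in-memory reference signal. */
typedef struct {
    const int16_t *frames;  /* decoded frames, one int16_t each (mono) */
    uint64_t total;         /* number of frames available */
} MockSource;

/* Copies up to `count` frames starting at frame `start` into `out`.
 * Returns the number of frames actually copied (0 at end of data). */
uint64_t mock_read(const MockSource *src, uint64_t start, uint64_t count,
                   int16_t *out)
{
    assert(src != NULL && out != NULL);
    if (start >= src->total)
        return 0;
    if (count > src->total - start)
        count = src->total - start;
    memcpy(out, src->frames + start, count * sizeof(int16_t));
    return count;
}

/* Reads the whole source in chunks of `chunk` frames. Every call starts
 * at the frame right after the last one returned by the previous call.
 * `out` must have room for src->total frames. Returns frames read. */
uint64_t read_in_chunks(const MockSource *src, uint64_t chunk, int16_t *out)
{
    uint64_t pos = 0;
    uint64_t got;

    assert(chunk > 0);
    while ((got = mock_read(src, pos, chunk, out + pos)) > 0)
        pos += got;
    return pos;
}
```

If any chunk started one frame early or late, a memcmp of the assembled output against the reference would fail, which is the kind of check that passes for WAV here.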

This is what happens instead:
- For FLAC: after a seek, the first GstSample that I pull from AppSink 
is always offset by exactly 1 frame, so I get a duplicate between the 
end of the previous call to my read function and the current one
- For various other formats: most GstSamples' GstBuffer->pts values are 
off from their expected position by -1, 0, or +1 frames, so I either get 
1 duplicate, get it right, or come up 1 frame short
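
One way a one-frame discrepancy like this can arise in general is integer rounding when a frame index is converted to nanoseconds and back. The sketch below is plain C arithmetic only. The helper names are hypothetical and are not GStreamer API, and whether any particular demuxer, decoder, or macro truncates or rounds is an assumption that is not verified here.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SECOND UINT64_C(1000000000)

/* val * num / denom, truncating. The product is assumed to fit in
 * 64 bits, which holds for a few hours of audio at common rates. */
static uint64_t scale_floor(uint64_t val, uint64_t num, uint64_t denom)
{
    assert(denom > 0);
    return (val * num) / denom;
}

/* val * num / denom, rounded to nearest (same size assumption). */
static uint64_t scale_round(uint64_t val, uint64_t num, uint64_t denom)
{
    assert(denom > 0);
    return (val * num + denom / 2) / denom;
}

uint64_t frames_to_ns_floor(uint64_t frames, uint32_t rate)
{
    return scale_floor(frames, NS_PER_SECOND, rate);
}

uint64_t ns_to_frames_floor(uint64_t ns, uint32_t rate)
{
    return scale_floor(ns, rate, NS_PER_SECOND);
}

uint64_t frames_to_ns_round(uint64_t frames, uint32_t rate)
{
    return scale_round(frames, NS_PER_SECOND, rate);
}

uint64_t ns_to_frames_round(uint64_t ns, uint32_t rate)
{
    return scale_round(ns, rate, NS_PER_SECOND);
}
```

At 44100 Hz, frame 1 lies at about 22675.74 ns. Truncating gives 22675 ns, and truncating that back to frames gives 0, not 1. Rounding in both directions gives 22676 ns and then frame 1 again. If two stages of a pipeline use different conventions, a timestamp that is correct to the nanosecond can still land one frame off.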

I get the same results with hundreds of different test files, on 
GStreamer 1.14.5 and 1.16.2, and on both 32-bit and 64-bit Linux.

*****************************************************************************************

Here is the pipeline:
---------------------

"filesrc location=FILE ! decodebin ! audioconvert ! audio/x-raw ! appsink name=sink sync=FALSE"

And here is my read function (simplified):
------------------------------------------

void read(gchar *lBuffer, guint64 nStartFrame, guint64 nFramesToRead,
          GstAudioInfo *pAudioInfo, GstPipeline *pPipeline)
{
     guint64 nBytesToRead = nFramesToRead * pAudioInfo->bpf;
     guint64 nBytesRead = 0;
     // Temporary buffer with 8192 bytes of slack: the first sample after a
     // seek may start before nStartFrame, so its leading bytes are skipped
     gchar *lBytes = g_malloc(nBytesToRead + 8192);
     guint nOffsetBytes = 0;

     // The pipeline is in the PAUSED state here

     gint64 nTime = GST_FRAMES_TO_CLOCK_TIME(nStartFrame, pAudioInfo->rate);
     gst_element_seek_simple(GST_ELEMENT_CAST(pPipeline), GST_FORMAT_TIME,
                             GST_SEEK_FLAG_ACCURATE | GST_SEEK_FLAG_FLUSH,
                             nTime);
     gst_element_get_state(GST_ELEMENT_CAST(pPipeline), NULL, NULL,
                           GST_CLOCK_TIME_NONE);

     gst_element_query_position(GST_ELEMENT_CAST(pPipeline), GST_FORMAT_TIME,
                                &nTime);

     // Very important to continue where we left off
     g_assert(GST_CLOCK_TIME_TO_FRAMES(nTime, pAudioInfo->rate) == nStartFrame);

     gst_element_set_state(GST_ELEMENT_CAST(pPipeline), GST_STATE_PLAYING);
     GstAppSink *pAppSink =
         GST_APP_SINK_CAST(gst_bin_get_by_name(GST_BIN_CAST(pPipeline), "sink"));

     while (!gst_app_sink_is_eos(pAppSink))
     {
         GstSample *pSample = gst_app_sink_pull_sample(pAppSink);

         if (pSample)
         {
             GstBuffer *pBuffer = gst_sample_get_buffer(pSample);
             gint64 nOfset = GST_CLOCK_TIME_TO_FRAMES(pBuffer->pts,
                                                      pAudioInfo->rate);
             gint64 nWantOffset = (nBytesRead / pAudioInfo->bpf) + nStartFrame;

             if (nOfset > nWantOffset)
             {
                 //WAV and FLAC never get here
                 g_error("Bad pBuffer->pts");
             }
             else if (nOfset < nWantOffset && nOffsetBytes == 0)
             {
                 // FLAC reports multiple bad PTSs, but only the first
                 // one must be corrected
                 nOffsetBytes = (nWantOffset - nOfset) * pAudioInfo->bpf;
             }

             GstMapInfo pMapInfo;
             gboolean bSuccess = gst_buffer_map(pBuffer, &pMapInfo,
                                                GST_MAP_READ);

             if (bSuccess)
             {
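                 // Clamp the copy so it never runs past the end of lBytes
                 gsize nCopy = MIN(pMapInfo.size,
                                   nBytesToRead + 8192 - nBytesRead);
                 memcpy(lBytes + nBytesRead, pMapInfo.data, nCopy);
                 nBytesRead += nCopy;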
                 gst_buffer_unmap(pBuffer, &pMapInfo);
             }
             else
             {
                 g_error("gst_buffer_map failed");
             }

             gst_sample_unref(pSample);
         }
         else
         {
             g_error("gst_app_sink_pull_sample failed");
         }

         // Normally, it would be: if (nBytesRead == nBytesToRead)
         if (nBytesRead >= nBytesToRead + nOffsetBytes)
         {
             break;
         }
     }

     gst_object_unref(pAppSink);
     gst_element_set_state(GST_ELEMENT_CAST(pPipeline), GST_STATE_PAUSED);

     memcpy(lBuffer, lBytes + nOffsetBytes, nBytesToRead);
     g_free(lBytes);
}
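
The offset correction in the function above only trims the first sample after the seek. A per-buffer variant of the same idea can be sketched as pure arithmetic: given the frame index a buffer starts at, the number of frames it holds, and the frame range that is wanted, work out how many leading frames to skip and how many to copy. This is a minimal sketch, assuming the buffer's PTS has already been converted to a frame index. `ClipResult` and `clip_buffer` are hypothetical names, not GStreamer API.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t skip_frames;  /* frames to drop from the front of the buffer */
    uint64_t copy_frames;  /* frames to copy after skipping */
} ClipResult;

/* Intersects the buffer range [buf_start, buf_start + buf_frames) with
 * the wanted range [want_start, want_start + want_frames).
 * Returns {0, 0} when the two ranges do not overlap. */
ClipResult clip_buffer(uint64_t buf_start, uint64_t buf_frames,
                       uint64_t want_start, uint64_t want_frames)
{
    ClipResult r = { 0, 0 };
    uint64_t buf_end, want_end, start, end;

    /* Guard against wrap-around in the range end computations. */
    assert(buf_start <= UINT64_MAX - buf_frames);
    assert(want_start <= UINT64_MAX - want_frames);

    buf_end = buf_start + buf_frames;
    want_end = want_start + want_frames;
    start = buf_start > want_start ? buf_start : want_start;
    end = buf_end < want_end ? buf_end : want_end;

    if (start < end) {
        r.skip_frames = start - buf_start;
        r.copy_frames = end - start;
    }
    return r;
}
```

With this shape, a buffer that starts one frame early yields skip_frames == 1, and the byte offset into the mapped data would be skip_frames * bpf. A buffer that starts after the wanted position is a gap, which this helper does not repair; the caller still has to detect it by comparing buf_start with the next wanted frame, as the g_error branch above does.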