Problem with seeking "subparse"

Andy Robinson andy at seventhstring.com
Thu Dec 31 16:41:01 UTC 2020


P.S. I now know why the parser accepts a blank line as a valid sequence 
number. This code:

const gchar *nptr = "";
gchar *endptr;
errno = 0;
guint64 res = g_ascii_strtoull(nptr, &endptr, 10);

on Linux does not set errno, but on Mac it sets errno to 22 (EINVAL).

Therefore in "case 0" of the parser, this line applies:

   else if (id == 0 && errno == EINVAL)
     state->state = 1;
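
For what it's worth, here is a minimal sketch of a check that sidesteps the errno difference by looking only at where the conversion stopped. It assumes plain C strtoull() is an acceptable stand-in for g_ascii_strtoull() for this purpose, which I have not verified on every platform. The helper name consumed_whole_number is made up for illustration and does not exist in gstsubparse.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of gstsubparse.c.
 * Returns true only if strtoull() consumed at least one character of
 * `line` and stopped at the terminating NUL. errno is deliberately not
 * consulted, so the result does not depend on whether the platform
 * reports EINVAL when no conversion is performed. */
static bool
consumed_whole_number (const char *line)
{
  char *end = NULL;

  assert (line != NULL);
  /* The numeric value is irrelevant here; only `end` matters. */
  (void) strtoull (line, &end, 10);
  return end != line && *end == '\0';
}
```

With this, a blank line is rejected on both platforms because no characters are consumed (end == line). Note that strtoull() itself skips leading whitespace and accepts a sign, so a line such as " 12" would still pass.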


On 31/12/2020 14:38, Andy Robinson wrote:
> GST 1.18.2 on Mac Big Sur (and for all I know it might well happen on 
> Windows too).
> 
> I find that subparse gets confused by seeking. I attach a simple
> subtitle file Test.srt in "subrip" format; obviously you can put these
> subtitles on whatever video you might have at hand.
> 
> The pipeline looks like this:
> 
> gst-launch-1.0 \
>     textoverlay name=ov ! autovideosink \
>     filesrc location=my-video.mp4 ! decodebin ! videoconvert ! videoscale ! ov.video_sink \
>     filesrc location=Test.srt ! subparse ! ov.text_sink
> 
> but of course I am doing this programmatically and this pipeline works 
> fine if you don't "seek" it. And I don't think it's possible to seek 
> with gst-launch?
> 
> However if you programmatically seek this pipeline to 8 seconds with 
> GST_DEBUG=subparse:7 then subparse produces errors. I have attached a 
> file subparse_log.txt showing the crucial lines.
> 
> The crucial lines from the source are these, at line 1060 in the 
> function parse_subrip in gstsubparse.c, dealing with "state 2" 
> (expecting subtitle text):
> 
>        if (in_seg) {
>          state->start_time = clip_start;
>          state->duration = clip_stop - clip_start;
>        } else {
>          state->state = 0;
>          return NULL;
>        }
> 
> That is, if we are out of segment (parsing lines before the ones we are 
> interested in) then throw away the subtitle text and transition 
> immediately to state 0 (expecting sequence number). IMHO this is wrong:
> the next thing we are in fact going to see is either another line of 
> subtitle text or a blank line.
> 
> The problem is then compounded by the fact that in state 0 the parser 
> accepts almost anything - even a blank line - as a valid sequence 
> number, and transitions to state 1 (expecting timestamps).
> 
> These two factors cause the parsing errors to cascade, often destroying 
> the first 2 or 3 timestamps that we *did* want to see.
> 
> Looking at the log I've attached, we see the segment event, start time 8 
> secs, and then:
> 
> State 0. Parsing line '1'
> State 1. Parsing line '00:00:01,000 --> 00:00:05,000'
> parse_subrip_time: parsing timestamp '00:00:01,000'
> parse_subrip_time: parsing timestamp '00:00:05,000'
> State 2. Parsing line '<i>Test message 1</i>'
>     // At this point we transition to state 0 which is wrong -
>     // we should still be in state 2, waiting for blank line.
> State 0. Parsing line ''
>     // Here we wrongly transition to state 1 because the
>     // blank line we just saw has been wrongly accepted as
>     // a valid sequence number. Now we are lost!
> State 1. Parsing line '2'
> error parsing subrip time line '2'
> State 0. Parsing line '00:00:07,000 --> 00:00:12,000'
>     // I haven't checked out why that was not accepted as a sequence
>     // number. But it wasn't, because we are still in state 0.
> State 0. Parsing line '<i>Another test message'
>     // However that was accepted as a sequence number!
>     // so we transition to state 1.
> State 1. Parsing line 'on two lines</i>'
> error parsing subrip time line 'on two lines</i>'
> 
> It seems to me that two fixes are needed:
> 
> 1) The parser should only transition from state 2 to state 0 when it 
> sees a blank line.
> 
> 2) In order to re-synchronise after any error (e.g. after a format error 
> in the subtitle file), it should only transition from state 0 to state 1 
> when it sees a line with a single decimal number on it.
> 
> Can anyone suggest a workaround?
> 
> My Humax TV hard disk recorder shows the same symptoms: after a seek,
> it is often the case that several subtitles go missing before they get 
> back in sync. I wonder why!
> 
> Regards,
> Andy Robinson, Seventh String Software, www.seventhstring.com
> 

