Identifying h264 frame types?
Will McElderry
wm-gstreamer at switchd.net
Mon Feb 5 00:32:50 UTC 2024
<snippet>
> Encoded H.264 frames coming out of qtdemux should have the DELTA_UNIT
> flag set on buffers if it's a P/B frame, and should have the
> DELTA_UNIT flag cleared (not set) if it's a key/IDR frame. This will
> be based on information in the container.
> Tim
</snippet>
Hi All,
@Tim: Thanks for your input! It's really appreciated and /almost/
works! It's certainly much faster too.
Short version:
I now have two methods to identify key frames and they don't agree!
I suspect the new method (inspecting buffers from qtdemux for the
DELTA_UNIT flag) is not working exactly as I expect: the first frame in
a file never appears to be a key frame, and each key frame it reports is
_one_ frame earlier than the one found by the
seek(KEY_UNIT | SNAP_AFTER | FLUSH) method of identifying key frames.
That said, I'm also suspicious of my own limited knowledge!
Can anyone comment?
Thank you in advance!
More details:
I have two methods to identify key frames:
1. seeking to the previous key frame's PTS (initially time 0) with the
flags Gst.SeekFlags.KEY_UNIT | Gst.SeekFlags.SNAP_AFTER |
Gst.SeekFlags.FLUSH to successively identify all key frames in the file
(very slow, especially when decoding frames!)
2. using a pipeline of the form:
filesrc location=my.mp4 ! qtdemux ! tee name=t
t. ! video/x-h264 ! queue ! appsink name=h264_appsink
t. ! h264parse ! nvh264dec ! queue ! appsink name=frame_data_appsink
(acknowledgement: I've left out a couple of elements for clarity. I
have also tried moving 'h264parse' before the tee, with the same
results.)
In this method I inspect each sample I pull from h264_appsink to see
whether its buffer has the DELTA_UNIT flag set.
As a sanity check, I compare its timestamp against the stream time of
the sample pulled from frame_data_appsink (the two samples' timestamps
always match).
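
A minimal sketch of the flag check in method 2, not the exact code used for the results in this post. It assumes the appsink belongs to a pipeline that is already PLAYING, uses the buffer PTS rather than stream time, and counts frames from 0. `key_frame_times_ms` is a hypothetical helper name; the flag is passed in as an argument so that, with PyGObject, the caller supplies `Gst.BufferFlags.DELTA_UNIT`.

```python
def key_frame_times_ms(appsink, delta_unit_flag):
    """Drain an appsink; return (index, PTS in ms) for buffers without DELTA_UNIT."""
    NS_PER_MS = 1_000_000  # GStreamer timestamps are in nanoseconds
    key_frames = []
    index = 0
    while True:
        # The "pull-sample" action signal returns None once the stream is at EOS.
        sample = appsink.emit("pull-sample")
        if sample is None:
            break
        buf = sample.get_buffer()
        # A cleared DELTA_UNIT flag marks a key/IDR frame.
        if not buf.has_flags(delta_unit_flag):
            key_frames.append((index, buf.pts / NS_PER_MS))
        index += 1
    return key_frames
```

With the pipeline above this would be called roughly as `key_frame_times_ms(pipeline.get_by_name("h264_appsink"), Gst.BufferFlags.DELTA_UNIT)`, where `pipeline` is assumed to be the parsed pipeline object.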
Input video:
Generated using:
appsrc ! video/x-raw,...,framerate=15 ! nvh264enc gop-size=15 ... !
h264parse ! mp4mux ! filesink location=...
(again, hiding details in an attempt to increase clarity)
What I see:
NB: times are given in ms as I find them easier to read.
method 1 yields frame times: [66.66, 1066.66, 2066.66, 3066.66, 4066.66, ...]
frame indices: [0, 15, 30, 45, 60, ...]
method 2 yields frame times: [1000, 2000, 3000, 4000, ...]
frame indices: [14, 29, 44, 59, ...]
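
To make the disagreement concrete, the two index lists can be compared mechanically. The sketch below pairs each method 2 index with the nearest method 1 index and reports the difference in frames. `nearest_offsets` is a hypothetical helper name, and it assumes the reference list is non-empty. Note that the time differences above (1000 vs 1066.66 ms) are one frame period at 15 fps, which agrees with the index difference.

```python
from bisect import bisect_left

def nearest_offsets(reference, candidate):
    """For each candidate index, return (candidate, nearest reference, difference)."""
    ref = sorted(reference)
    result = []
    for c in candidate:
        pos = bisect_left(ref, c)
        # Only the neighbours on either side of the insertion point can be nearest.
        neighbours = ref[max(pos - 1, 0):pos + 1]
        nearest = min(neighbours, key=lambda r: abs(r - c))
        result.append((c, nearest, c - nearest))
    return result

method1 = [0, 15, 30, 45, 60]   # seek-based key frame indices from this post
method2 = [14, 29, 44, 59]      # DELTA_UNIT-based key frame indices from this post

for cand, ref, diff in nearest_offsets(method1, method2):
    print(f"method 2 index {cand} vs method 1 index {ref}: {diff:+d} frame(s)")

# Method 1 key frames with no method 2 counterpart (here: the first frame).
matched = {ref for _, ref, _ in nearest_offsets(method1, method2)}
print("unmatched method 1 indices:", [i for i in method1 if i not in matched])
```

Every method 2 entry comes out one frame before its method 1 neighbour, and index 0 has no method 2 counterpart.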
What I expect:
I'm doing something wrong, but I cannot see what.
I intuitively expect a file to start with an I-frame, followed by
another I-frame after every GOP-size frames (though I also expect that a
heuristic for picking 'good times' to introduce new I-frames may upset
that pattern).
That would tie up with results from method 1 to identify I-frames, but
then, why would this new method be off by one frame? (one frame early,
and missing the first key frame)
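
The expectation above can be written down as a small sketch. It assumes a strictly fixed GOP with no encoder heuristics inserting extra I-frames; `expected_key_frames` and its `first_pts_ms` parameter are hypothetical names for illustration. With gop-size=15 at 15 fps it gives one key frame per second, which matches the 1000 ms spacing both methods report.

```python
def expected_key_frames(gop_size, fps, count, first_pts_ms=0.0):
    """Return (index, time in ms) of the first `count` key frames for a fixed GOP."""
    frames = []
    for k in range(count):
        index = k * gop_size  # key frames sit at multiples of the GOP size
        frames.append((index, first_pts_ms + index * 1000 / fps))
    return frames

# Indices for the encoder settings used in this post (gop-size=15, 15 fps).
print([index for index, _ in expected_key_frames(15, 15, 5)])  # [0, 15, 30, 45, 60]
```

These indices are the ones method 1 reports, so under this assumption it is method 2 that sits one frame early.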
Can I reliably assume the flag is one frame early?
My intuition isn't exactly worth much though as I don't have knowledge
in this area, so I'd not be too surprised to hear I'm looking in the
wrong place.
Can anyone who knows more comment?
NB: In case it's relevant, I'm running on version 1.20.3 from Ubuntu
22.04.
Thanks again!
Will.