[Bug 725536] encoder: add h264 scalable encoder

Fri Jul 15 11:29:56 UTC 2016

https://bugzilla.gnome.org/show_bug.cgi?id=725536

--- Comment #9 from sreerenj <bsreerenj at gmail.com> ---
A full-blown SVC implementation is tricky and requires many driver changes
too. But we can have a restricted temporal scalability (SVC-T) without
disturbing the driver. At the same time, since the driver can't support the SVC
specification as it is right now, it won't be exposing the SVC profile
capability. So the plan is to use temporal encoding inside AVC.
This gives several advantages:

1: Currently we only have prediction from the immediately previous/future
frames; the new feature will give the user more tuning options.
2: The encoded stream will be decodable with any legacy H.264 decoder, since we
are not exposing an SVC profile.
3: If needed, clients (decoders, broadcasters) can still select the required
temporal levels using the Scalability SEI headers.
4: Later we can easily turn it into an SVC stream without changing the already
implemented code, by inserting a few packed headers and setting the profile to
SVC.

====== If we have to advertise the encoded stream as H264-SVC ======

There is no profile called SVC-T, so we need to expose one of the profiles
mentioned in the spec (for example svc-constrained-baseline or
svc-high-profile). The driver is not exposing the SVC capability through an SVC
profile. So we bypass the driver capability check in gstreamer-vaapi, expose
the encoded stream profile as SVC, and also insert PREFIX_NAL units, which can
be dropped by legacy H.264 decoders.
But keep in mind, the stream won't be decodable with the current vaapidecode or
many other existing AVC decoders unless they expose SVC capability through an
SVC profile. Of course, we can make them decode the base layer, which should
still expose an AVC profile (#732266).
This (marking the stream as SVC) can be done at any time without changing the
hierarchical-p/b encode code block.

===== Current plan =================

-- Add hierarchical-p frame prediction model in AVC encode
eg: with 3 temporal layers

T3:             P1            P3              P5              P7

T2:                   P2                              P6

T1:   P0                                P4                        P8

T1, T2, T3: temporal layers
P0...Pn:    P-frames
Frames in each Tx reference only pictures in the same or a lower layer, so
depending on bandwidth, clients can drop the upper layers.
References (reference -> dependent frame):
P0->P1, P0->P2, P2->P3, P0->P4 ... repeat
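
The layer assignment and reference choice in the diagram above can be sketched
in a few lines of Python. This is only an illustration of the dyadic pattern,
not encoder code: the function names are hypothetical, and the references
beyond P4 are an assumption that the listed pattern simply repeats every
2^(levels-1) frames.

```python
def temporal_layer(index, levels):
    """Return the 1-based temporal layer (1 = base layer T1) of frame `index`."""
    period = 1 << (levels - 1)          # frames between two base-layer frames
    pos = index % period
    if pos == 0:
        return 1                        # P0, P4, P8, ... sit in T1
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros      # odd positions land in the top layer


def reference_frame(index, levels):
    """Return the index of the frame that `index` predicts from, or None."""
    if index == 0:
        return None                     # first frame has nothing to reference
    period = 1 << (levels - 1)
    pos = index % period
    if pos == 0:
        return index - period           # base layer references previous base frame
    return index - (pos & -pos)         # clear lowest set bit: nearest lower layer


if __name__ == "__main__":
    for i in range(9):
        print(f"P{i}: T{temporal_layer(i, 3)} ref={reference_frame(i, 3)}")
```

With levels=1 the same functions degenerate to the existing behaviour, where
every frame references the immediately previous one.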

-- Add hierarchical-b frame prediction model in AVC encode
http://www.hhi.fraunhofer.de/departments/video-coding-analytics/research-groups/image-video-coding/research-topics/svc-extension-of-h264avc/hierarchical-prediction-structures.html

These are the two well-known reference models used in the industry. Both will
be normal AVC streams, but encoded with different temporal layers. We will also
pack SEI headers so that clients can deal with temporal levels if needed, which
means the ability to drop higher temporal levels without affecting the
decodability of lower levels.
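
To make the "drop higher levels without affecting lower levels" property
concrete, here is a small Python check over the 3-layer hierarchical-p example.
The frame table is written out by hand from the diagram; the references for
P5..P8 are assumed to follow the repeating pattern, and the helper names are
made up for this sketch.

```python
# frame index -> (temporal layer, reference frame index or None)
FRAMES = {
    0: (1, None), 1: (3, 0), 2: (2, 0), 3: (3, 2),
    4: (1, 0),    5: (3, 4), 6: (2, 4), 7: (3, 6),
    8: (1, 4),
}


def drop_upper_layers(frames, max_layer):
    """Keep only frames whose temporal layer is <= max_layer."""
    return {i: (layer, ref) for i, (layer, ref) in frames.items()
            if layer <= max_layer}


def is_decodable(frames):
    """True if every remaining frame still has its reference frame available."""
    return all(ref is None or ref in frames for _, ref in frames.values())


if __name__ == "__main__":
    for max_layer in (3, 2, 1):
        kept = drop_upper_layers(FRAMES, max_layer)
        print(max_layer, sorted(kept), is_decodable(kept))
```

Dropping from the top down always leaves a decodable stream, whereas removing
a middle layer alone (for example only T2) would leave T3 frames such as P3
without their reference.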

vaapih264enc will have two new properties:
prediction-type     : Reference Picture Selection Types
                        flags: readable, writable
                        Enum "GstVaapiEncoderH264PredictionType" Default: 0, "default"
                           (0): default          - Default prev/next frame as ref frames
                           (1): hierarchical-p   - Hierarchical P frame encode
                           (2): hierarchical-b   - Hierarchical B frame encode
temporal-level      : Number of temporal levels included in the encoded stream
                        flags: readable, writable
                        Unsigned Integer. Range: 1 - 4 Default: 1
