[gst-devel] [RFC] Encoding and Profiles

Edward Hervey bilboed at gmail.com
Mon Oct 19 21:05:17 CEST 2009


On Mon, 2009-10-19 at 10:54 -0700, Michael Smith wrote:
> >
> >  encbin = gst_element_factory_make("encodebin", NULL);
> >  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
> 
> Perhaps "profile-name" ("profile" being reserved for the profile
> object itself) would be a better name.

  Yes, having both would be nice. I'll add that.

> 
> >
> > 1.2.1 Incoming streams
> >
> >  The streams fed to EncodeBin can be of various types:
> >
> >  * Video
> >   * Uncompressed (but maybe subsampled)
> >   * Compressed
> >  * Audio
> >   * Uncompressed (audio/x-raw-{int|float})
> >   * Compressed
> >  * Timed text
> >  * Private streams
> 
> Any ideas on how this allows re-muxing (without re-encoding) of
> certain streams? This wouldn't be an essential feature for the initial
> _implementation_, but I think keeping it in mind when designing the
> APIs is pretty important.

  That is definitely going to go in the first implementation (one of the
initial tests I have in mind is the simple remux-to-same-format and
remux-in-compatible muxer).

  Our experience so far with transmuxing is that we need
to have a parser (like mpegaudioparse, mpegvideoparse,...) to verify
that the stream is properly formatted, that the buffers are correctly
packetized, timestamps properly set, etc...

  So the idea is, when no re-encoding is involved, to use a parser for
that stream format if one is present.

>  It looks like you've thought about this, but
> it's not clear from this writeup what conclusions you came to :-)



> 
> Maybe some API to query what caps are available for re-muxing given
> the current profile

Something like

/**
 * gst_encoding_profile_get_input_caps:
 * @profile: a #GstEncodingProfile
 *
 * Returns: the list of all caps the profile can accept. Caller must call
 * gst_caps_unref() on all unwanted caps once it is done with the list.
 */
GList * gst_encoding_profile_get_input_caps (GstEncodingProfile *profile);

>  - then the app can check that, then either
> continue decoding if the input stream is incompatible, or pass-through
> if possible. Or should the application directly be querying the
> profile, rather than going through APIs on the bin, for this stuff?

  The generic idea is:
 * to not have any special API on EncodeBin (except for the standard
caps request, state change, .. API)
 * to delay creation of EncodeBin as late as possible and work only with
GstEncodingProfile API until then.
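
  A rough sketch of the decision the application would make before
creating EncodeBin. This is plain C, not GStreamer code: StreamAction and
choose_stream_action() are hypothetical names, and media types are
compared by name only, whereas real code would intersect GstCaps.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of checking one incoming stream against a profile. */
typedef enum {
  STREAM_PASSTHROUGH,   /* remux as-is (through a parser) */
  STREAM_REENCODE       /* decode, then encode to the profile format */
} StreamAction;

/* Decide what to do with an incoming stream, given the media types the
 * profile can accept without re-encoding. Only the media type name is
 * compared; fields such as mpegversion are ignored in this sketch. */
StreamAction
choose_stream_action (const char *input_type,
    const char *const *accepted_types, size_t n_accepted)
{
  size_t i;

  assert (input_type != NULL);

  for (i = 0; i < n_accepted; i++) {
    if (strcmp (input_type, accepted_types[i]) == 0)
      return STREAM_PASSTHROUGH;
  }
  return STREAM_REENCODE;
}
```

  The list passed in would come from something like the input-caps query
on the profile, so nothing here needs the bin to exist yet.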

> 
> 
> >
> >
> > 1.2.2 Steps involved for raw video encoding
> >
> > (0) Incoming Stream
> >
> > (1) Transform raw video feed (optional)
> >
> >  Here we modify the various fundamental properties of a raw video
> >  stream to be compatible with the intersection of:
> >  * The encoder GstCaps and
> >  * The specified "Stream Restriction" of the profile/target
> >
> >  The fundamental properties that can be modified are:
> >  * width/height
> >    This is done with a video scaler.
> >    The DAR (Display Aspect Ratio) MUST be respected.
> >    If needed, black borders can be added to comply with the target DAR.
> >  * framerate
> >  * format/colorspace/depth
> >    All of this is done with a colorspace converter
> 
> With respect to framerate, any thought on VFR streams? If the target
> format supports VFR, then it'd be nice to be able to just encode the
> input as-is, without having to force it to a specified framerate.

  Hadn't thought about that one, and there are indeed use-cases where
you'd want that (webcams that change their framerate depending on the
light level for example, and you don't want to use videorate in those
cases).

 Adding a boolean variable_framerate in GstVideoEncodingProfile would be
an option then, defaulting to FALSE.

/**
 * GstVideoEncodingProfile:
 * @profile: common #GstEncodingProfile part.
 * @pass: The pass number if this is part of a multi-pass profile. Starts at 1
 * for multi-pass. Set to 0 if this is not part of a multi-pass profile.
 * @variable_framerate: Do not enforce framerate on incoming raw stream. Default
 * is FALSE.
 */
struct _GstVideoEncodingProfile {
  GstStreamEncodingProfile      profile;
  guint                         pass;
  gboolean                      variable_framerate;
};
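
  To illustrate how that flag could influence step (1), here is a small
sketch in plain C. RawVideo, VideoRestriction, TransformPlan and
plan_transform() are all hypothetical, and only the upper framerate bound
of the restriction is checked to keep it short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a raw video stream (not a GStreamer type). */
typedef struct {
  int width, height;
  int fps_n, fps_d;            /* framerate as a fraction */
} RawVideo;

/* Hypothetical restriction: allowed size range and maximum framerate. */
typedef struct {
  int min_width, max_width;
  int min_height, max_height;
  int max_fps_n, max_fps_d;
  bool variable_framerate;     /* mirrors the proposed profile field */
} VideoRestriction;

/* Which optional converters step (1) would have to plug in. */
typedef struct {
  bool need_scaler;
  bool need_rate;              /* a videorate-like element */
} TransformPlan;

TransformPlan
plan_transform (const RawVideo *in, const VideoRestriction *r)
{
  TransformPlan plan = { false, false };

  assert (in->fps_d > 0 && r->max_fps_d > 0);

  plan.need_scaler =
      in->width < r->min_width || in->width > r->max_width ||
      in->height < r->min_height || in->height > r->max_height;

  /* With variable_framerate set, the incoming framerate is left alone.
   * Otherwise compare the two fractions by cross-multiplication. */
  if (!r->variable_framerate)
    plan.need_rate = (long long) in->fps_n * r->max_fps_d >
        (long long) r->max_fps_n * in->fps_d;

  return plan;
}
```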


> 
> It'd probably also be good to have some way to select, and then set
> properties on, the elements used here. e.g. the application probably
> wants to be able to control what sort of scaling to do (to enable
> high-quality scaling, for example, or low-quality/fast for preview
> encodes). Obviously, the default would just work, so this would be
> more optional API for more advanced applications.

  That's the problem with trying to design the
one-API-to-rule-them-all :)

  Seriously though... the problem is that if we expose everything... we
just come back to square one (or maybe two, but not that far ahead).

  The other problem is also that we might have several 'converters'
available, none of them having well-known properties. Maybe some
platform might have a differently named converter, ...

  An intermediate solution might be to provide a quality/speed knob over
those conversions.
  Maybe have it as a boolean. So by default you would get the highest
quality of conversion available (if you're in a live pipeline, QoS would
kick in to lower the quality so you don't lose any data), but if you
flip that boolean, you would get a low-quality/as-fast-as-possible
conversion.


> 
> >
> > (2) Actual encoding (optional for raw streams)
> >
> >  An encoder (with some optional settings) is used.
> 
> Are you planning anything for specifying how the settings should work,
> such that a profile could contain settings that apply to several
> different encoders (probably selected by rank, or optionally forced by
> the application), or will the settings be tied to a specific element?

  Right now the settings would be tied to a specific element, since the
profile system relies exclusively on presets (which are tied to an
element) for properties of an element.

  The rationale behind this... is that we have no unified properties
across elements, let alone across encoders, let alone across different
encoders for the same format.

  I'd *LOVE* to have a unified system for properties which are common to
encoders... but every time I put my head down on that problem.. I only
see one solution : base classes for encoders (guaranteeing *some*
consistency in properties).


> >
> > The representation used here is XML only as an example. No decision is
> > made as to which formatting to use for storing targets and profiles.
> 
> Whatever decision is made as to the 'default' format for storing
> these, I'd really like to see a sufficiently complete API that an
> application that (for whatever reason) doesn't want to use that format
> could build the GstEncodingProfile object itself, from its own data
> store.

  Absolutely, the storage format we decide on will only be the
reference one. It's important to get that one right, since it is the
format in which system-wide profiles (and those shipped in GStreamer
modules) will come.

  BUT, we do want to leave the possibility for applications (or any
service) to create those profiles on their own.
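
  As a sketch of what "creating a profile on their own" could look like,
here is the N900 profile from the RFC's XML example built directly in
plain C. StreamProfile, EncodingProfile and profile_find_stream() are
hypothetical stand-ins, not the final GstEncodingProfile API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C mirror of one <stream-profile> entry. */
typedef struct {
  const char *type;          /* "audio", "video", ... */
  const char *format;        /* caps string of the encoded format */
  const char *preset;        /* name of the element preset */
  const char *restriction;   /* caps string restricting the raw input */
  unsigned    presence;      /* the <presence> value */
} StreamProfile;

/* Hypothetical mirror of <gst-encoding-profile>. */
typedef struct {
  const char          *name;
  const char          *format;   /* container caps string */
  const StreamProfile *streams;
  size_t               n_streams;
} EncodingProfile;

/* Values copied from the "Nokia N900/H264 HQ" example. */
static const StreamProfile n900_streams[] = {
  { "audio", "audio/mpeg,mpegversion=4", "Quality High/Main",
    "audio/x-raw-int,channels=[1,2]", 1 },
  { "video", "video/x-h264", "Profile Baseline/Quality High",
    "video/x-raw-yuv,width=[16, 800],height=[16, 480],"
    "framerate=[1/1, 30000/1001]", 1 },
};

static const EncodingProfile n900_hq = {
  "Nokia N900/H264 HQ", "video/quicktime,variant=iso",
  n900_streams, sizeof (n900_streams) / sizeof (n900_streams[0])
};

/* Return the first stream profile of the given type, or NULL. */
const StreamProfile *
profile_find_stream (const EncodingProfile *profile, const char *type)
{
  size_t i;

  assert (profile != NULL && type != NULL);

  for (i = 0; i < profile->n_streams; i++) {
    if (strcmp (profile->streams[i].type, type) == 0)
      return &profile->streams[i];
  }
  return NULL;
}
```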

> 
> 
> 
> >
> > <gst-encoding-target>
> >  <name>Nokia N900</name>
> >  <category>Consumer Device</category>
> >  <profiles>
> >    <profile>Nokia N900/H264 HQ</profile>
> >    <profile>Nokia N900/MP3</profile>
> >    <profile>Nokia N900/AAC</profile>
> >  </profiles>
> > </gst-encoding-target>
> >
> > <gst-encoding-profile>
> >  <name>Nokia N900/H264 HQ</name>
> >  <description>
> >    High Quality H264/AAC for the Nokia N900
> >  </description>
> >  <format>video/quicktime,variant=iso</format>
> >  <streams>
> >    <stream-profile>
> >      <type>audio</type>
> >      <format>audio/mpeg,mpegversion=4</format>
> >      <preset>Quality High/Main</preset>
> >      <restriction>audio/x-raw-int,channels=[1,2]</restriction>
> >      <presence>1</presence>
> >    </stream-profile>
> >    <stream-profile>
> >      <type>video</type>
> >      <format>video/x-h264</format>
> >      <preset>Profile Baseline/Quality High</preset>
> >      <restriction>
> >        video/x-raw-yuv,width=[16, 800],\
> >        height=[16, 480],framerate=[1/1, 30000/1001]
> >      </restriction>
> >      <presence>1</presence>
> >    </stream-profile>
> >  </streams>
> >
> > </gst-encoding-profile>
> 
> This describes the constraints on the device (or whatever). Have you
> thought at all about splitting out "constraints on what the target can
> accept" from "what we actually want to encode"?
> 
> e.g. this profile says that I can do any size (within that range)
> video, but my application wants to encode at a particular size -
> should I be replacing the caps in the profile at runtime, or should
> there be another object to represent these (somewhat different)
> concepts?

  I guess I should have put another example, where the target is a less
'flexible' device.
  Let's say your target is a portable device that only supports
320x240 at 25fps, then the profile would have those very specific caps in
the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)

  Do you have any more specific example in mind with the above
use-case ? Did you mean you wanted the application to be able to
fine-tune the profile even more at runtime ?
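
  For a fixed-size target like the 320x240 one, the scaling step from
1.2.2 (respect the DAR, add black borders if needed) boils down to the
arithmetic below. FitResult and fit_with_borders() are hypothetical
names, and square pixels are assumed on both sides.

```c
#include <assert.h>

/* Hypothetical result: scaled picture size plus the black borders needed
 * to fill the target frame. Not part of any GStreamer API. */
typedef struct {
  int scaled_width;
  int scaled_height;
  int border_x;   /* total horizontal padding (left + right) */
  int border_y;   /* total vertical padding (top + bottom) */
} FitResult;

/* Fit a src_w x src_h picture into a dst_w x dst_h frame without changing
 * its display aspect ratio (square pixels assumed). */
FitResult
fit_with_borders (int src_w, int src_h, int dst_w, int dst_h)
{
  FitResult r;

  assert (src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

  /* Compare src_w/src_h with dst_w/dst_h by cross-multiplication so that
   * everything stays in integer arithmetic. */
  if ((long long) src_w * dst_h >= (long long) dst_w * src_h) {
    /* Source is at least as wide as the target: width is the limit. */
    r.scaled_width = dst_w;
    r.scaled_height = (int) ((long long) src_h * dst_w / src_w);
  } else {
    /* Source is taller than the target: height is the limit. */
    r.scaled_height = dst_h;
    r.scaled_width = (int) ((long long) src_w * dst_h / src_h);
  }

  r.border_x = dst_w - r.scaled_width;
  r.border_y = dst_h - r.scaled_height;
  return r;
}
```

  A 1280x720 input would be scaled to 320x180 with 60 lines of border to
split between top and bottom.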

> 
> What about constraints that are not (currently, at least) expressible
> through caps? e.g. bitrate, profiles, etc?

  Those are tunable through the presets (through which all properties
are expressed). In the N900 example above, the presets select the baseline
profile and the bitrate corresponding to "Quality High".

> 
> 
> Anyway, I don't have time right now to continue through this in enough
> depth - and I'm sure some of my remarks miss something you've already
> thought about - but this was just to throw some more ideas into the
> mix.

  I'm looking forward to the rest of your comments,

> 
> I'm very happy to see you looking into this more deeply!

  Thank you

     Edward

> 
> Mike
> 





