[gst-devel] [RFC] Encoding and Profiles

Edward Hervey bilboed at gmail.com
Wed Oct 28 16:44:56 CET 2009


On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:

> >
> >  That's the problem with trying to design the
> > one-API-to-rule-them-all :)
> >
> >  Seriously though... the problem is that if we expose everything... we
> > just come back to square one (or maybe two, but not that far ahead).
> >
> >  The other problem is also that we might have several 'converters'
> > available, none of them having well-known properties. Maybe some
> > platform might have a differently named converter, ...
> >
> >  An intermediate solution might be to provide a quality/speed knob over
> > those conversions.
> >  Maybe have it as a boolean. So by default you would get the highest
> > quality of conversion available (if you're in a live pipeline, QoS would
> > kick in to lower the quality so you don't lose any data), but if you
> > flip that boolean, you would get a low-quality/as-fast-as-possible
> > conversion.
> 
> I was basically just thinking of something along the lines of what
> playbin does - by default, just do something sane (here, videoscale
> with the scaling mode set to the highest-quality mode), but also allow
> the app to just say "use this element instead" - the app can then
> override things as much as it wants to, but it doesn't _have to_,
> since the defaults work sensibly.
> 

 Something like this ?

videoscale          : The video scaler to use for converting video 
                      streams (if needed).
                      flags: readable/writable
                      Object of type "GstElement" (default : videoscale)
videocolorspace     : The colorspace converter to use
                      flags: readable/writable
                      Object of type "GstElement" (default : ffmpegcolorspace)
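
  A rough sketch of how such overridable defaults could behave, written in
plain Python rather than actual GStreamer code. The class name, the method
names and the "my-hw-scaler" element are all made up for illustration; only
the two property names and their defaults come from the listing above.

```python
class ConverterSettings:
    """Hypothetical holder for the two converter properties listed above."""

    # Defaults taken from the property listing.
    DEFAULTS = {
        "videoscale": "videoscale",
        "videocolorspace": "ffmpegcolorspace",
    }

    def __init__(self):
        self._overrides = {}

    def set_element(self, prop, element_name):
        """Let the application say "use this element instead"."""
        if prop not in self.DEFAULTS:
            raise KeyError(f"unknown property: {prop}")
        self._overrides[prop] = element_name

    def element_for(self, prop):
        """Return the application's override, or the default if none is set."""
        return self._overrides.get(prop, self.DEFAULTS[prop])


settings = ConverterSettings()
settings.set_element("videoscale", "my-hw-scaler")  # hypothetical element name
print(settings.element_for("videoscale"))
print(settings.element_for("videocolorspace"))
```

  The application only touches the property it cares about; everything
else keeps the sane default.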



> 
> 
> >
> >
> >>
> >> >
> >> > (2) Actual encoding (optional for raw streams)
> >> >
> >> >  An encoder (with some optional settings) is used.
> >>
> >> Are you planning anything for specifying how the settings should work,
> >> such that a profile could contain settings that apply to several
> >> different encoders (probably selected by rank, or optionally forced by
> >> the application), or will the settings be tied to a specific element?
> >
> >  Right now the settings would be tied to a specific element, since the
> > profile system relies exclusively on presets (which are tied to an
> > element) for properties of an element.
> >
> >  The rationale behind this... is that we have no unified properties
> > across elements, let alone across encoders, let alone across different
> > encoders for the same format.
> >
> >  I'd *LOVE* to have a unified system for properties which are common to
> > encoders... but every time I put my head down on that problem.. I only
> > see one solution : base classes for encoders (guaranteeing *some*
> > consistency in properties).
> 
> Yeah, probably. That's pretty unfortunate, though. e.g. in songbird,
> the profile I use will have something to do with the user's
> configuration and the device we're transcoding for - but the actual
> elements available to satisfy that profile will be different across
> platforms and depend on what things the user has installed.
> 
> I can't see myself using this system if it's tightly tied to specific
> elements for encoders (parsers, muxers, decoders, scalers, etc are
> less problematic), which suggests that we do need _some_ mechanism to
> use these things across multiple elements, even if it requires custom
> application code rather than being automatic.

  One way (specifying element names and properties) or the other
(specifying caps and presets), it's going to require some custom work to
be done.

  The reason I prefer/recommend going the caps/presets way is that most
of the work will be done *in* the element and preset, and much less (if
any) in the profiles and applications.

> 
> >> This describes the constraints on the device (or whatever). Have you
> >> thought at all about splitting out "constraints on what the target can
> >> accept" from "what we actually want to encode"?
> >>
> >> e.g. this profile says that I can do any size (within that range)
> >> video, but my application wants to encode at a particular size -
> >> should I be replacing the caps in the profile at runtime, or should
> >> there be another object to represent these (somewhat different)
> >> concepts?
> >
> >  I guess I should have put another example, where the target is a less
> > 'flexible' device.
> >  Let's say your target is a portable device that only supports
> > 320x240 at 25fps, then the profile would have those very specific caps in
> > the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
> >
> >  Do you have any more specific example in mind with the above
> > use-case ? Did you mean you wanted the application to be able to
> > fine-tune the profile even more at runtime ?
> 
> Yeah - I want the app to fine tune.
> 
> Let's suppose the following use-case:
>  - User has an input video they got from the internet somewhere. It's
> 640x480, 30 fps, theora.
>  - User has a mobile phone that can play H.264 video at up to 720p (so
> it can do the video at this resolution).
>  - User wants to encode it at 320x240 to fit on their little micro-sd card.
> 
> So the video is already a supported size - but the application wants
> to scale it smaller because the user has chosen that option - I don't
> quite understand how that's meant to be expressed in your profiles/API
> right now (I might be missing something).

  In that case, your target for that device would have restriction
caps along the lines of :
   video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1,1000/1]

  Meaning that your device can play back any video between 16x16 and
1280x720.

  The example I gave before with 320x240 at 25 would be the case for
devices that can only play back a single resolution/fps.
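
  To illustrate the difference between the two kinds of restriction, here
is a minimal sketch in plain Python. It is only a model of the range check,
not the GStreamer caps API: the dictionaries and the fits() helper are
hypothetical, and the numbers are the ones from the two examples above.

```python
from fractions import Fraction

# Hypothetical stand-ins for restriction caps: every field is an
# inclusive (min, max) range.
FLEXIBLE_DEVICE = {
    "width": (16, 1280),
    "height": (16, 720),
    "framerate": (Fraction(0, 1), Fraction(1000, 1)),
}
FIXED_DEVICE = {
    "width": (320, 320),
    "height": (240, 240),
    "framerate": (Fraction(25, 1), Fraction(25, 1)),
}


def fits(restriction, requested):
    """Return True if every requested value lies inside the allowed range."""
    return all(
        low <= requested[field] <= high
        for field, (low, high) in restriction.items()
    )


# The size the application picked, at the source's 30 fps.
wanted = {"width": 320, "height": 240, "framerate": Fraction(30, 1)}

print(fits(FLEXIBLE_DEVICE, wanted))
print(fits(FIXED_DEVICE, wanted))
```

  The flexible device accepts the 320x240 request at 30 fps, while the
fixed one rejects it because only 25 fps is allowed there.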

> 
> 
> >
> >>
> >> What about constraints that are not (currently, at least) expressible
> >> through caps? e.g. bitrate, profiles, etc?
> >
> >  Those are tunable through the presets (through which all properties
> > are expressed), in the N900 example above, it is set to baseline profile
> > and the bitrate corresponding to "Quality High".
> 
> But the presets specify a particular set of settings, not the target
> constraints. So there's no way to say "this device supports bitrates
> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
> I think?
> 
> I don't think the element presets are particularly helpful here - they
> express "a particular configuration" not "a range of possibilities".


   Are you saying presets don't satisfy all the requirements here ? I
completely agree. But apart from trying to extend presets, there's only
one other (ugly) option:
  * make sure all properties (like bitrate) for encoders have the
SAME/EXACT/GUARANTEED name and meaning and range ...

 ... and that one doesn't seem trivial either.
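
  To make the gap concrete, here is a minimal sketch in plain Python of
the "range of possibilities plus a default" idea. Nothing like this exists
in presets; the class and method names are hypothetical, and the 4 Mbps /
2 Mbps figures are the ones from Mike's example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BitrateConstraint:
    """Hypothetical target constraint: an upper bound plus a default."""

    maximum: int  # bits per second the device can handle
    default: int  # bits per second used when the application has no opinion

    def resolve(self, requested: Optional[int] = None) -> int:
        """Pick the bitrate to encode at, never exceeding the maximum."""
        if requested is None:
            return self.default
        return min(requested, self.maximum)


device = BitrateConstraint(maximum=4_000_000, default=2_000_000)

print(device.resolve())           # no request: fall back to the default
print(device.resolve(1_000_000))  # within range: kept as requested
print(device.resolve(6_000_000))  # above the limit: clamped to the maximum
```

  A preset can only carry one such value, which is why it describes "a
particular configuration" rather than what the target can accept.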


        Edward

> 
> Mike





