[gst-devel] [RFC] Encoding and Profiles

Stefan Kost ensonic at hora-obscura.de
Thu Oct 29 19:31:06 CET 2009

Edward Hervey wrote:
> On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:
>>>  That's the problem with trying to design the
>>> one-API-to-rule-them-all :)
>>>  Seriously though... the problem is that if we expose everything... we
>>> just come back to square one (or maybe two, but not that far ahead).
>>>  The other problem is also that we might have several 'converters'
>>> available, none of them having well-known properties. Maybe some
>>> platform might have a differently named converter, ...
>>>  An intermediate solution might be to provide a quality/speed knob over
>>> those conversions.
>>>  Maybe have it as a boolean. So by default you would get the highest
>>> quality of conversion available (if you're in a live pipeline, QoS would
>>> kick in to lower the quality so you don't lose any data), but if you
>>> flip that boolean, you would get a low-quality/as-fast-as-possible
>>> conversion.
>> I was basically just thinking of something along the lines of what
>> playbin does - by default, just do something sane (here, videoscale
>> with the scaling mode set to the highest-quality mode), but also allow
>> the app to just say "use this element instead" - the app can then
>> override things as much as it wants to, but it doesn't _have to_,
>> since the defaults work sensibly.
>  Something like this ?
> videoscale          : The video scaler to use for converting video 
>                       streams (if needed).
>                       flags: readable/writable
>                       Object of type "GstElement" (default : videoscale)
> videocolorspace     : The colorspace converter to use
>                       flags: readable/writable
>                       Object of type "GstElement" (default : ffmpegcolorspace)

Yes, this is something we should also do in playbin2 and camerabin. Maybe we
could also have an autotransform element that takes a klass and picks the
highest-ranked element from that class (I don't think we want autovideoscale,
autocolorspace, ...)

Not sure if we should name them video-scale and colorspace-convert.
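
To make the autotransform idea a bit more concrete, here is a rough sketch in
plain Python of "take a klass, pick the highest-ranked element". It does not use
the GStreamer registry; the Factory type, the element names, the klass strings
and the rank values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Factory:
    """Hypothetical stand-in for an element factory entry in a registry."""
    name: str
    klass: str
    rank: int


def pick_by_klass(factories: Iterable[Factory], klass: str) -> Optional[Factory]:
    """Return the highest-ranked factory whose klass matches, or None."""
    candidates = [f for f in factories if f.klass == klass]
    # max() keeps the first entry on ties, so registry order breaks ties.
    return max(candidates, key=lambda f: f.rank, default=None)


# Made-up registry contents, for illustration only.
REGISTRY = [
    Factory("soft-scaler", "Filter/Scaler/Video", rank=64),
    Factory("hw-scaler", "Filter/Scaler/Video", rank=256),
    Factory("soft-colorspace", "Filter/Converter/Video", rank=64),
]

best = pick_by_klass(REGISTRY, "Filter/Scaler/Video")
print(best.name if best else "no match")
```

An application-supplied element would simply bypass this lookup, so the
default stays sane while remaining overridable.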


>>>>> (2) Actual encoding (optional for raw streams)
>>>>>  An encoder (with some optional settings) is used.
>>>> Are you planning anything for specifying how the settings should work,
>>>> such that a profile could contain settings that apply to several
>>>> different encoders (probably selected by rank, or optionally forced by
>>>> the application), or will the settings be tied to a specific element?
>>>  Right now the settings would be tied to a specific element, since the
>>> profile system relies exclusively on presets (which are tied to an
>>> element) for properties of an element.
>>>  The rationale behind this... is that we have no unified properties
>>> across elements, let alone across encoders, let alone across different
>>> encoders for the same format.
>>>  I'd *LOVE* to have a unified system for properties which are common to
>>> encoders... but every time I put my head down on that problem.. I only
>>> see one solution : base classes for encoders (guaranteeing *some*
>>> consistency in properties).
>> Yeah, probably. That's pretty unfortunate, though. e.g. in songbird,
>> the profile I use will have something to do with the user's
>> configuration and the device we're transcoding for - but the actual
>> elements available to satisfy that profile will be different across
>> platforms and depend on what things the user has installed.
>> I can't see myself using this system if it's tightly tied to specific
>> elements for encoders (parsers, muxers, decoders, scalers, etc are
>> less problematic), which suggests that we do need _some_ mechanism to
>> use these things across multiple elements, even if it requires custom
>> application code rather than being automatic.
>   One way (specifying element names and properties) or the other
> (specifying caps and presets), it's going to require some custom work to
> be done.
>   The reason I prefer/recommend going the caps/presets way is that most
> of the work will be done *in* the element and preset, and much less (if
> not none) in the profiles and applications.
>>>> This describes the constraints on the device (or whatever). Have you
>>>> thought at all about splitting out "constraints on what the target can
>>>> accept" from "what we actually want to encode"?
>>>> e.g. this profile says that I can do any size (within that range)
>>>> video, but my application wants to encode at a particular size -
>>>> should I be replacing the caps in the profile at runtime, or should
>>>> there be another object to represent these (somewhat different)
>>>> concepts?
>>>  I guess I should have put another example, where the target is a less
>>> 'flexible' device.
>>>  Let's say your target is a portable device that only supports
>>> 320x240 at 25fps, then the profile would have those very specific caps in
>>> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
>>>  Do you have any more specific example in mind with the above
>>> use-case ? Did you mean you wanted the application to be able to
>>> fine-tune even more the profile at runtime ?
>> Yeah - I want the app to fine tune.
>> Let's suppose the following use-case:
>>  - User has an input video they got from the internet somewhere. It's
>> 640x480, 30 fps, theora.
>>  - User has a mobile phone that can play H.264 video at up to 720p (so
>> it can do the video at this resolution).
>>  - User wants to encode it at 320x240 to fit on their little micro-sd card.
>> So the video is already a supported size - but the application wants
>> to scale it smaller because the user has chosen that option - I don't
>> quite understand how that's meant to be expressed in your profiles/API
>> right now (I might be missing something).
>   In that case, your target for that device would have a restriction
> caps along the lines of:
>    video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1,1000/1]
>   Meaning that your device can play back any video between 16x16 and
> 1280x720.
>   The example I gave before with 320x240 at 25 fps would be the case for
> devices that can only play back a single resolution/fps.
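
One way to picture the split between "what the target accepts" and "what the
application wants" from the use-case above is the following plain-Python sketch.
It is not GstCaps code; IntRange, VideoRestriction and choose_size are
hypothetical names, and the clamping fallback is only an assumption about what
an application might do when the wanted size is out of range.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Size = Tuple[int, int]  # (width, height)


@dataclass(frozen=True)
class IntRange:
    lo: int
    hi: int

    def contains(self, value: int) -> bool:
        return self.lo <= value <= self.hi

    def clamp(self, value: int) -> int:
        return max(self.lo, min(self.hi, value))


@dataclass(frozen=True)
class VideoRestriction:
    """Hypothetical model of the width/height part of a restriction caps."""
    width: IntRange
    height: IntRange

    def allows(self, size: Size) -> bool:
        return self.width.contains(size[0]) and self.height.contains(size[1])


def choose_size(restriction: VideoRestriction, source: Size,
                requested: Optional[Size] = None) -> Size:
    """Prefer the app's requested size, else the source size, within limits."""
    wanted = requested if requested is not None else source
    if restriction.allows(wanted):
        return wanted
    # Naive fallback: clamp each dimension independently (ignores aspect ratio).
    return (restriction.width.clamp(wanted[0]),
            restriction.height.clamp(wanted[1]))


# The phone from the example: anything between 16x16 and 1280x720.
PHONE = VideoRestriction(IntRange(16, 1280), IntRange(16, 720))
print(choose_size(PHONE, source=(640, 480), requested=(320, 240)))
```

Here the restriction only says what is allowed; the 320x240 choice comes from
the application and is merely checked against it.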
>>>> What about constraints that are not (currently, at least) expressible
>>>> through caps? e.g. bitrate, profiles, etc?
>>>  Those are tunable through the presets (through which all properties
>>> are expressed), in the N900 example above, it is set to baseline profile
>>> and the bitrate corresponding to "Quality High".
>> But the presets specify a particular set of settings, not the target
>> constraints. So there's no way to say "this device supports bitrates
>> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
>> I think?
>> I don't think the element presets are particularly helpful here - they
>> express "a particular configuration" not "a range of possibilities".
>    Are you saying presets don't satisfy all the requirements here ? I
> completely agree. But apart from trying to extend presets there's only
> one ugly other option:
>   * make sure all properties (like bitrate) for encoders have the
> SAME/EXACT/GUARANTEED name and meaning and range ...
>  ... and that one doesn't seem trivial either.
>         Edward
>> Mike
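
For the "supports up to 4 Mbps, defaults to 2 Mbps" case, a setting that
carries both a range and a default could look roughly like the sketch below.
This is plain Python, not the preset API; BoundedSetting and its fields are
hypothetical, and rejecting out-of-range requests (rather than clamping them)
is just one possible policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BoundedSetting:
    """Hypothetical setting: a range the target supports plus a default."""
    minimum: int
    maximum: int
    default: int

    def __post_init__(self) -> None:
        if not self.minimum <= self.default <= self.maximum:
            raise ValueError("default must lie within [minimum, maximum]")

    def resolve(self, requested: Optional[int] = None) -> int:
        """Return the default, or the requested value if the target allows it."""
        if requested is None:
            return self.default
        if not self.minimum <= requested <= self.maximum:
            raise ValueError(f"{requested} is outside "
                             f"[{self.minimum}, {self.maximum}]")
        return requested


# Bitrate in bits per second: up to 4 Mbps supported, 2 Mbps by default.
BITRATE = BoundedSetting(minimum=0, maximum=4_000_000, default=2_000_000)
print(BITRATE.resolve())           # no request: falls back to the default
print(BITRATE.resolve(3_000_000))  # in range: the request is honoured
```

Mapping such a setting onto a concrete encoder property would still need
per-element knowledge, which is the open problem discussed above.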