[gst-devel] [RFC] Encoding and Profiles

Michael Smith msmith at xiph.org
Mon Oct 19 21:23:17 CEST 2009

>> It'd probably also be good to have some way to select, and then set
>> properties on, the elements used here. e.g. the application probably
>> wants to be able to control what sort of scaling to do (to enable
>> high-quality scaling, for example, or low-quality/fast for preview
>> encodes). Obviously, the default would just work, so this would be
>> more optional API for more advanced applications.
>  That's the problem with trying to design the
> one-API-to-rule-them-all :)
>  Seriously though... the problem is that if we expose everything... we
> just come back to square one (or maybe two, but not that far ahead).
>  The other problem is also that we might have several 'converters'
> available, none of them having well-known properties. Maybe some
> platform might have a differently named converter, ...
>  An intermediate solution might be to provide a quality/speed knob over
> those conversions.
>  Maybe have it as a boolean. So by default you would get the highest
> quality of conversion available (if you're in a live pipeline, QoS would
> kick in to lower the quality so you don't lose any data), but if you
> flip that boolean, you would get a low-quality/as-fast-as-possible
> conversion.

I was basically just thinking of something along the lines of what
playbin does - by default, just do something sane (here, videoscale
with the scaling mode set to the highest-quality mode), but also allow
the app to just say "use this element instead" - the app can then
override things as much as it wants to, but it doesn't _have to_,
since the defaults work sensibly.
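
A minimal sketch of that playbin-style "sane default, optional override" behaviour, in Python. Nothing here is real GStreamer API: `ElementChoice`, `pick_scaler`, and the property name and value on the default are placeholders for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ElementChoice:
    """An element factory name plus the properties to set on it."""
    factory: str
    properties: dict = field(default_factory=dict)


# Hypothetical default: videoscale in its best mode. The property name and
# value below are placeholders, not the element's real property.
DEFAULT_SCALER = ElementChoice("videoscale", {"mode": "highest-quality"})


def pick_scaler(override: Optional[ElementChoice] = None) -> ElementChoice:
    """Return the app's element if it supplied one, else the sane default."""
    return override if override is not None else DEFAULT_SCALER
```

An application that does not care calls `pick_scaler()`; one that does passes its own `ElementChoice` and gets exactly that back.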

>> >
>> > (2) Actual encoding (optional for raw streams)
>> >
>> >  An encoder (with some optional settings) is used.
>> Are you planning anything for specifying how the settings should work,
>> such that a profile could contain settings that apply to several
>> different encoders (probably selected by rank, or optionally forced by
>> the application), or will the settings be tied to a specific element?
>  Right now the settings would be tied to a specific element, since the
> profile system relies exclusively on presets (which are tied to an
> element) for properties of an element.
>  The rationale behind this... is that we have no unified properties
> across elements, let alone across encoders, let alone across different
> encoders for the same format.
>  I'd *LOVE* to have a unified system for properties which are common to
> encoders... but every time I put my head down on that problem... I only
> see one solution: base classes for encoders (guaranteeing *some*
> consistency in properties).

Yeah, probably. That's pretty unfortunate, though. E.g. in Songbird,
the profile I use will have something to do with the user's
configuration and the device we're transcoding for - but the actual
elements available to satisfy that profile will be different across
platforms and depend on what things the user has installed.

I can't see myself using this system if it's tightly tied to specific
elements for encoders (parsers, muxers, decoders, scalers, etc are
less problematic), which suggests that we do need _some_ mechanism to
use these things across multiple elements, even if it requires custom
application code rather than being automatic.
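
One possible shape for such a mechanism, sketched in Python: the application supplies a table that translates abstract settings into each encoder's own properties. The encoder names, property names and units below are invented for illustration and do not describe real GStreamer elements; `first_available` and `translate` are hypothetical helpers.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Abstract setting -> (element property name, value converter).
PropertyMap = Dict[str, Tuple[str, Callable[[int], int]]]

# Hypothetical encoders: names, properties and units are made up.
ENCODER_MAPS: Dict[str, PropertyMap] = {
    "encoder_a": {"bitrate": ("bitrate", lambda bps: bps // 1000)},  # kbit/s
    "encoder_b": {"bitrate": ("target-bitrate", lambda bps: bps)},   # bit/s
}


def first_available(candidates: List[str], installed: Set[str]) -> Optional[str]:
    """Pick the first candidate (e.g. ordered by rank) that is installed."""
    for name in candidates:
        if name in installed:
            return name
    return None


def translate(encoder: str, settings: Dict[str, int]) -> Dict[str, int]:
    """Turn abstract profile settings into properties for one encoder."""
    mapping = ENCODER_MAPS[encoder]
    result = {}
    for key, value in settings.items():
        prop, convert = mapping[key]
        result[prop] = convert(value)
    return result
```

The profile then only carries `{"bitrate": 2000000}`; which element ends up satisfying it depends on what is installed on the platform.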

>> This describes the constraints on the device (or whatever). Have you
>> thought at all about splitting out "constraints on what the target can
>> accept" from "what we actually want to encode"?
>> e.g. this profile says that I can do any size (within that range)
>> video, but my application wants to encode at a particular size -
>> should I be replacing the caps in the profile at runtime, or should
>> there be another object to represent these (somewhat different)
>> concepts?
>  I guess I should have put another example, where the target is a less
> 'flexible' device.
>  Let's say your target is a portable device that only supports
> 320x240 at 25fps, then the profile would have those very specific caps in
> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
>  Do you have a more specific example in mind for the above
> use-case? Did you mean you wanted the application to be able to
> fine-tune the profile even further at runtime?

Yeah - I want the app to fine tune.

Let's suppose the following use-case:
 - User has an input video they got from the internet somewhere. It's
640x480, 30 fps, Theora.
 - User has a mobile phone that can play H.264 video at up to 720p (so
it can do the video at this resolution).
 - User wants to encode it at 320x240 to fit on their little microSD card.

So the video is already a supported size - but the application wants
to scale it smaller because the user has chosen that option - I don't
quite understand how that's meant to be expressed in your profiles/API
right now (I might be missing something).
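
To make the distinction concrete, here is a small Python sketch that keeps "what the target accepts" separate from "what the application asks for". `SizeConstraint` and `choose_size` are hypothetical names, not part of the proposed API, and 720p is assumed to mean at most 1280x720.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Size = Tuple[int, int]  # (width, height)


@dataclass(frozen=True)
class SizeConstraint:
    """What the target can accept: an inclusive range of frame sizes."""
    max_width: int
    max_height: int
    min_width: int = 1
    min_height: int = 1

    def accepts(self, size: Size) -> bool:
        width, height = size
        return (self.min_width <= width <= self.max_width
                and self.min_height <= height <= self.max_height)


def choose_size(constraint: SizeConstraint, source: Size,
                requested: Optional[Size] = None) -> Size:
    """Use the app's requested size if given, else keep the source size."""
    size = requested if requested is not None else source
    if not constraint.accepts(size):
        raise ValueError(f"{size} is outside what the target accepts")
    return size
```

With a phone constraint of `SizeConstraint(1280, 720)` and a 640x480 source, no request leaves the video at 640x480, while a request of `(320, 240)` scales it down even though the source already fits.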

>> What about constraints that are not (currently, at least) expressible
>> through caps? e.g. bitrate, profiles, etc?
>  Those are tunable through the presets (through which all properties
> are expressed). In the N900 example above, it is set to baseline profile
> and the bitrate corresponding to "Quality High".

But the presets specify a particular set of settings, not the target
constraints. So there's no way to say "this device supports bitrates
up to 4 Mbps" while also giving this profile a default bitrate of
2 Mbps, I think?

I don't think the element presets are particularly helpful here - they
express "a particular configuration" not "a range of possibilities".
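
As a rough illustration of "a range of possibilities" with a default, a profile could carry something like the following Python sketch. `BitrateRange` is a hypothetical type, not something presets or the proposed profiles provide; clamping an over-limit request to the maximum is one possible policy, chosen here only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BitrateRange:
    """Target constraint (maximum) plus the profile's default, in bit/s."""
    maximum: int
    default: int

    def resolve(self, requested: Optional[int] = None) -> int:
        """Return the default, or the app's request clamped to the maximum."""
        if requested is None:
            return self.default
        return min(requested, self.maximum)


# The example from the text: up to 4 Mbps supported, 2 Mbps by default.
device = BitrateRange(maximum=4_000_000, default=2_000_000)
```

Here `device.resolve()` gives the 2 Mbps default, while the application remains free to ask for anything up to the 4 Mbps limit.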

