[gst-devel] mimetypes/caps

Sat Jun 21 09:24:13 CEST 2003

Hey all,

I've been working on getting mimetypes right lately. The starting point is
IANA's list of known mimetypes (which mostly covers files/streams); the
rest is mostly a list of names we made up ourselves with x- in front of
them. The current list of how I propose to name our mimetypes + properties
is in CVS in gstreamer/docs/random/mimetypes. Since SF's webCVSview
doesn't update currently, I can't give a direct link. Check out CVS (if
you don't have HEAD anywhere, you probably won't bother anyway).

Some notable things people will want to know:
* audio/raw and video/raw don't exist anymore. They're far too broad.
int audio, float audio, ADPCM audio etc. etc. are all raw, but they're
not actually similar in any way. Same goes for RGB/YUV video.
* video/avi doesn't exist either. Codecs actually all have a name now.
If one doesn't, it needs adding. Usage of video/avi + fourcc (in HEAD) as a way
of identifying a codec will be punished in the most horrible way that we
can collectively think of.
* We're trying to separate muxing/parsing format from the actual codec
as much as possible. This goes for Ogg/Vorbis (application/ogg vs.
audio/x-vorbis) and a lot more. I propose to keep it this way. This
might not work for all formats (mp3, flac, ...), but that shouldn't stop
us from doing it that way as much as possible.
* RGB 24/32 bpp is *always* in big-endian format. The masks actually
tell us where the bytes are located.
* For each video codec, width/height can be given. For each audio codec,
rate/channels can be given.
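
To make the RGB point above concrete, here is a minimal Python sketch. It
is not GStreamer code, and the function names byte_offset and
component_order are made up for illustration. It assumes 8-bit components
on byte boundaries and that a pixel is read as one big-endian integer of
bpp bits, so the position of a mask tells us which byte in memory holds
that component.

```python
def byte_offset(mask: int, bpp: int) -> int:
    """Return the in-memory byte index of the component selected by mask.

    Assumption: the pixel is interpreted as a big-endian integer of bpp
    bits, so the most significant byte is the first byte in memory.
    """
    if bpp % 8 != 0 or mask <= 0:
        raise ValueError("need a whole number of bytes and a non-zero mask")
    # Number of trailing zero bits in the mask.
    shift = (mask & -mask).bit_length() - 1
    if shift % 8 != 0 or (mask >> shift) != 0xFF:
        raise ValueError("this sketch only handles byte-aligned 8-bit masks")
    return bpp // 8 - 1 - shift // 8


def component_order(masks: dict, bpp: int) -> list:
    """Return component names sorted by their position in memory."""
    return sorted(masks, key=lambda name: byte_offset(masks[name], bpp))


# 24 bpp, red in the most significant byte: R, G, B in memory.
rgb = {"red": 0xFF0000, "green": 0x00FF00, "blue": 0x0000FF}
# Same depth with the masks swapped: B, G, R in memory.
bgr = {"red": 0x0000FF, "green": 0x00FF00, "blue": 0xFF0000}

print(component_order(rgb, 24))
print(component_order(bgr, 24))
```

With this convention the same masks describe the same memory layout on
every host, whatever its native byte order.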

Some things that I'm not 100% happy about, but where I can't think of a
better way:
* divx, xvid, mpeg-4 all have different mimetypes. This is wrong, since
an xvid decoder can (sort of...) decode divx, too. Currently, that
doesn't work, obviously, because the mimetypes don't match.
* No information on subsampling for YUV (JPEG/MPEG) is given.
* No separation between MJPEG-A/MJPEG-B in Quicktime (or lossless JPEG).
* The ADPCM way of distinguishing between different ways of packing is
pretty much stupid right now. We might want to add an integer for 'how
many samples are packed together per channel' or so, but the variants are
actually pretty much the same if you've got only one channel, afaik.
FFMPEG doesn't have a smart way of doing this either, I don't know...
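
The divx/xvid point above can be shown with a toy matcher in Python. This
is only a sketch of the problem, not how GStreamer negotiates caps: it
assumes linking succeeds only on an exact mimetype string match, and the
names video/x-divx and video/x-xvid are placeholders rather than the names
from the proposal in CVS.

```python
# Hypothetical mimetype names, used for illustration only.
XVID_DECODER_ACCEPTS = {"video/x-xvid"}


def can_link(stream_mimetype: str, accepted_mimetypes: set) -> bool:
    """Exact string match: there is no notion of 'close enough' codecs."""
    return stream_mimetype in accepted_mimetypes


# An xvid stream links to the xvid decoder as expected.
print(can_link("video/x-xvid", XVID_DECODER_ACCEPTS))
# A divx stream does not, even if the decoder could (sort of) handle it.
print(can_link("video/x-divx", XVID_DECODER_ACCEPTS))
```

The second call fails purely on naming, which is the part I'm not happy
about.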

Comments are appreciated. I'd like to take this as a starting point for
converting all plugins to this new stuff. This is a horrible job and
since it's taken so long already, I'll start doing this ASAP. After
that, I'll make an SGML/XML/whatever document to be included in the
documentation, too.

Ronald

-- 
Ronald Bultje <rbultje at ronald.bitfreak.net>