[gst-devel] metadata and charsets

Colin Walters walters at debian.org
Wed Feb 12 21:55:04 CET 2003


So,

A while back I raised the issue of GStreamer and character sets on IRC,
and someone on the core team (I think it was omega) thought that
GStreamer shouldn't be doing charset conversions.

I have been thinking about this again recently since I fairly often see
charset warnings from net-rhythmbox when trying to display metadata
received from GStreamer.

I think that GStreamer should canonicalize all metadata strings to
UTF-8.  This will greatly simplify things for application developers, who
won't have to do this poorly (as many of them will) or not at all (as
even more of them will).  

Purely from the standpoint of avoiding code duplication, this makes
sense.  But it makes even more sense when you consider dynamic
pipelines; the application might not even know ahead of time what
charset to expect for the metadata!  Suppose I construct a pipeline like:

gnomevfssrc iradio-mode=1 location=http://blaat.com ! spider ! osssink

Now, I could be receiving data in either mp3 or vorbis; I don't know
ahead of time.  From the mp3s I'd get metadata from the id3v2 tags,
which can be in several charsets (like ISO-8859-1, UTF-16, UTF-8).  From
vorbis I'd get UTF-8 data.  Especially in the mp3 case, I as an
application developer have no idea what charset to expect.  But since
the metadata is tagged with a charset in the id3v2 tag, GStreamer should
know.  
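
To make that concrete, here is a rough sketch of the kind of
canonicalization I mean.  It is written in Python purely for brevity and
is not GStreamer or glib code; the function name id3v2_text_to_utf8 is
made up for illustration.  It assumes the ID3v2.4 layout for text
frames, where the first byte of the frame body names the encoding
(ID3v2.3 only defines the values 0 and 1).

```python
# Encoding byte at the start of an ID3v2 text frame body (ID3v2.4 values).
ID3V2_ENCODINGS = {
    0: "iso-8859-1",
    1: "utf-16",     # UTF-16 with a byte order mark
    2: "utf-16-be",  # UTF-16BE without a byte order mark (ID3v2.4 only)
    3: "utf-8",      # ID3v2.4 only
}


def id3v2_text_to_utf8(frame_body: bytes) -> bytes:
    """Return the text of an ID3v2 text frame body re-encoded as UTF-8.

    Hypothetical helper: frame_body is the raw frame payload, i.e. one
    encoding byte followed by the encoded (possibly NUL-terminated) text.
    """
    if not frame_body:
        raise ValueError("empty frame body")

    codec = ID3V2_ENCODINGS.get(frame_body[0])
    if codec is None:
        raise ValueError("unknown ID3v2 text encoding byte: %d" % frame_body[0])

    # Decode using the charset the tag itself declares, then drop any
    # trailing NUL terminator before re-encoding.
    text = frame_body[1:].decode(codec)
    return text.rstrip("\x00").encode("utf-8")
```

With something like this inside the mp3 metadata path, the application
would see the same UTF-8 bytes for a title whether the tag stored it as
ISO-8859-1, UTF-16 or UTF-8.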

So I think GStreamer should convert everything to UTF-8.  GStreamer
already uses glib, which has a nice set of functions for this.  Any
opinions?




