[gst-devel] Text rendering

Maciej Katafiasz ml at mathrick.org
Sat Feb 19 15:02:23 CET 2005

On Wed, 16-02-2005 at 16:47 +0100, Gergely Nagy wrote:
> A few months ago, in last December, David Schleef wrote about the future
> of textoverlay
> (http://article.gmane.org/gmane.comp.video.gstreamer.devel/12264), which
> got me thinking.


> Anyway, for fancy effects (think not only in colors, but textures too),
> the approach outlined here will work fine, I guess.

In a word, no. It's not gonna work for subtitles at all (even simple
colouring of different lines in different colours will be difficult, and
individually coloured letters will be downright impossible). That's
because whilst the presented model is rather flexible when it works, it's
also extremely static (you'd need to replug the pipeline *each* time you
wanted a differently coloured line added, yuck!), and it makes complex
things even more complex, losing lots of info in the process.

What I want to see is *one* element (in practice that's all that should
be needed, but we can make it subclassable for really special needs)
supporting a range of basic, but rich, and most importantly, combinable
ops from which we can build effects. IOW, we want a (cairo) canvas, and
a protocol for manipulating its objects, which the various parsers and
other elements that want to generate text will use.

Mandatory ASCII-art:

+----------+                 +-----------+            +-----------+
|          | application/    |   Text    |  video/    |           |
| Subtitle +-----------------+ Renderer  +------------+   Image   |
|  Parser  | x-gst-subtitles | (cairo    |  x-raw-rgb |   mixer   |
|          |                 | overlay)  |  (ARGB)    |           |
+----------+                 +-----------+            +-----------+
+----------+                                          /
|          |                                         /
|  Video   |     video/x-raw-rgb                    /
| renderer +---------------------------------------/
|          |
+----------+

It outputs into imagemixer instead of blitting directly to video because
Gergely wants to have output in a non-blitted way, to have fun with his
toy elements afterwards :). This way we can keep most of his proposal,
that is, apply separate *video* effects, whilst having robust *text*
rendering.
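Spelled out in gst-launch syntax, the diagram could look something like
the pipeline below. Note this is purely illustrative: "textrender" and
"imagemixer" are placeholder element names for the proposed elements,
not ones that necessarily exist.

```
# Hypothetical pipeline matching the ASCII art above; "textrender" and
# "imagemixer" are placeholder names for the proposed elements.
gst-launch filesrc location=subs.ssa ! subparse \
    ! textrender ! imagemixer name=mix ! videosink \
  videotestsrc ! mix.
```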

Now, application/x-gst-subtitles is a protocol that would support:

- creating objects with unique ID
- manipulating (move, resize, rotate, colourise, maybe arbitrary
transformation) objects with given ID
- rendering (it should be operation separate from creation) objects
- destroying objects
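As a rough illustration of what such a protocol could look like, here is
a minimal sketch in plain Python (no GStreamer involved; the class and
method names are made up for this example). The key point is that
objects persist on the canvas between operations, and rendering is an
operation of its own, separate from creation:

```python
# Hypothetical model of the application/x-gst-subtitles protocol.
# In reality these commands would travel as serialised, timestamped
# buffers; here they are plain method calls on a canvas.

from dataclasses import dataclass

@dataclass
class TextObject:
    text: str
    x: float = 0.0
    y: float = 0.0
    scale: float = 1.0
    colour: int = 0xFFFFFFFF  # ARGB
    visible: bool = False     # rendering is a separate step from creation

class Canvas:
    """Holds text objects by unique ID; a renderer would draw the visible ones."""

    def __init__(self):
        self.objects = {}

    def create(self, oid, text):
        self.objects[oid] = TextObject(text)

    def move(self, oid, x, y):
        obj = self.objects[oid]
        obj.x, obj.y = x, y

    def colourise(self, oid, argb):
        self.objects[oid].colour = argb

    def render(self, oid):
        self.objects[oid].visible = True

    def destroy(self, oid):
        del self.objects[oid]
```

With persistent IDs, a parser can colour each line (or even each letter,
as its own object) independently, without ever replugging the pipeline.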

This is the only way we can have rich[1] text rendering support whilst
keeping it sane. I've come to this conclusion after studying a number of
subtitle formats, all of which are fundamentally broken -- they all
assume that creation, transformation and display of a text object is a
single step, which means they are limited to the operations and
combinations thereof which were anticipated -- in the SSA/ASS format,
for example, you can't make text change size and move along a path at
once, because those are separate markup tags which can't be combined.
You can hack around it (by creating and quickly destroying a gazillion
text objects which all have the same text, but are scaled and displayed
in slightly different positions to fake movement along the path), but
it's gross, and also limited. I want a model that works sanely every
time.
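To make the contrast concrete, here is a standalone sketch (plain
Python, hypothetical object layout, not a real GStreamer API) of what a
persistent text object buys you: one update function per frame can
combine operations -- move along a path AND scale -- that single-shot
markup tags like SSA/ASS's can't express together on one object:

```python
# One persistent object, updated per frame. Because transformation is
# separate from creation, arbitrary combinations of ops compose freely.

import math

def update(obj, t):
    """Move obj along a quarter-circle while growing it, for 0 <= t <= 1."""
    angle = t * math.pi / 2
    obj["x"] = 100.0 * math.cos(angle)   # path: quarter-circle of radius 100
    obj["y"] = 100.0 * math.sin(angle)
    obj["scale"] = 1.0 + t               # simultaneously grow from 1x to 2x
    return obj

title = {"text": "Hello", "x": 0.0, "y": 0.0, "scale": 1.0}
frames = [dict(update(title, i / 10)) for i in range(11)]
```

The same motion in SSA/ASS would need the create-and-destroy hack
described above: a fresh object per frame, each at a slightly different
position and size.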


[1] And by rich, I mean supporting all the operations found in the
various subtitle formats without creating a separate renderer for each
of them.

Maciej Katafiasz <ml at mathrick.org>
