[HarfBuzz] Using Harfbuzz & ICU & Glyphy in OpenES2 Engine

Fri Feb 7 23:01:49 CET 2014

Hey guys! I'm working on a Python3-based OpenGL library and I want to 
use the union of Harfbuzz and libicu to do some (a small subset) of what 
Pango currently does. I cannot, unfortunately, simply use Pango, as its 
dependency chain will likely stop my clients' interest dead in its 
tracks. Further, there are some issues when customizing Pango's 
rendering model (and I'm not speaking without experience here, I've done 
it when I wrote osgPango) that aren't ideal to work with.

So, I would like to use hb and icu to accomplish some very basic layout 
in my OpenGL scenes (note: I will also be using Glyphy instead of my own 
rasterized SDF font textures too). I will need basic left|right|center 
alignment (potentially justify, but it isn't a request yet), some basic 
markup support, etc.

I nearly have everything working locally, but the one area I'm having a 
lot of trouble with is knowing how--and in fact, WHOSE responsibility it 
is--to determine how to break large lines of text given my sizing 
constraints. This doesn't appear to be something harfbuzz attempts to 
do, but it may have helper functions nonetheless.

What I'm looking for are a few hints on how people (Behdad?) might 
tackle this problem. ALL text in this library gets converted to 
whitespace-normalized UTF-8 using libicu and line breaks can only be 
FORCED by using <p/> or <br/> markup. Otherwise, all breaks should 
behave similarly to how they do in HTML.

If I feed my force-delimited lines of text (that is, break my input feed 
up by <br/> or <p/>) one huge chunk at a time to harfbuzz, I can get the 
extents for each glyph as if I had unlimited X coordinate space. I can 
use these extents to position as required by the calling function, but 
again, I'm having trouble determining where it is safe to break.
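
For illustration, a greedy wrap over that data could look something like 
the sketch below. It assumes two inputs that are not spelled out above: 
a list of per-glyph x advances in logical order, and a set of glyph 
indices after which a break is allowed (for example, derived from an ICU 
line-break iterator). The name greedy_wrap and both parameters are 
hypothetical, and the sketch ignores bidi text, trailing-whitespace 
trimming, and the fact that shaping may change once a line is split, so 
each resulting line may need to be shaped again.

```python
def greedy_wrap(advances, break_after, max_width):
    """Split glyph indices into lines that try to stay within max_width.

    advances    -- per-glyph x advances in logical order (hypothetical input)
    break_after -- set of glyph indices after which a break is allowed
    max_width   -- available line width, in the same units as advances

    Returns a list of (start, end) half-open glyph index ranges.
    """
    lines = []
    start = 0          # first glyph of the current line
    width = 0.0        # accumulated width of the current line
    last_break = None  # most recent allowed break on the current line
    i = 0
    while i < len(advances):
        width += advances[i]
        if width > max_width and last_break is not None:
            # Overflow: close the line at the last allowed break,
            # then re-scan from the glyph right after it.
            lines.append((start, last_break + 1))
            start = last_break + 1
            i = start
            width = 0.0
            last_break = None
            continue
        # With no break available yet, the line is allowed to overflow.
        if i in break_after:
            last_break = i
        i += 1
    if start < len(advances):
        lines.append((start, len(advances)))
    return lines
```

With ten glyphs of advance 10, breaks allowed after glyphs 2 and 5, and 
a width of 45, this yields the ranges (0, 3), (3, 6) and (6, 10).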

Is this something libicu can handle? Can harfbuzz make it easier?

NOTE: When I "whitespace normalize" my string before ever passing it 
to harfbuzz, I use the UBRK API of libicu. There are also functions like:

     u_isspace()

...which look promising, but they expect a UChar32, which I do not know 
how to easily fetch while using harfbuzz's UTF8 functions, which operate 
on potentially multibyte chars.
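
One way to bridge the two, sketched here in Python since that is what 
the library is written in: decode the UTF-8 once and record the byte 
offset at which each code point starts. As far as I understand, a buffer 
filled with hb_buffer_add_utf8() reports cluster values as byte offsets 
into the input by default, so these offsets should line up with the 
glyph clusters. The helper names are hypothetical, and str.isspace() is 
only a stand-in for u_isspace(); the two do not classify every character 
identically. In C, ICU's U8_NEXT macro does the equivalent decoding.

```python
def codepoints_with_offsets(utf8_bytes):
    """Return (byte_offset, code_point) pairs for a UTF-8 byte string."""
    text = utf8_bytes.decode("utf-8")
    pairs = []
    offset = 0
    for ch in text:
        pairs.append((offset, ord(ch)))
        # Advance by the encoded length of this character (1 to 4 bytes).
        offset += len(ch.encode("utf-8"))
    return pairs


def whitespace_offsets(utf8_bytes):
    """Byte offsets of code points that Python considers whitespace.

    str.isspace() is an approximation of ICU's u_isspace(), not a
    drop-in replacement.
    """
    return [off for off, cp in codepoints_with_offsets(utf8_bytes)
            if chr(cp).isspace()]
```

For the bytes of "héllo wörld", the only whitespace offset is 6, because 
"é" occupies two bytes.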

Thanks in advance!