[HarfBuzz] Shaping different scripts in the same text run

Khaled Hosny khaledhosny at eglug.org
Sun Apr 7 04:55:27 PDT 2013


On Sun, Apr 07, 2013 at 02:59:32AM +0200, Lóránt Pintér wrote:
> Hi,  
> 
> I'm struggling with the problem of shaping mixed text. Say I have Thai
> and English text that I would like to shape. If I put all of it in a
> buffer, HarfBuzz chooses a shaper based on the first identifiable
> character, and then uses that shaper for the whole text. So
> "<thai><english>" gets shaped fine with the Thai shaper, but
> "<english><thai>" gets messed up because it is shaped with the default
> shaper.
> 
> I was trying to figure out how Pango does this, but found nothing yet.
> 
> Is it possible to ask HarfBuzz to identify text runs inside a buffer
> (or some other way) that can be shaped with different shapers? If
> there was a call that would identify the script (and maybe writing
> direction) of each character in the input, then I could split the
> buffer at positions where a different script is used.

You have to split the text into runs before passing them to HarfBuzz;
each run should have a single script, language, and text direction.

Ideally the text should first be itemized into runs of the same script,
and those runs then split further into directional runs according to the
Unicode BiDi algorithm.
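
As a rough illustration of the script itemization step only (no BiDi, no
font handling), here is a minimal Python sketch. The script_of() and
itemize() helpers are hypothetical names, not HarfBuzz or Pango API, and
the script table is an assumption reduced to two hard-coded ranges; a real
itemizer would use the full Unicode Script property. Characters such as
spaces and punctuation are simply attached to the preceding run.

```python
from itertools import groupby


def script_of(ch):
    """Hypothetical, heavily simplified script lookup (assumption: only
    Thai and ASCII Latin are recognised; everything else is "Common")."""
    cp = ord(ch)
    if 0x0E00 <= cp <= 0x0E7F:          # Thai block
        return "Thai"
    if ch.isascii() and ch.isalpha():   # basic Latin letters
        return "Latin"
    return "Common"                     # spaces, digits, punctuation, ...


def itemize(text):
    """Split text into (script, substring) runs of a single script."""
    resolved = []
    current = None
    for ch in text:
        script = script_of(ch)
        if script == "Common":
            script = current            # inherit from the preceding character
        else:
            current = script
        resolved.append(script)

    # Leading "Common" characters take the script of the first real one.
    first = next((s for s in resolved if s is not None), "Common")
    resolved = [first if s is None else s for s in resolved]

    runs = []
    pos = 0
    for script, group in groupby(resolved):
        length = len(list(group))
        runs.append((script, text[pos:pos + length]))
        pos += length
    return runs


if __name__ == "__main__":
    for script, run in itemize("Hello \u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35 world"):
        print(script, repr(run))
```

Each resulting run would then go into its own buffer, with the script,
language and direction set explicitly (hb_buffer_set_script(),
hb_buffer_set_language(), hb_buffer_set_direction()) before calling
hb_shape(), rather than letting HarfBuzz guess from the first character.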

There are of course more subtleties involved, for example when multiple
fonts are used.

Regards,
Khaled


