[HarfBuzz] Clarifying graphemes, HB-cluster, carets and string manipulation

Diederick Huijbers diederickh at gmail.com
Fri Jan 23 13:45:19 PST 2015


Hi,

Recently I started working on a text input field for OpenGL that uses FreeType
for rasterising and HarfBuzz for shaping. Currently I'm trying to figure out how
to find the position of the caret based on the current grapheme, and how string
manipulation is supposed to work in relation to a text field and
graphemes/glyphs.

*Caret position*
In my previous post on this list someone pointed me to this URL, which describes
something similar. For clarity I'll add the link here:

        https://codereview.chromium.org/130433006#msg8

If I'm correct, the `cluster` member of the `hb_glyph_info_t` struct tells me
the byte offset in the UTF-8 input at which the glyph's text starts, or at least
that's my understanding. For example, when I have the following string, looping
over the glyph infos gives me:

*String:* 綧緁緅 襏襆贂 峷敊浭


HB: 00: codepoint: 7b11 cluster: 00, advance_x: 48, numbytes: 29
HB: 01: codepoint: 7b49 cluster: 03, advance_x: 48, numbytes: 29
HB: 02: codepoint: 7b54 cluster: 06, advance_x: 48, numbytes: 29
HB: 03: codepoint: 0001 cluster: 09, advance_x: 10, numbytes: 29
HB: 04: codepoint: 924a cluster: 10, advance_x: 48, numbytes: 29
HB: 05: codepoint: 923c cluster: 13, advance_x: 48, numbytes: 29
HB: 06: codepoint: 98db cluster: 16, advance_x: 48, numbytes: 29
HB: 07: codepoint: 0001 cluster: 19, advance_x: 10, numbytes: 29
HB: 08: codepoint: 40cc cluster: 20, advance_x: 48, numbytes: 29
HB: 09: codepoint: 4f6d cluster: 23, advance_x: 48, numbytes: 29
HB: 10: codepoint: 5d63 cluster: 26, advance_x: 48, numbytes: 29
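
To make the caret question concrete, here is a minimal sketch of how a caret x position could be derived from a dump like the one above. It assumes left-to-right text, a caret that sits on a cluster boundary, and cluster values that are UTF-8 byte offsets. `ShapedGlyph` and `caret_x` are hypothetical names for illustration, not HarfBuzz types or functions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fields read from hb_glyph_info_t and
// hb_glyph_position_t; this is not a HarfBuzz type.
struct ShapedGlyph {
    std::uint32_t cluster;   // byte offset into the UTF-8 input
    std::int32_t  advance_x; // horizontal advance, same units as the dump
};

// Sum the advances of all glyphs whose cluster starts before the caret.
// Assumption: left-to-right text, caret on a cluster boundary.
std::int32_t caret_x(const std::vector<ShapedGlyph>& glyphs,
                     std::uint32_t caret_byte) {
    std::int32_t x = 0;
    for (const ShapedGlyph& g : glyphs) {
        if (g.cluster < caret_byte) {
            x += g.advance_x;
        }
    }
    return x;
}
```

With the values from the dump, a caret at byte offset 9 (just after the third ideograph) would land at 48 + 48 + 48 = 144.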


When I iterate over the same string using ICU, with the code pasted in the link
below, I get:

Boundary at position: 0
Boundary at position: 1
Boundary at position: 2
Boundary at position: 3
Boundary at position: 4
Boundary at position: 5
Boundary at position: 6
Boundary at position: 7
Boundary at position: 8
Boundary at position: 9
Boundary at position: 10
Boundary at position: 11

https://gist.github.com/roxlu/7d73f8928e7e8489ae65


This seems to be a 1:1 match (one boundary per glyph, plus one at the end of the
string), but my biggest question is how I can map the ICU boundaries to the
correct HarfBuzz buffer clusters?
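
One possible way to line the two up, as a minimal sketch: it assumes the buffer was filled with `hb_buffer_add_utf8` (so cluster values are UTF-8 byte offsets) and that the ICU positions are UTF-16 code-unit indices, as they are when a `BreakIterator` runs over a `UnicodeString`. The setup in the gist may differ. The helper name is hypothetical, and the input is assumed to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build a table: table[k] is the UTF-8 byte offset of UTF-16 code-unit
// index k, with one extra entry for the end of the string.
// Assumption: `utf8` is well-formed UTF-8.
std::vector<std::size_t> utf16_index_to_utf8_offset(const std::string& utf8) {
    std::vector<std::size_t> offsets;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        // Sequence length from the lead byte.
        std::size_t len = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
        offsets.push_back(i);
        if (len == 4) {
            // A 4-byte sequence is a surrogate pair in UTF-16, so it takes
            // two indices. The second one is never a valid boundary; it is
            // only here to keep the table aligned.
            offsets.push_back(i);
        }
        i += len;
    }
    offsets.push_back(utf8.size()); // boundary at the end of the text
    return offsets;
}
```

For the string above this yields 0, 3, 6, 9, 10, 13, 16, 19, 20, 23, 26, 29, which lines up with the cluster values in the dump, with 29 marking the end of the string. An ICU boundary at position k would then correspond to the glyph whose `cluster` equals `table[k]`.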

*String manipulation:*
When I want the user to manipulate the text inside the input field, e.g. with
the delete and backspace keys, should I manipulate the graphemes, the UTF-8
code points, or maybe something else?
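
To illustrate the grapheme option, here is a minimal sketch of a backspace that edits the UTF-8 string at grapheme boundaries. It assumes the boundaries have already been converted to sorted UTF-8 byte offsets (including 0 and the end of the text). The function name is hypothetical, and this is only one of the possible behaviours, not necessarily the recommended one.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Erase the grapheme that ends at `caret` and return the new caret offset.
// Assumption: `boundaries` holds sorted UTF-8 byte offsets of grapheme
// boundaries, and `caret` is one of them.
std::size_t erase_before_caret(std::string& text,
                               const std::vector<std::size_t>& boundaries,
                               std::size_t caret) {
    // First boundary >= caret; the entry before it starts the grapheme
    // that backspace should remove.
    auto it = std::lower_bound(boundaries.begin(), boundaries.end(), caret);
    if (it == boundaries.begin()) {
        return caret; // nothing in front of the caret
    }
    std::size_t start = *(it - 1);
    text.erase(start, caret - start);
    return start;
}
```

After an edit like this the boundary list is stale, so the boundaries would need to be recomputed and the text reshaped before the next keystroke.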

Thanks
d