[HarfBuzz] Setting initial cluster value

Kelvin Ma kelvinsthirteen at gmail.com
Sat Jun 25 20:54:44 UTC 2016


@behdad can I replace the object-chars with some obscure Unicode character,
and just have HarfBuzz ignore that character for contextual feature
purposes, but preserve the cluster values?
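
A minimal sketch of the index bookkeeping this would rely on, in Python and
under stated assumptions: every text item is a single code point, the keys
FONT_POSITIVE and FONT_NEGATIVE are stand-in strings (the real Knockout
values are not shown in this thread), and U+2060 WORD JOINER is an arbitrary
choice of placeholder that Unicode lists as default-ignorable. Whether
HarfBuzz actually skips such a code point for contextual lookups is the open
question here; the sketch only shows that a one-for-one substitution keeps
every cluster value equal to the item's index in the original list, and that
runs can be recorded as (start, length) pairs without splitting the text.

```python
# Stand-in keys; the real Knockout constants are an assumption here.
FONT_POSITIVE = "font+"
FONT_NEGATIVE = "font-"

# Hypothetical placeholder: U+2060 WORD JOINER (default-ignorable in Unicode).
PLACEHOLDER = "\u2060"


def flatten(items):
    """Build one paragraph string with exactly one code point per item,
    so string index == list index == cluster value."""
    return "".join(it if isinstance(it, str) else PLACEHOLDER for it in items)


def itemise(items):
    """Return runs as (start, length, active_styles) without splitting text.
    A style object opens a new run at its own index."""
    runs = []
    active = set()
    start = 0
    for i, it in enumerate(items):
        if isinstance(it, dict):
            if i > start:
                # Close the previous run with the styles that applied to it.
                runs.append((start, i - start, frozenset(active)))
            start = i
            if FONT_POSITIVE in it:
                active.add(it[FONT_POSITIVE])
            if FONT_NEGATIVE in it:
                active.discard(it[FONT_NEGATIVE])
    if len(items) > start:
        runs.append((start, len(items) - start, frozenset(active)))
    return runs


items = ['t', 'h', 'i', 's', ' ', {FONT_POSITIVE: 'bold'}, 'i', 's', ' ',
         'b', 'o', 'l', 'd', 'e', 'd', {FONT_NEGATIVE: 'bold'}, ' ',
         't', 'e', 'x', 't']

text = flatten(items)   # full paragraph, usable as shaping context
runs = itemise(items)   # per-font slices, expressed as offsets into `text`

for start, length, styles in runs:
    print(start, length, sorted(styles))
```

Each run can then be shaped as a slice of the full paragraph string, so the
neighbouring text stays available as context and no string concatenation is
needed per shaping call.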

On Sat, Jun 25, 2016 at 4:52 PM, Kelvin Ma <kelvinsthirteen at gmail.com>
wrote:

> It’s only a character in a conceptual sense, the text is really a list of
> chars and objects like
>
> ['t', 'h', 'i', 's', ' ', {FONT_POSITIVE: 'bold'}, 'i', 's', ' ', 'b',
> 'o', 'l', 'd', 'e', 'd', {FONT_NEGATIVE: 'bold'}, ' ', 't', 'e', 'x', 't']
>
> These types of characters are all over a Knockout document; they can
> create fractions, formulas, radicals, page numbers, etc. (They can also be
> nested but let’s not get into that). They can take up horizontal space and
> they get set into the line just like any HarfBuzz glyph output, though the
> fontstyle chars have a width of zero by default (but they can be set to
> have width to do useful stuff like italic correction).
>
> [image: Inline image 1]
>
> ^ See the small pink triangles? Those are the font characters; they can be
> selected, typed, and deleted just like any other character. (The square
> root is also another object-character, but one that takes up horizontal
> space.)
>
> I may be able to replace all the object-chars with spaces to make a big
> paragraph string that could be passed into the shaper on an index basis,
> though the mere presence of a space character would probably ruin the
> cross-run Arabic shaping. Alternatively I could just strip the
> object-chars, but that would destroy the cluster values, which would make
> editing impossible. 🙃
>
> On Sat, Jun 25, 2016 at 4:27 PM, Behdad Esfahbod <
> behdad.esfahbod at gmail.com> wrote:
>
>> On Jun 25, 2016 12:33 PM, <kelvinsthirteen at gmail.com> wrote:
>> >
>> >
>> >
>> > > On Jun 25, 2016, at 1:39 PM, Khaled Hosny
>> > > <khaledhosny at eglug.org> wrote:
>> > >
>> > > On Sat, Jun 25, 2016 at 01:06:27PM -0400, Kelvin Ma wrote:
>> > >>>>>> Don’t you need context to be ignored if the boundaries of the
>> > >>>>>> text you want to shape fall inside a cluster? Like in the string
>> > >>>>>> 'af[fluency s]tate' where only the 'fluency s' is supposed to be
>> > >>>>>> shaped?
>> > >>>>>
>> > >>>>> Depends on why you are shaping “fluency s” alone, if it is
>> > >>>>> because of, say, font change, then you need HarfBuzz to know the
>> > >>>>> context otherwise you get broken Arabic shaping.
>> > >>>>
>> > >>>> Well font change would produce a separate run that wouldn’t know
>> > >>>> about the other runs so context can only be within a
>> > >>>> same-direction, same-font run.
>> > >>>
>> > >>> This is wrong, font change shouldn’t break Arabic shaping, so you
>> > >>> have to pass the context even in this case.
>> > >>>
>> > >>
>> > >> If the text consists of text strings separated by formatting objects,
>> > >> each text string doesn’t know about what’s around it, because that’s
>> > >> at a much higher level in the code and HarfBuzz can only handle a
>> > >> single font in a single run at a time. To artificially jam in the
>> > >> neighboring runs for each shaping attempt would involve an inordinate
>> > >> amount of string concatenation and searching on the fly.
>> > >
>> > > One can always fix his code to not make wrong assumptions. When doing
>> > > text layout you always need the full paragraph, and you should have
>> > > it around after itemisation. Itemisation does not have to be done by
>> > > splitting text; you can just store run start indices and lengths.
>> >
>> > No, meaning font styling is created by inline styling objects. They’re
>> > like inline images except they have zero width. So a font change is
>> > really stored as a special character in between the two sections. This
>> > character is not understood by harfbuzz, which is why it does not make
>> > sense to pass anything containing it into the shaper.
>>
>> That's your design's limitation.  You can still fix it by using custom
>> Unicode funcs with HarfBuzz that return a "default-ignorable" Unicode
>> property for your placeholder codepoints.  I just checked and it wouldn't
>> work right now; I'll fix that.  What placeholder character do you use?  Can
>> you change that?
>> >
>> > >
>> > > Regards,
>> > > Khaled
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Selection_049.png
Type: image/png
Size: 17662 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/harfbuzz/attachments/20160625/131b76da/attachment-0001.png>