[HarfBuzz] Fwd: harfbuzz work
Jonathan Kew
jonathan at jfkew.plus.com
Wed Jul 15 10:30:00 PDT 2009
This was originally written to Behdad, but copying the HB mailing list
as it may be of interest to others. Feedback welcome. :)
JK
Begin forwarded message:
> From: Jonathan Kew <jonathan at jfkew.plus.com>
> Date: 24 June 2009 19:18:33 BST
> To: Behdad Esfahbod <behdad at behdad.org>
> Subject: harfbuzz work
>
> Hi Behdad,
>
> FYI, I'm attaching some experimentation I've been doing with
> HarfBuzz. This is based on your harfbuzz-ng from *before* the most
> recent commit ("XX") to that branch, as it appeared to be in a
> somewhat broken (or should I say partially-updated) state there.
>
> The zip file contains new stuff I've been writing, working towards a
> HarfBuzz-based module we could use in Gecko, without relying on
> anything else in Pango. There are also a few modifications to your
> code in pango/opentype, attached as a separate diff file.
>
> What I've done here - some of which you may want to take into
> HarfBuzz itself, unless you already have better solutions:
>
> * Alternate layout constructor taking pointers to the OpenType
> tables; I'm using this on OS X at the moment as it's the most
> convenient way to provide the font data. We won't always have an
> actual file available for the mmap() approach, though of course
> that's ideal when we can use it.
>
> * In hb-buffer, made hb_buffer_ensure() public as it could be useful
> for client code to preallocate space, if it knows how much text is
> coming; also gave hb_buffer_new() a size parameter so that the
> caller can ask for an initial allocation size.
>
> * More importantly, I think hb_buffer_ensure() had a bug in the case
> where out_string == in_string; it was realloc'ing in_string before
> checking whether the pointers were the same, which means the
> in_string pointer is likely to have been changed and the wrong
> branch will be chosen. I think this is fixed correctly in the
> attached patch.
>
> * Provided a small HB-friendly cmap-reader (currently handles
> formats 4 and 12 only).
>
> * A script-run itemizer based on ICU's, but adapted to support text
> in any of UTF-8, UTF-16, or UTF-32 (not actually tested with them
> all yet, though).
>
> * Code to look up the Unicode character properties we're likely to
> need; currently script, bidi direction, and Arabic joining type.
> This can be retrieved from the ICU property APIs, if the client is
> using ICU anyway, or there's a local implementation supporting just
> the properties needed in the layout process. Actually, as we don't
> do bidi within HarfBuzz, I'm not sure we need that property; on the
> other hand, we may need character types (combining marks, etc) for
> cluster handling - I haven't looked into that yet.
>
> * Proposed shaping-function API (see hb-shaper.h) and two shaper
> implementations (generic and Arabic/Syriac/N'Ko). These support
> user-specified features in addition to the defaults and
> script-specific shaping features. Oh, they also handle mirroring
> using the OMPL
> table, and apply ltra/rtla etc according to direction.
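The hb_buffer_ensure() aliasing problem described in the list above can be sketched as follows. This is a minimal illustration, not the code from the attached patch: demo_buffer_t and demo_buffer_ensure() are hypothetical names, and the struct is a simplified stand-in rather than the real HarfBuzz buffer layout. The point it shows is that the "do the two pointers alias?" test has to be evaluated before realloc() is called, because realloc() may move the block and change in_string.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified buffer; NOT the real hb_buffer_t layout. */
typedef struct {
    unsigned int *in_string;   /* input items */
    unsigned int *out_string;  /* output items; may alias in_string */
    unsigned int  allocated;   /* capacity of each array, in items */
} demo_buffer_t;

/* Grow the buffer so it can hold at least `size` items.
 * Returns 0 on success, -1 on failure (buffer left usable). */
int demo_buffer_ensure(demo_buffer_t *b, unsigned int size)
{
    unsigned int new_allocated = b->allocated ? b->allocated : 16;
    unsigned int *new_in, *new_out;
    int shared;

    if (size <= b->allocated)
        return 0;
    if (size > 0x10000000u)    /* arbitrary cap so the doubling cannot overflow */
        return -1;
    while (new_allocated < size)
        new_allocated *= 2;

    /* Record the aliasing state BEFORE realloc() can move in_string.
     * Comparing the pointers afterwards would see the new in_string
     * and the stale out_string, and take the wrong branch. */
    shared = (b->out_string == b->in_string);

    new_in = realloc(b->in_string, new_allocated * sizeof *new_in);
    if (!new_in)
        return -1;
    b->in_string = new_in;

    if (shared) {
        /* One allocation serves both roles; keep the pointers in sync. */
        b->out_string = new_in;
    } else {
        new_out = realloc(b->out_string, new_allocated * sizeof *new_out);
        if (!new_out)
            return -1;
        b->out_string = new_out;
    }

    b->allocated = new_allocated;
    return 0;
}
```

In this sketch an empty buffer (both pointers NULL) counts as shared, so the first call produces a single allocation used for both input and output.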
>
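For readers unfamiliar with the cmap formats mentioned in the list above, here is a small sketch of a format 12 lookup. It is not the reader from the attached zip; demo_cmap12_lookup() and the helper names are hypothetical. It assumes the standard OpenType format 12 layout (a 16-byte header followed by sorted 12-byte groups of startCharCode, endCharCode, startGlyphID, all big-endian) and does a binary search over the groups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian readers; OpenType tables are stored big-endian. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Map a code point through a cmap format 12 subtable.
 * `table` points at the subtable, `length` is the number of bytes available.
 * Returns the glyph ID, or 0 (.notdef) if unmapped or the data is malformed. */
uint32_t demo_cmap12_lookup(const uint8_t *table, size_t length, uint32_t ch)
{
    uint32_t num_groups, lo, hi;

    /* Header: format(2) reserved(2) length(4) language(4) numGroups(4). */
    if (length < 16 || rd16(table) != 12)
        return 0;
    num_groups = rd32(table + 12);
    if (num_groups > (length - 16) / 12)
        return 0;              /* groups would run past the available data */

    lo = 0;
    hi = num_groups;
    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        const uint8_t *g = table + 16 + (size_t)mid * 12;
        uint32_t start = rd32(g);
        uint32_t end = rd32(g + 4);

        if (ch < start)
            hi = mid;
        else if (ch > end)
            lo = mid + 1;
        else
            return rd32(g + 8) + (ch - start);  /* glyphs are sequential in a group */
    }
    return 0;
}
```

A format 4 lookup follows the same idea but uses 16-bit segment arrays with idDelta and idRangeOffset, so it is not shown here.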
> In the shaper API that I'm using right now, the approach is to
> initially fill the buffer with *character* codes, and the shaper
> function takes a pointer to a cmap table in addition to the layout
> record. I did this because shaping needs access to the Unicode
> values, not just the glyphs. I suppose we could specify that the
> cmap table can be NULL, in which case the buffer is assumed to
> contain glyph IDs already, but this will make most complex-script
> shaping impossible. (Actually, it's a problem even for the generic
> shaper, as it needs the Unicode character codes for mirroring.)
>
> Assuming we use this model of making the shaper be responsible for
> mapping Unicode to glyphs, should the cmap table be incorporated
> into the layout record just like GDEF/GSUB/GPOS? I did it separately
> for now just to minimize disruption to your opentype files, but
> there's not much reason to keep it separate IMO.
>
> One outstanding issue is passing parameters to features like
> 'aalt' (alternate substitution lookups). I see you have a
> "placeholder" for a callback function in
> AlternateSubstFormat1::apply, but this doesn't look quite sufficient
> AFAICT. In order to return the proper index, the function would need
> to know which feature is currently being processed, which is
> information that is not available at this level of applying the
> lookup. (Note that it would be possible for a run of text to have
> several Alternate features applied, with different indexes used for
> each of them.)
>
> I'm wondering whether it would be feasible to use the "mask"
> parameter to hb_ot_layout_{substitute,position}_lookup to help here.
> This is used to selectively switch lookups off for certain glyphs in
> the buffer, in order to implement things like Arabic shaping, but if
> we could assume that the shapers should never need more than 24 bits
> for this purpose (will a shaper ever need individual control of more
> than 24 distinct features or sets of features?), then we could also use the
> low byte of the mask to pass a "feature argument" through to the
> lookups. Currently, the mask is not passed all the way down to the
> individual subtable apply() functions, so this would need to be
> done, but I don't think that would be hard, and it would allow a
> specific alternate index associated with a feature to be passed on
> to that feature's lookup(s) and used to choose the right alternate.
> What do you think - should I give this a try and see how it works in
> practice?
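The mask idea in the paragraph above could look roughly like this. It is only a sketch under stated assumptions: the mask is taken to be 32 bits wide, all demo_* names are hypothetical and not part of HarfBuzz, and treating the argument as a 1-based alternate index (with 0 meaning "no alternate requested") is a convention chosen for the example, not something the proposal specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a 32-bit mask: upper 24 bits select features per glyph,
 * low 8 bits carry a per-feature argument. */
#define DEMO_ARG_MASK   0x000000FFu
#define DEMO_FLAG_MASK  0xFFFFFF00u

/* Combine feature-selection bits with a feature argument. */
uint32_t demo_mask_pack(uint32_t feature_flags, unsigned int arg)
{
    return (feature_flags & DEMO_FLAG_MASK) | (arg & DEMO_ARG_MASK);
}

unsigned int demo_mask_arg(uint32_t mask)
{
    return mask & DEMO_ARG_MASK;
}

uint32_t demo_mask_flags(uint32_t mask)
{
    return mask & DEMO_FLAG_MASK;
}

/* A lookup applies to a glyph when their feature-selection bits intersect;
 * the argument byte takes no part in this test. */
int demo_lookup_applies(uint32_t lookup_mask, uint32_t glyph_mask)
{
    return (demo_mask_flags(lookup_mask) & demo_mask_flags(glyph_mask)) != 0;
}

/* What an alternate-substitution apply step might do once the mask reaches it.
 * Assumed convention: arg is a 1-based index into the alternates;
 * 0 or an out-of-range value leaves the original glyph in place. */
unsigned int demo_choose_alternate(const unsigned int *alternates,
                                   unsigned int count,
                                   unsigned int original,
                                   uint32_t lookup_mask)
{
    unsigned int arg = demo_mask_arg(lookup_mask);

    if (arg == 0 || arg > count)
        return original;
    return alternates[arg - 1];
}
```

Because the argument travels with the mask used for one feature's lookups, two Alternate features applied to the same run can each carry a different index, which is the case the callback placeholder cannot distinguish.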
>
> Regards,
>
> Jonathan
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: harfbuzz-changes.diff
Type: application/octet-stream
Size: 6171 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/harfbuzz/attachments/20090715/59732370/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hb_test.zip
Type: application/zip
Size: 59688 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/harfbuzz/attachments/20090715/59732370/attachment.zip>