[HarfBuzz] Why harfbuzz isn't/couldn't/shouldn't provide separate [optional] API for glyph/positioning?

Behdad Esfahbod behdad at behdad.org
Mon Feb 26 02:28:49 UTC 2018


Hi Ebrahim,

On Sat, Feb 24, 2018 at 11:33 AM, Ebrahim Byagowi <ebraminio at gmail.com>
wrote:

> About why "isn't", I guess harfbuzz has developed before DirectWrite,
>

That's not the reason. Uniscribe API also had the separation. Initially I
had wanted to allow it, but eventually didn't. Read on.



> but I like to know if a separate API for substitution and positioning a
> possibility? Or, is accepting glyphs instead on input [and later as
> an optimization, hb_shape without positioning] a possibility? Have a look
> at hb-directwrite's GetGlyphs
> <https://github.com/harfbuzz/harfbuzz/blob/21646cc4a6160088933774e179df9be4865a9f4b/src/hb-directwrite.cc#L672>
>  and GetGlyphPlacements
> <https://github.com/harfbuzz/harfbuzz/blob/21646cc4a6160088933774e179df9be4865a9f4b/src/hb-directwrite.cc#L717>
> .
>

If you look at the Uniscribe APIs for Shape & Place:

HRESULT ScriptShapeOpenType(
  _In_opt_       HDC                  hdc,
  _Inout_        SCRIPT_CACHE         *psc,
  _Inout_        SCRIPT_ANALYSIS      *psa,
  _In_           OPENTYPE_TAG         tagScript,
  _In_           OPENTYPE_TAG         tagLangSys,
  _In_opt_       int                  *rcRangeChars,
  _In_opt_       TEXTRANGE_PROPERTIES **rpRangeProperties,
  _In_           int                  cRanges,
  _In_     const WCHAR                *pwcChars,
  _In_           int                  cChars,
  _In_           int                  cMaxGlyphs,
  _Out_          WORD                 *pwLogClust,
  _Out_          SCRIPT_CHARPROP      *pCharProps,
  _Out_          WORD                 *pwOutGlyphs,
  _Out_          SCRIPT_GLYPHPROP     *pOutGlyphProps,
  _Out_          int                  *pcGlyphs
);


HRESULT ScriptPlaceOpenType(
  _In_opt_        HDC                  hdc,
  _Inout_         SCRIPT_CACHE         *psc,
  _Inout_         SCRIPT_ANALYSIS      *psa,
  _In_            OPENTYPE_TAG         tagScript,
  _In_            OPENTYPE_TAG         tagLangSys,
  _In_opt_        int                  *rcRangeChars,
  _In_opt_        TEXTRANGE_PROPERTIES **rpRangeProperties,
  _In_            int                  cRanges,
  _In_      const WCHAR                *pwcChars,
  _In_            WORD                 *pwLogClust,
  _In_            SCRIPT_CHARPROP      *pCharProps,
  _In_            int                  cChars,
  _In_      const WORD                 *pwGlyphs,
  _In_      const SCRIPT_GLYPHPROP     *pGlyphProps,
  _In_            int                  cGlyphs,
  _Out_           int                  *piAdvance,
  _Out_           GOFFSET              *pGoffset,
  _Out_opt_       ABC                  *pABC
);

Two things stand out:

  - There's a lot of duplicate info going into both calls,

  - There's also a lot of data coming out of the first call just to go
directly into the second; namely pCharProps and pGlyphProps.

Those two observations strongly suggest that the two calls are parts of
the same larger operation that have been rather forcibly separated.

We can do the same separation in HarfBuzz. We also have lots of data
that should come out of the first call and go into the second call to
make that possible.  Some of that even matches the data Uniscribe is
passing.  In our case, to reconstruct the buffer in the second call we
need the following buffer-internal info:

/* buffer var allocations, used during the entire shaping process */
#define unicode_props()         var2.u16[0]

/* buffer var allocations, used during the GSUB/GPOS processing */
#define glyph_props()           var1.u16[0] /* GDEF glyph properties */
#define lig_props()             var1.u8[2] /* GSUB/GPOS ligature tracking */
#define syllable()              var1.u8[3] /* GSUB/GPOS shaping boundaries */


syllable() is only used during shaping, so it is not needed for
positioning.  lig_props() is needed to correctly attach marks to
their ligature components. Uniscribe is presumably hiding that info
somewhere in those reserved bits it passes.  So it looks like we need
40 bits per glyph (16 for unicode_props, 16 for glyph_props, 8 for
lig_props) to be passed between the two calls to make this possible
without significant restructuring.
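
To make the 40-bit figure concrete, here is a minimal sketch of the
per-glyph state that would have to travel between the two calls.  The
type glyph_carry_t, the pack/unpack helpers and the bit layout are all
hypothetical and exist only for illustration; they are not part of
HarfBuzz, and the real buffer-internal layout may differ and change.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the per-glyph state listed above.
 * Field names mirror the buffer macros; the layout is illustrative only. */
typedef struct {
  uint16_t unicode_props; /* 16 bits */
  uint16_t glyph_props;   /* 16 bits, GDEF glyph properties */
  uint8_t  lig_props;     /*  8 bits, ligature tracking */
} glyph_carry_t;

enum { GLYPH_CARRY_BITS = 16 + 16 + 8 }; /* 40 bits per glyph */

/* Pack the three fields into the low 40 bits of a 64-bit word. */
static uint64_t
glyph_carry_pack (glyph_carry_t c)
{
  return ((uint64_t) c.unicode_props << 24) |
         ((uint64_t) c.glyph_props   <<  8) |
          (uint64_t) c.lig_props;
}

/* Recover the three fields; bits above the low 40 must be zero. */
static glyph_carry_t
glyph_carry_unpack (uint64_t bits)
{
  glyph_carry_t c;
  assert ((bits >> GLYPH_CARRY_BITS) == 0);
  c.unicode_props = (uint16_t) (bits >> 24);
  c.glyph_props   = (uint16_t) (bits >> 8);
  c.lig_props     = (uint8_t) bits;
  return c;
}
```

Whatever the layout, a caller sitting between the two calls would have
to store and return these 40 bits per glyph unmodified.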

I mean, sure, I can split hb_ot_shape() into two calls as long as you
take the buffer from the first and pass it straight to the second. But
to funnel that buffer through the Uniscribe API boundary, we need to
pass those 40 bits somewhere in the Uniscribe structs:

typedef struct script_charprop {
  WORD fCanGlyphAlone  :1;
  WORD reserved  :15;
} SCRIPT_CHARPROP;

This one is per character, while we work mostly per glyph, so it may
or may not be useful. Interesting how fCanGlyphAlone is similar to our
unsafe_to_break, but modeled differently.

typedef struct script_glyphprop {
  SCRIPT_VISATTR sva;
  WORD           reserved;
} SCRIPT_GLYPHPROP;
typedef struct tag_SCRIPT_VISATTR {
  WORD uJustification  :4;
  WORD fClusterStart  :1;
  WORD fDiacritic  :1;
  WORD fZeroWidth  :1;
  WORD fReserved  :1;
  WORD fShapeReserved  :8;
} SCRIPT_VISATTR;

SCRIPT_GLYPHPROP is unique to the OpenType() flavor of the
Uniscribe calls. The vanilla versions just have SCRIPT_VISATTR.  It
looks hard, if not impossible, to pass all the data we want through
those. This is in part because HarfBuzz does more work based on
Unicode properties of the input characters than Uniscribe does.  For
example, we do fallback positioning when the font lacks GPOS. For
that, we need to carry the (modified) combining class of the
characters around, not just "fDiacritic".
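
As a rough back-of-the-envelope check, the sketch below counts the
bits marked reserved in the per-glyph structs quoted above and compares
them with the 40 bits HarfBuzz would need.  It assumes every reserved
bit is free for the caller's use, which the struct definitions alone do
not promise; the enum and function names are made up for this example.

```c
/* Bit widths copied from the struct definitions quoted above. */
enum {
  VISATTR_FRESERVED_BITS      = 1,  /* SCRIPT_VISATTR.fReserved */
  VISATTR_FSHAPERESERVED_BITS = 8,  /* SCRIPT_VISATTR.fShapeReserved */
  GLYPHPROP_RESERVED_BITS     = 16, /* SCRIPT_GLYPHPROP.reserved (one WORD) */
  NEEDED_BITS_PER_GLYPH       = 40  /* unicode_props + glyph_props + lig_props */
};

/* Reserved bits available per glyph, under the optimistic assumption
 * that all of them could carry caller-defined data. */
static int
uniscribe_spare_bits_per_glyph (void)
{
  return VISATTR_FRESERVED_BITS +
         VISATTR_FSHAPERESERVED_BITS +
         GLYPHPROP_RESERVED_BITS;
}

/* How many bits per glyph would still have nowhere to go. */
static int
uniscribe_shortfall_bits (void)
{
  return NEEDED_BITS_PER_GLYPH - uniscribe_spare_bits_per_glyph ();
}
```

Even under that optimistic assumption the per-glyph structs fall short,
and the 15 reserved bits in SCRIPT_CHARPROP are per character rather
than per glyph, so they do not map onto glyphs directly.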

Separating the calls also means that some things, like which OpenType
feature applies to what range, need to be recalculated. I guess that's
not a huge deal.  The biggest problem with separating the calls in a
way that is useful for Wine implementing the Uniscribe API on top is
that we would have to expose the buffer-internal bit allocations. And
we don't want to do that, because those are an implementation detail
and change over time.

Anyway, that's the gist of it.  Hope this helps.


-- 
behdad
http://behdad.org/