<div dir="ltr"><div>Hi Ebrahim,<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Feb 24, 2018 at 11:33 AM, Ebrahim Byagowi <span dir="ltr"><<a href="mailto:ebraminio@gmail.com" target="_blank">ebraminio@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">About why "isn't", I guess harfbuzz has developed before DirectWrite,</div></blockquote><div><br></div><div>That's not the reason. Uniscribe API also had the separation. Initially I had wanted to allow it, but eventually didn't. Read on.<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"> but I like to know if a separate API for substitution and positioning a possibility? Or, is accepting glyphs instead on input [and later as an optimization, hb_shape without positioning] a possibility? Have a look at hb-directwrite's <a href="https://github.com/harfbuzz/harfbuzz/blob/21646cc4a6160088933774e179df9be4865a9f4b/src/hb-directwrite.cc#L672" target="_blank">GetGlyphs</a> and<wbr> <a href="https://github.com/harfbuzz/harfbuzz/blob/21646cc4a6160088933774e179df9be4865a9f4b/src/hb-directwrite.cc#L717" target="_blank">GetGlyphPlacements</a>.</div></blockquote><div><br></div><div>If you look at the Uniscribe APIs for Shape & Place:<br><br><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">HRESULT ScriptShapeOpenType(
  _In_opt_       HDC                  hdc,
  _Inout_        SCRIPT_CACHE         *psc,
  _Inout_        SCRIPT_ANALYSIS      *psa,
  _In_           OPENTYPE_TAG         tagScript,
  _In_           OPENTYPE_TAG         tagLangSys,
  _In_opt_       <span style="color:blue">int</span>                  *rcRangeChars,
  _In_opt_       TEXTRANGE_PROPERTIES **rpRangeProperties,
  _In_           <span style="color:blue">int</span>                  cRanges,
  _In_     <span style="color:blue">const</span> WCHAR                *pwcChars,
  _In_           <span style="color:blue">int</span>                  cChars,
  _In_           <span style="color:blue">int</span>                  cMaxGlyphs,
  _Out_          WORD                 *pwLogClust,
  _Out_          SCRIPT_CHARPROP      *pCharProps,
  _Out_          WORD                 *pwOutGlyphs,
  _Out_          SCRIPT_GLYPHPROP     *pOutGlyphProps,
  _Out_          <span style="color:blue">int</span>                  *pcGlyphs
);
</pre><br><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial">HRESULT ScriptPlaceOpenType(
  _In_opt_        HDC                  hdc,
  _Inout_         SCRIPT_CACHE         *psc,
  _Inout_         SCRIPT_ANALYSIS      *psa,
  _In_            OPENTYPE_TAG         tagScript,
  _In_            OPENTYPE_TAG         tagLangSys,
  _In_opt_        <span style="color:blue">int</span>                  *rcRangeChars,
  _In_opt_        TEXTRANGE_PROPERTIES **rpRangeProperties,
  _In_            <span style="color:blue">int</span>                  cRanges,
  _In_      <span style="color:blue">const</span> WCHAR                *pwcChars,
  _In_            WORD                 *pwLogClust,
  _In_            SCRIPT_CHARPROP      *pCharProps,
  _In_            <span style="color:blue">int</span>                  cChars,
  _In_      <span style="color:blue">const</span> WORD                 *pwGlyphs,
  _In_      <span style="color:blue">const</span> SCRIPT_GLYPHPROP     *pGlyphProps,
  _In_            <span style="color:blue">int</span>                  cGlyphs,
  _Out_           <span style="color:blue">int</span>                  *piAdvance,
  _Out_           GOFFSET              *pGoffset,
  _Out_opt_       ABC                  *pABC
);<br><span style="font-family:arial,helvetica,sans-serif"><br>Two things stand out:<br><br>  - There's a lot of duplicate info going into both calls,<br></span></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><span style="font-family:arial,helvetica,sans-serif">  - There's also a lot of data coming out of the first call just to go directly into the second; namely </span>pCharProps<font face="arial,helvetica,sans-serif"> and </font>pGlyphProps.<br><br></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">Those two very strongly suggest that the two calls are part of the same larger operation and were rather forcibly separated.<br><br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">We can do the same separation in HarfBuzz. We also have lots of data that should come out of the first call and go into the second call to make that possible.  
Some of that even matches the data Uniscribe is passing.  In our case, to reconstruct the buffer in the second call we need the following buffer-internal info:<br><br>/* buffer var allocations, used during the entire shaping process */ <br>#define unicode_props()      var2.u16[0] <br> <br>/* buffer var allocations, used during the GSUB/GPOS processing */ <br>#define glyph_props()        var1.u16[0] /* GDEF glyph properties */ <br>#define lig_props()          var1.u8[2] /* GSUB/GPOS ligature tracking */ <br>#define syllable()           var1.u8[3] /* GSUB/GPOS shaping boundaries */ <br><br><br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">syllable() is only used during shaping, so it is not needed for positioning.  lig_props() is needed to correctly attach marks to their ligature components. Uniscribe is presumably hiding that info somewhere in those reserved bits it passes.  
It looks like we need 40 bits per glyph to be passed between the two calls to make this possible without significant restructuring.<br><br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">I mean, sure, I can split hb_ot_shape() into two calls as long as you take the buffer from the first and pass it straight to the second. But to funnel that buffer through the Uniscribe API boundary, we need to pass those 40 bits somewhere in the Uniscribe structs:<br><br></font><font face="arial,helvetica,sans-serif"><span style="color:blue">typedef</span> <span style="color:blue">struct</span> script_charprop {
  WORD fCanGlyphAlone  :1;
  WORD reserved  :15;
} SCRIPT_CHARPROP;
<br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">This one is per character, while we work mostly per glyph, so it may or may not be useful. It is interesting how fCanGlyphAlone is similar to our unsafe_to_break, but modeled differently.<br><br></font><font face="arial,helvetica,sans-serif"><span style="color:blue">typedef</span> <span style="color:blue">struct</span> script_glyphprop {
  SCRIPT_VISATTR sva;
  WORD           reserved;
} SCRIPT_GLYPHPROP;
<br class="gmail-Apple-interchange-newline"></font><font face="arial,helvetica,sans-serif"><span style="color:blue">typedef</span> <span style="color:blue">struct</span> tag_SCRIPT_VISATTR {
  WORD uJustification  :4;
  WORD fClusterStart  :1;
  WORD fDiacritic  :1;
  WORD fZeroWidth  :1;
  WORD fReserved  :1;
  WORD fShapeReserved  :8;
} SCRIPT_VISATTR;
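
As a rough sanity check on the 40-bit figure, here is a sketch that just adds up the bits. The widths come from the #defines and structs quoted above; the enum and function names are made up for illustration, and it assumes every reserved field could be used freely, which may well not hold.

```c
/* Sketch only: field widths mirror the #defines and structs quoted above.
 * All enum and function names here are hypothetical. */

/* Bits HarfBuzz would carry per glyph from shaping to positioning. */
enum {
  HB_UNICODE_PROPS_BITS = 16, /* unicode_props(): var2.u16[0] */
  HB_GLYPH_PROPS_BITS   = 16, /* glyph_props():   var1.u16[0] */
  HB_LIG_PROPS_BITS     = 8   /* lig_props():     var1.u8[2]  */
  /* syllable() is left out: it is only used during shaping. */
};

/* Reserved bits in the per-glyph Uniscribe structs.
 * Assumption: an implementation may use all of them. */
enum {
  VISATTR_RESERVED_BITS       = 1,  /* SCRIPT_VISATTR.fReserved */
  VISATTR_SHAPE_RESERVED_BITS = 8,  /* SCRIPT_VISATTR.fShapeReserved */
  GLYPHPROP_RESERVED_BITS     = 16  /* SCRIPT_GLYPHPROP.reserved (WORD) */
};

/* SCRIPT_CHARPROP.reserved is per character, not per glyph. */
enum { CHARPROP_RESERVED_BITS = 15 };

/* Per-glyph state that has to cross the Shape/Place boundary. */
int hb_bits_needed_per_glyph(void)
{
  return HB_UNICODE_PROPS_BITS + HB_GLYPH_PROPS_BITS + HB_LIG_PROPS_BITS;
}

/* Per-glyph reserved space available in SCRIPT_GLYPHPROP. */
int uniscribe_reserved_bits_per_glyph(void)
{
  return VISATTR_RESERVED_BITS + VISATTR_SHAPE_RESERVED_BITS
       + GLYPHPROP_RESERVED_BITS;
}

/* Returns 1 if the per-glyph reserved bits alone could hold the state. */
int fits_in_per_glyph_reserved_bits(void)
{
  return uniscribe_reserved_bits_per_glyph() >= hb_bits_needed_per_glyph();
}
```

Under those assumptions the per-glyph reserved fields give 25 bits against the 40 needed. The 15 reserved bits in SCRIPT_CHARPROP would only close the gap if glyphs and characters mapped one to one, which ligatures and decompositions break.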
<br class="gmail-Apple-interchange-newline"></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">SCRIPT_GLYPHPROP is unique to the OpenType() flavor of the Uniscribe calls. The vanilla versions just have SCRIPT_VISATTR.  It looks hard, if not impossible, to pass all the data we want through those. This is in part because HarfBuzz does more work based on the Unicode properties of the input characters than Uniscribe does.  For example, we do fallback positioning when the font lacks GPOS. For that, we need to carry the (modified) combining class of the characters around, not just "fDiacritic".<br><br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">Separating the calls also means that some things, like which OpenType feature applies to what range, need to be recalculated. I guess that's not a huge deal.  The biggest problem with separating the calls in a way that is useful for Wine implementing the Uniscribe API on top is that we would have to expose the buffer-internal bit allocations. 
And we don't want to do that, because that is an implementation detail and changes over time.<br><br></font></pre><pre class="gmail-" style="padding:5px;margin:0px;font-style:normal;font-weight:400;overflow:auto;font-family:Consolas,Courier,monospace;white-space:pre-wrap;color:rgb(0,0,0);font-size:14px;font-variant-ligatures:normal;font-variant-caps:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;text-decoration-style:initial;text-decoration-color:initial"><font face="arial,helvetica,sans-serif">Anyway, that's the gist of it.  Hope this helps.<br></font></pre></div></div><span style="font-family:arial,helvetica,sans-serif"></span><br>-- <br><div class="gmail-m_-9100122635912255931gmail_signature">behdad<br><a href="http://behdad.org/" target="_blank">http://behdad.org/</a></div>
</div></div>