[HarfBuzz] please remove U+115F U+1160 from default_ignorable
khaledhosny at eglug.org
Tue Mar 19 03:29:27 PDT 2013
On Tue, Mar 19, 2013 at 10:29:28AM +0900, Dohyun Kim wrote:
> 2013/3/19 Konstantin Ritt <ritt.ks at gmail.com>:
> > IIUC, a correct solution would be determining Hangul clusters and then
> > setting the "accumulated" cluster advance to the cluster base prior to
> > hiding default ignorables. Correct me if I'm wrong.
> Well, this solution will give us correct output only in its appearance.
> If "hiding default ignorable" means substituting by zero-width space,
> then the pdf file produced by xetex would not contain U+115F and
> U+1160, resulting in invalid text string when texts are extracted from
> the pdf.
Many things get lost in PDF output (e.g. Indic reordering). That is not
HarfBuzz's fault (as long as the output is visually correct), but rather
a XeTeX deficiency, partly due to the awkward state of text extraction
from PDF files.
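
For illustration only, Konstantin's suggestion quoted above (fold each
cluster's accumulated advance onto its base before the default
ignorables are hidden) might be sketched in Python roughly as follows.
This is not HarfBuzz code: the Glyph record, the HANGUL_FILLERS set and
fold_cluster_advances() are hypothetical names made up for the sketch.
It also assumes that the "cluster base" is the first glyph in the
cluster that is not a filler, and it ignores glyph offsets entirely.

```python
from dataclasses import dataclass

# The two code points under discussion: Hangul Choseong Filler and
# Hangul Jungseong Filler.
HANGUL_FILLERS = {0x115F, 0x1160}


@dataclass
class Glyph:
    codepoint: int  # source character this glyph was mapped from
    cluster: int    # cluster index; glyphs of one syllable share it
    advance: int    # horizontal advance in font units


def fold_cluster_advances(glyphs):
    """Move each cluster's summed advance onto a single base glyph.

    Assumption of this sketch: the base is the first non-filler glyph
    of the cluster, falling back to the first glyph when the cluster
    consists of fillers only.
    """
    out = []
    i = 0
    while i < len(glyphs):
        # Find the end of the run of glyphs sharing this cluster value.
        j = i
        while j < len(glyphs) and glyphs[j].cluster == glyphs[i].cluster:
            j += 1
        run = glyphs[i:j]
        total = sum(g.advance for g in run)
        base = next(
            (k for k, g in enumerate(run)
             if g.codepoint not in HANGUL_FILLERS),
            0,
        )
        for k, g in enumerate(run):
            out.append(Glyph(g.codepoint, g.cluster,
                             total if k == base else 0))
        i = j
    return out


# A choseong filler carrying the advance, followed by a vowel jamo,
# then a leading consonant followed by a jungseong filler.
shaped = [
    Glyph(0x115F, 0, 1000), Glyph(0x1161, 0, 0),
    Glyph(0x1100, 2, 1000), Glyph(0x1160, 2, 0),
]
folded = fold_cluster_advances(shaped)
```

After folding, the fillers in this example carry no advance, so
replacing them with a zero-width glyph would not change the line width.
As Dohyun points out, this only addresses the visual result, not what
ends up in the PDF text.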