<div dir="ltr">Sorry, no progress so far. But for tracking purposes:<br><a href="https://github.com/harfbuzz/harfbuzz/issues/1011">https://github.com/harfbuzz/harfbuzz/issues/1011</a><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Jan 20, 2018 at 6:22 PM, Eric Muller <span dir="ltr">&lt;<a href="mailto:emuller@amazon.com" target="_blank">emuller@amazon.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_-3592440659313515916moz-cite-prefix"><span class="">
<blockquote type="cite">The easiest would be to add a new API
analogous to hb_ot_font_set_funcs(), that does NOT have the
symbol shift in it</blockquote></span>
That works.<br>
<br>
Thanks,<br>
Eric.<div><div class="h5"><br>
<br>
<br>
On 1/19/18 4:43 PM, Behdad Esfahbod wrote:<br>
</div></div></div><div><div class="h5">
<blockquote type="cite">
<div dir="ltr">
<div>
<div>Ok, let's see how we can address this...<br>
<br>
</div>
I don't like a setting on the buffer as currently the
get_glyph() callback has no way of accessing that
information. The easiest would be to add a new API analogous
to hb_ot_font_set_funcs(), that does NOT have the symbol shift
in it. It's not the most elegant solution but easiest. Would
that work for you?<br>
<br>
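<div>A minimal sketch of that idea, not HarfBuzz code: the <code>Font</code> class, the dict-based cmap, glyph ID 36, and both setter names are hypothetical stand-ins. It only illustrates that the callback sees the font and the code point (so a buffer setting would not reach it), and that a second setter can install a lookup without the shift.</div>

```python
from typing import Callable, Dict, Optional

Cmap = Dict[int, int]  # code point -> glyph ID; 0 means "not mapped"


class Font:
    """Hypothetical stand-in for a font object that carries a glyph callback."""

    def __init__(self, cmap: Cmap) -> None:
        self.cmap = cmap
        self.get_glyph: Optional[Callable[["Font", int], int]] = None


def _glyph_plain(font: Font, codepoint: int) -> int:
    # Plain cmap lookup: no symbol shift.
    return font.cmap.get(codepoint, 0)


def _glyph_with_symbol_shift(font: Font, codepoint: int) -> int:
    # Same lookup, then retry at 0xF000 + codepoint for unmapped low code points.
    # Including U+00FF itself in the range is an assumption of this sketch.
    glyph = font.cmap.get(codepoint, 0)
    if glyph == 0 and codepoint <= 0x00FF:
        glyph = font.cmap.get(0xF000 + codepoint, 0)
    return glyph


def set_funcs(font: Font) -> None:
    # Hypothetical analogue of the existing setter: shift included.
    font.get_glyph = _glyph_with_symbol_shift


def set_funcs_no_symbol_shift(font: Font) -> None:
    # Hypothetical analogue of the proposed new setter: shift left out.
    font.get_glyph = _glyph_plain
```

<div>A client that wants U+0041 to stay an "A" would call the second setter; U+F041 still reaches the symbol glyph either way.</div>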
</div>
That said, this issue is also related, as it pertains to another
non-Unicode encoding, though in the font, not the buffer:<br>
<br>
<a href="https://github.com/harfbuzz/harfbuzz/issues/681" target="_blank">https://github.com/harfbuzz/<wbr>harfbuzz/issues/681</a><br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Jan 18, 2018 at 11:27 PM, Eric
Muller <span dir="ltr">&lt;<a href="mailto:emuller@amazon.com" target="_blank">emuller@amazon.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_-3592440659313515916m_4762254172372698478moz-cite-prefix">I want
to build a rendering system where U+0041 renders as an
"A", regardless of the selected font.<span class="m_-3592440659313515916HOEnZb"><font color="#888888"><br>
<br>
Eric.</font></span>
<div>
<div class="m_-3592440659313515916h5"><br>
<br>
<br>
On 1/17/18 3:48 PM, Behdad Esfahbod wrote:<br>
</div>
</div>
</div>
<div>
<div class="m_-3592440659313515916h5">
<blockquote type="cite">
<div dir="ltr">What's the actual problem you are
facing?<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Jan 15, 2018 at
9:58 AM, Eric Muller <span dir="ltr">&lt;<a href="mailto:emuller@amazon.com" target="_blank">emuller@amazon.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div class="m_-3592440659313515916m_4762254172372698478m_-4940559382455268948moz-cite-prefix"><span><br>
<blockquote type="cite">It's clear that
if the symbol font is asked by name,
we should do the shift.</blockquote>
</span> I think I disagree, in the sense
that HB should not impose that behavior on
its clients. HB is clearly the right
place to implement the behavior, but the
choice of having that behavior or not
should be with the client.<br>
<br>
For any document format, rendering the
moral equivalent of &lt;p
font-family='symbol'&gt;A&lt;<wbr>/p&gt;
with something other than an "A" implies
that all ASCII is PUA. That's a choice
Word, InDesign, Notepad may make if they
want, but it should not be imposed on all
users of HB. <br>
<br>
Personally, I think it is a very bad
choice for HTML, and Firefox seems to
agree. It seems nice and user friendly at
first, but this makes the document
ambiguous. What about &lt;p
font-family='minion,
symbol'&gt;A&lt;/p&gt;? It's an
A or not an A depending on the presence of
"minion" in the client. What does the
document mean?<br>
<br>
Of course, &lt;p
font-family='symbol'&gt;&lt;<wbr>/p&gt;
should render with the glyph
symbol.cmap(F041). So even if the shift is
never done, the glyph is usable. It's just
that you don't have the convenience of an
IME-like mechanism provided by the shaping
engine, but you gain reliable semantics
for the text.<br>
<br>
<blockquote type="cite">That's good
behavior [in Word], but beyond what
HarfBuzz can do.</blockquote>
Yes, which is why the shift may be
acceptable or even desirable for some
clients, and so hopefully the client could
choose.<span><br>
<br>
<blockquote type="cite">What would
clients do with that control then? How
would they set it?</blockquote>
</span> If I build an app that is meant to
work like other GDI apps, I allow the
shift (and maybe add mitigating measures
like Word). If I build an app such as
Firefox, I don't allow it. The choice is
entirely driven by the type of application I
want to build, and how I want to define my
document format.<br>
<br>
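<div>To make the choice concrete, here is a small sketch under stated assumptions: the cmap is modelled as a plain dict, glyph ID 36 is made up, and <code>lookup_glyph</code> and its <code>allow_symbol_shift</code> parameter are hypothetical names, not an existing HB API.</div>

```python
def lookup_glyph(cmap, codepoint, allow_symbol_shift):
    """Return a glyph ID, or 0 when the code point is not mapped.

    cmap is a dict standing in for a 3,0 cmap subtable (an assumption).
    The shift is tried only when the client opted in.
    """
    glyph = cmap.get(codepoint, 0)
    # Treating U+00FF as inside the shifted range is an assumption here.
    if glyph == 0 and allow_symbol_shift and codepoint <= 0x00FF:
        glyph = cmap.get(0xF000 + codepoint, 0)
    return glyph


# A symbol font that maps only the PUA code point U+F041.
SYMBOL_CMAP = {0xF041: 36}

# An app that wants to behave like other GDI apps opts in to the shift.
gdi_like = lookup_glyph(SYMBOL_CMAP, 0x0041, allow_symbol_shift=True)

# An app that wants U+0041 to always mean "A" opts out.
browser_like = lookup_glyph(SYMBOL_CMAP, 0x0041, allow_symbol_shift=False)
```

<div>With the flag off, U+0041 finds no glyph in the symbol font, so font fallback can pick a font that does have an "A"; U+F041 resolves in both modes.</div>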
<br>
If you were to implement this choice, I
can see it either in the construction of
the HB unicode functions, or in the
hb_buffer (either globally, or on a
character-by-character basis). I have a
preference for the latter: this choice
could be passed down to the cmap lookup
functions, HB or not; it could also be
different on different parts of a
document, maybe reacting to markup.<span class="m_-3592440659313515916m_4762254172372698478HOEnZb"><font color="#888888"><br>
<br>
Eric.</font></span>
<div>
<div class="m_-3592440659313515916m_4762254172372698478h5"><br>
<br>
<br>
On 1/15/18 6:46 AM, Behdad Esfahbod
wrote:<br>
</div>
</div>
</div>
<div>
<div class="m_-3592440659313515916m_4762254172372698478h5">
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_extra">Hi Eric,<br>
<br>
</div>
<div class="gmail_extra">
<div class="gmail_quote">On Mon,
Jan 15, 2018 at 2:25 AM, Eric
Muller <span dir="ltr">&lt;<a href="mailto:emuller@amazon.com" target="_blank">emuller@amazon.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">It
seems that with a font that
has only a 3,0 cmap subtable
(and maybe some Macintosh
subtables), then HB will
automatically do the shift by
F000 (in the function
get_glyph_from_symbol) for
code points below U+00FF that
are not mapped by the
subtable.<br>
</blockquote>
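<div>The fallback described above can be sketched as follows. This is a simplified model, not the actual HarfBuzz source: the cmap is a plain dict, glyph ID 36 is invented, and only the function name <code>get_glyph_from_symbol</code> is taken from the message.</div>

```python
def get_glyph_from_symbol(cmap, codepoint):
    """Look up codepoint in a 3,0 cmap modelled as a dict; 0 means unmapped."""
    glyph = cmap.get(codepoint, 0)
    # Shift by F000 only for low code points the subtable does not map.
    # Whether U+00FF itself is included is an assumption of this sketch.
    if glyph == 0 and codepoint <= 0x00FF:
        glyph = cmap.get(0xF000 + codepoint, 0)
    return glyph


# A symbol font whose only mapping is U+F041 (glyph ID 36 is made up).
symbol_cmap = {0xF041: 36}

shifted = get_glyph_from_symbol(symbol_cmap, 0x0041)    # found via U+F041
direct = get_glyph_from_symbol(symbol_cmap, 0xF041)     # found directly
too_high = get_glyph_from_symbol(symbol_cmap, 0x0141)   # outside the range
```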
<div><br>
</div>
<div>Right. Only in hb-ot-font
though. Client font funcs can
do otherwise.<br>
<br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> It is
clear that when U+0041 A is
set with a symbol font, then
that U+0041 actually has the
semantics of a PUA code point,
and certainly should not be
treated as an "A". That's the
whole point of a 3,0 cmap
subtable.<br>
</blockquote>
<div><br>
</div>
<div>Correct.<br>
<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Consider an HTML page. The
font-family is only a request
and there is no guarantee that
the actual font will or will
not be a symbol font. Thus the
semantics of the HTML page can
change depending on the
browser environment. Outside a
browser, it seems that the
safe treatment is therefore to
consider all code points below
U+00FF as PUA, which is
clearly not tenable. So in
that environment, I think that
the shift should not be done.
Of course, U+F041 should work.<br>
</blockquote>
<div><br>
</div>
<div>My take on this is that
it's a bug of the font
fallback logic if it falls
back to a symbol font. I
changed fontconfig to never do
that.<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Note
that the behavior of Word 2016 on
Windows is actually more
elaborate: enter U+0041, and
set it with a non-symbol font;
copy/paste or save to a text
file, and the result is
U+0041; but set this A in a
symbol font, and copy/paste or
save to a text file, and the
result is U+F041.<br>
</blockquote>
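<div>As an illustration only, the observed Word behavior could be modelled like this. The function name is hypothetical, and limiting the remapping to code points up to U+00FF is an assumption; the message only reports the U+0041 case.</div>

```python
def plain_text_code_point(codepoint, set_in_symbol_font):
    """Model of what ends up in plain text on copy/paste or save.

    A character set in a symbol font is exported as its PUA code point;
    otherwise it is exported unchanged. The U+00FF upper bound is assumed.
    """
    if set_in_symbol_font and codepoint <= 0x00FF:
        return 0xF000 + codepoint
    return codepoint
```

<div>Under this model the exported text no longer depends on which font was applied, since the symbol meaning travels with the PUA code point.</div>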
<div><br>
</div>
<div>That's good behavior, but
beyond what HarfBuzz can do.<br>
<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I
think that the shift should be
controllable by the client,
rather than systematically
applied. I don't have a strong
opinion about the default
behavior (i.e. when HB's
client does not specify
whether the shift should be
done or not).<br>
</blockquote>
<div><br>
</div>
<div>What would clients do with
that control then? How would
they set it?<br>
<br>
</div>
<div>I implemented this shift in
fontconfig and then harfbuzz
because in LibreOffice and
other software, there were
existing documents that
referred to Wingdings or other
symbol fonts and encoded
characters in the ASCII range.
It's clear that if the symbol
font is asked by name, we
should do the shift. If it's
NOT, then it should not be
chosen to render text to begin
with, which means the shift
can be applied
unconditionally.<br>
<br>
</div>
<div>How does that sound?<br>
</div>
<div>behdad<br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Thoughts?<br>
<br>
Thanks,<br>
Eric.<br>
</blockquote>
<div> </div>
</div>
-- <br>
<div class="m_-3592440659313515916m_4762254172372698478m_-4940559382455268948gmail_signature" data-smartmail="gmail_signature">behdad<br>
<a href="http://behdad.org/" target="_blank">http://behdad.org/</a></div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">behdad<br><a href="http://behdad.org/" target="_blank">http://behdad.org/</a></div>
</div>