[Poppler-bugs] [Bug 101855] Embedded TrueType Symbols with accents not rendered correctly

bugzilla-daemon at freedesktop.org
Wed Aug 2 20:58:02 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=101855

--- Comment #15 from Albert Astals Cid <aacid at kde.org> ---
(In reply to Thomas Freitag from comment #13)
> Created attachment 133170
> Use unicode cmap if it exists
> 
> This patch solves this bug and also bug 101624. And looking at the PDF32000
> spec section 9.6.6.4 I can't really decide "If a (3, 1) “cmap” subtable
> (Microsoft Unicode) is present" should only be done if the font doesn't
> specify MacRomanEncoding or always!

So the text says

If the font has a named Encoding entry of either MacRomanEncoding or
WinAnsiEncoding, or if the font descriptor’s Nonsymbolic flag (see Table 123)
is set, the conforming reader shall create a table that maps from character
codes to glyph names:
 * If the Encoding entry is one of the names MacRomanEncoding or
WinAnsiEncoding, the table shall be initialized with the mappings described in
Annex D.
 * If the Encoding entry is a dictionary, the table shall be initialized with
the entries from the dictionary’s BaseEncoding entry (see Table 114). Any
entries in the Differences array shall be used to update the table. Finally,
any undefined entries in the table shall be filled using StandardEncoding.
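
To make the table construction above concrete, here is a minimal Python sketch. It is not Poppler code: the function name build_code_to_name and the three tiny encoding excerpts are illustrative assumptions (the real tables in Annex D are much larger), and treating a dictionary without a BaseEncoding entry as starting from an empty table is also an assumption of this sketch.

```python
# Tiny excerpts only; the full tables are in Annex D of the spec.
STANDARD_ENCODING = {0x41: "A", 0x60: "quoteleft", 0xA8: "currency"}
WIN_ANSI_ENCODING = {0x41: "A", 0x60: "grave", 0xE9: "eacute"}
MAC_ROMAN_ENCODING = {0x41: "A", 0x60: "grave", 0x8E: "eacute"}

NAMED_ENCODINGS = {
    "MacRomanEncoding": MAC_ROMAN_ENCODING,
    "WinAnsiEncoding": WIN_ANSI_ENCODING,
}


def build_code_to_name(encoding):
    """Build the character code -> glyph name table (hypothetical helper).

    `encoding` is either a name string or a dict that may hold
    "BaseEncoding" and "Differences" entries.
    """
    if isinstance(encoding, str):
        # Named Encoding entry: initialise straight from the named table.
        return dict(NAMED_ENCODINGS[encoding])

    # Dictionary Encoding entry: start from BaseEncoding (assumed empty
    # here if the entry is absent or not one of the two named tables).
    table = dict(NAMED_ENCODINGS.get(encoding.get("BaseEncoding"), {}))

    # Differences is laid out as [code, name, name, ..., code, name, ...];
    # each name applies to the current code, which then advances by one.
    code = None
    for item in encoding.get("Differences", []):
        if isinstance(item, int):
            code = item
        elif code is not None:
            table[code] = item
            code += 1

    # Finally, fill any still-undefined entries from StandardEncoding.
    for std_code, name in STANDARD_ENCODING.items():
        table.setdefault(std_code, name)
    return table
```

For example, an Encoding dictionary holding only a Differences array still yields a usable table, because the StandardEncoding fill covers the codes the array does not mention.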

If a (3, 1) “cmap” subtable (Microsoft Unicode) is present:
 * A character code shall be first mapped to a glyph name using the table
described above.
 * The glyph name shall then be mapped to a Unicode value by consulting the
Adobe Glyph List (see the Bibliography).
 * Finally, the Unicode value shall be mapped to a glyph description according
to the (3, 1) subtable.

If no (3, 1) subtable is present but a (1, 0) subtable (Macintosh Roman) is
present:
 * A character code shall be first mapped to a glyph name using the table
described above.
 * The glyph name shall then be mapped back to a character code according to
the standard Roman encoding used on Mac OS.
 * Finally, the code shall be mapped to a glyph description according to the
(1, 0) subtable.

In any of these cases, if the glyph name cannot be mapped as specified, the
glyph name shall be looked up in the font program’s “post” table (if one is
present) and the associated glyph description shall be used.
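
The lookup order described above can be sketched as follows. This is only an illustration in Python, not how Poppler implements it: glyph_for_code and all of its parameters are hypothetical names, and each font table is modelled as a plain dict (agl maps glyph names to Unicode values, cmap_3_1 and cmap_1_0 map codes to glyph indices, mac_roman_codes maps glyph names back to Mac OS Roman codes, post_names maps glyph names to glyph indices).

```python
def glyph_for_code(code, code_to_name, agl,
                   cmap_3_1=None, mac_roman_codes=None,
                   cmap_1_0=None, post_names=None):
    """Return a glyph index for `code`, or None if nothing matches."""
    # Step 1 (both branches): character code -> glyph name.
    name = code_to_name.get(code)
    if name is None:
        return None

    gid = None
    if cmap_3_1 is not None:
        # (3, 1) present: glyph name -> Unicode via the Adobe Glyph List,
        # then Unicode -> glyph via the Microsoft Unicode subtable.
        gid = cmap_3_1.get(agl.get(name))
    elif cmap_1_0 is not None:
        # No (3, 1) but (1, 0) present: glyph name -> Mac OS Roman code,
        # then code -> glyph via the Macintosh Roman subtable.
        gid = cmap_1_0.get((mac_roman_codes or {}).get(name))

    # In either case, an unmapped glyph name is looked up in "post".
    # Compare against None because glyph index 0 is a valid result.
    if gid is None and post_names is not None:
        gid = post_names.get(name)
    return gid
```

Note that in this sketch the (1, 0) subtable is consulted only when no (3, 1) subtable exists, which follows the "If no (3, 1) subtable is present" wording.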

The standard Roman encoding that is used on Mac OS is the same as the
MacRomanEncoding described in Annex D, with the addition of 15 entries and the
replacement of the currency glyph with the Euro glyph, as shown in Table 115.


********

My understanding of the first point of the "(3, 1) “cmap” subtable" section,
i.e. "A character code shall be first mapped to a glyph name using the table
described above.", is that it should always use the table described in the
first paragraph ("the conforming reader shall create a table"), which, if there
is no "MacRomanEncoding or WinAnsiEncoding", falls back to the second point of
the first paragraph, "If the Encoding entry is a dictionary", no?
