[Libreoffice-bugs] [Bug 124191] Text copied from a PDF exported using Linux Libertine G is missing characters.

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Wed Mar 20 21:59:46 UTC 2019


https://bugs.documentfoundation.org/show_bug.cgi?id=124191

--- Comment #14 from V Stuart Foote <vstuart.foote at utsa.edu> ---
(In reply to Frank Zimmerman from comment #9)
> I've attached three sample outputs using PDF and XPS printer drivers. These
> all have the same problems, when I try to copy text from them and paste into
> a text editor. This shows that it's not strictly related to the PDF output
> routines in the LibreOffice PDF export.

No, those outputs are correct. If you are able to set the text editor to use
Linux Libertine G, you probably will not see the corruption. I am not certain,
but I think the issue was with your LibreOffice's generation of the /ToUnicode
mappings.

Below are the /ToUnicode charts from the MS PDF and Phantom PDF outputs for the
test string "The fire flying coffee left Quickly."

With Linux Libertine G, the string is encoded by both PDF generators as

'<E049>e <FB01>re <FB02>ying co<FB00>ee le<E039> <E048>ickly.'

Glyph positioning is omitted, of course.

When that text is copied to a text editor, the editor must be able to use the
same Linux Libertine G font. If not, the PUA glyphs (E039, E048, E049) cannot
be rendered. The same goes for the Unicode Alphabetic Presentation Forms, which
the font must also support (here just FB00 ff, FB01 fi, FB02 fl).
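
As a rough illustration of the distinction above, here is a minimal Python
sketch that sorts the code points of the encoded test string into PUA,
Alphabetic Presentation Forms, and ordinary characters. It also shows that
NFKC normalization expands the three ligatures to plain letters but leaves the
PUA code points alone, since their meaning is private to the font. The
LIBERTINE_G_PUA table is a hypothetical helper, filled in only from the three
mappings listed in this comment, not a complete table for the font.

```python
import unicodedata

# Test string as encoded by both PDF generators (taken from this comment).
ENCODED = "\uE049e \uFB01re \uFB02ying co\uFB00ee le\uE039 \uE048ickly."

def classify(ch):
    """Return a coarse label for the Unicode block a character falls in."""
    cp = ord(ch)
    if 0xE000 <= cp <= 0xF8FF:
        return "PUA"                      # Private Use Area (BMP)
    if 0xFB00 <= cp <= 0xFB4F:
        return "presentation form"        # Alphabetic Presentation Forms
    return "ordinary"

special = sorted({ch for ch in ENCODED if classify(ch) != "ordinary"})
for ch in special:
    print(f"U+{ord(ch):04X} {classify(ch)}")

# NFKC applies the standard compatibility decompositions, so ff/fi/fl
# ligatures become plain letters; PUA code points have no decomposition.
normalized = unicodedata.normalize("NFKC", ENCODED)

# Hypothetical font-specific table (assumption: only the three PUA glyphs
# that appear in the charts in this comment).
LIBERTINE_G_PUA = {"\uE039": "ft", "\uE048": "Qu", "\uE049": "Th"}
plain = normalized.translate(str.maketrans(LIBERTINE_G_PUA))
print(plain)
```

Without the font-specific table, the three PUA characters survive
normalization unchanged, which is why an editor without Linux Libertine G
cannot show them.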

Both PDF generators appear to have generated correct /ToUnicode charts: the PUA
code points are mapped, and in a text editor using Linux Libertine G (I prefer
BabelPad for this) the strings are fully rendered. Notice that the order in
which glyphs are added to the /ToUnicode chart depends on the PDF generator.

MS PDF
<0003> <0020> <sp>
<0011> <002E> .
<0046> <0063> c
<0048> <0065> e
<004A> <0067> g
<004C> <0069> i
<004E> <006B> k
<004F> <006C> l
<0051> <006E> n
<0052> <006F> o
<0055> <0072> r
<005C> <0079> y
<093D> <E039> ft
<094C> <E048> Qu
<094D> <E049> Th
<0A98> <FB00> ff
<0A99> <FB01> fi
<0A9A> <FB02> fl


Phantom PDF
<0001> <E049> Th
<0002> <0065> e
<0003> <0020> <sp>
<0004> <FB01> fi
<0005> <0072> r
<0006> <FB02> fl
<0007> <0079> y
<0008> <0069> i
<0009> <006E> n
<000A> <0067> g
<000B> <0063> c
<000C> <006F> o
<000D> <FB00> ff
<000E> <006C> l
<000F> <E039> ft
<0010> <E048> Qu
<0011> <006B> k
<0012> <002E> .
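
To make the comparison concrete, here is a small Python sketch that reads the
first two columns of each chart above as (character code, Unicode) pairs and
checks that both charts decode to the same Unicode string even though the
character codes differ. The names parse_chart, encode and decode are
illustrative helpers, not part of any PDF library. The last check only
observes that the Phantom PDF codes in this sample are consistent with
numbering glyphs in order of first appearance; it is not a statement about how
that generator actually works.

```python
MS_PDF = """
<0003> <0020>
<0011> <002E>
<0046> <0063>
<0048> <0065>
<004A> <0067>
<004C> <0069>
<004E> <006B>
<004F> <006C>
<0051> <006E>
<0052> <006F>
<0055> <0072>
<005C> <0079>
<093D> <E039>
<094C> <E048>
<094D> <E049>
<0A98> <FB00>
<0A99> <FB01>
<0A9A> <FB02>
"""

PHANTOM_PDF = """
<0001> <E049>
<0002> <0065>
<0003> <0020>
<0004> <FB01>
<0005> <0072>
<0006> <FB02>
<0007> <0079>
<0008> <0069>
<0009> <006E>
<000A> <0067>
<000B> <0063>
<000C> <006F>
<000D> <FB00>
<000E> <006C>
<000F> <E039>
<0010> <E048>
<0011> <006B>
<0012> <002E>
"""

TEXT = "\uE049e \uFB01re \uFB02ying co\uFB00ee le\uE039 \uE048ickly."

def parse_chart(text):
    """Map character code -> Unicode character from '<code> <unicode>' lines."""
    chart = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines
        code, uni = (int(f.strip("<>"), 16) for f in fields[:2])
        chart[code] = chr(uni)
    return chart

def encode(text, chart):
    """Turn Unicode text into the character codes a given chart would use."""
    inverse = {uni: code for code, uni in chart.items()}
    return [inverse[ch] for ch in text]

def decode(codes, chart):
    """Turn character codes back into Unicode text via the chart."""
    return "".join(chart[code] for code in codes)

ms, phantom = parse_chart(MS_PDF), parse_chart(PHANTOM_PDF)
ms_codes, phantom_codes = encode(TEXT, ms), encode(TEXT, phantom)

# Different character codes, identical extracted text.
print(ms_codes != phantom_codes)
print(decode(ms_codes, ms) == decode(phantom_codes, phantom) == TEXT)

# Number each distinct character by first appearance, starting at 1.
first_seen = {}
for ch in TEXT:
    first_seen.setdefault(ch, len(first_seen) + 1)
print(first_seen == {uni: code for code, uni in phantom.items()})
```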
