[Libreoffice-bugs] [Bug 117428] add an option to PDF export dialog to do ActualText per word

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Tue Jul 20 15:05:47 UTC 2021


https://bugs.documentfoundation.org/show_bug.cgi?id=117428

--- Comment #21 from V Stuart Foote <vstuart.foote at utsa.edu> ---
(In reply to stragu from comment #20)

> Not sure if something changed in PDF export along the way? Could you please
> test again with a recent version of LO?

Hmm, strange. Following the OP's STR, I exported to PDF from Writer 7.3.0alpha,
opened the result in Acrobat Reader (ver 2021.005.20058), and copied the text
into Notepad++ (bld 7.9.5) with UTF-8 encoding. I get exactly the same
malformed Devanagari.

The glyph clusters are not formed correctly, so the words cannot be copied out
of the PDF.

The /ActualText structures, when present, would supplement the incorrect
ToUnicode strings that drop lexical details.  Splitting the text runs at
Unicode word boundaries (using word boundary iterators) and embedding each
word as /ActualText would, when the option is enabled, give better fidelity
to the original text in the exported PDF.
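
To illustrate the idea, here is a rough Python sketch of what per-word
/ActualText marked content could look like in a content stream. It is not
LibreOffice's export code. The function names are made up for this example,
and a plain whitespace split stands in for a real Unicode word boundary
iterator (UAX #29, e.g. ICU's BreakIterator), which the Python standard
library does not provide. The glyph-showing operators are left out.

```python
import re


def actual_text_hex(chunk):
    """Encode text as a PDF hex string: UTF-16BE preceded by the FEFF BOM."""
    return "FEFF" + chunk.encode("utf-16-be").hex().upper()


def split_words(text):
    """Split into alternating word and whitespace chunks.

    Assumption: whitespace splitting is only a stand-in for proper
    Unicode word boundary iteration (UAX #29).
    """
    return re.findall(r"\S+|\s+", text)


def wrap_words(text):
    """Return content-stream lines wrapping each chunk in a /Span with ActualText."""
    lines = []
    for chunk in split_words(text):
        lines.append("/Span <</ActualText <%s>>> BDC" % actual_text_hex(chunk))
        # Placeholder: the real exporter would emit this chunk's glyphs here.
        lines.append("% glyph-showing operators for this chunk (omitted)")
        lines.append("EMC")
    return lines


if __name__ == "__main__":
    for line in wrap_words("नमस्ते दुनिया"):
        print(line)
```

A viewer that honors /ActualText would then copy out the original code points
for each word rather than whatever the ToUnicode map yields for the shaped
glyphs.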

=-testing-=
Version: 7.3.0.0.alpha0+ (x64) / LibreOffice Community
Build ID: 213430e0bdac0786b30a76a68b43d35647e93912
CPU threads: 8; OS: Windows 10.0 Build 19043; UI render: Skia/Vulkan; VCL: win
Locale: en-US (en_US); UI: en-US
Calc: threaded
