[Poppler-bugs] [Bug 66693] Greek support package - some characters output as symbols not letters

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Aug 17 06:49:09 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=66693

--- Comment #15 from Jason Crain <jason at aquaticape.us> ---
(In reply to comment #12)
> I don't think it makes sense, but even if i did, why would we do
> Œ to OE
> but not
> Æ to AE
> ?

Because, for whatever reason, Adobe Reader doesn't touch Æ or æ.

You could also argue that the current pdftotext behavior for these math symbols
is correct because even if Reader changes them into Greek letters, the
characters are actually encoded in the document as math symbols.  I'm
ambivalent because I expect someone will complain either way.

> Would applying the other two patches actually fix this bug? Because you say
> they will fix searching but the bug is about pdftotext

The original description says that search doesn't work because of the
symbol/letter confusion.  I assumed Govert meant using the search feature in,
for example, Evince.  The way search works, TextPage::findText calls
unicodeNormalizeNFKC and searches through the normalized text.  These two
patches cause unicodeNormalizeNFKC to convert the math symbols to letters and
the search matches.  This works for Evince, anyway.  I haven't tested it with
Okular.
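
As a rough illustration of the normalized search described above, here is a
minimal Python sketch. It uses the standard library's unicodedata module, not
Poppler's unicodeNormalizeNFKC, and the helper name normalized_find is
hypothetical. U+00B5 MICRO SIGN and U+2126 OHM SIGN are assumed stand-ins for
the affected symbols; standard NFKC already maps these two to Greek letters,
while Poppler's patched function may cover characters that standard NFKC
leaves alone.

```python
import unicodedata


def normalized_find(text: str, needle: str) -> int:
    """Search for needle in text after NFKC-normalizing both.

    Returns the match index within the *normalized* text, or -1.
    The index can differ from the raw-text index when normalization
    changes string length (for example, ligatures expand).
    """
    norm_text = unicodedata.normalize("NFKC", text)
    norm_needle = unicodedata.normalize("NFKC", needle)
    return norm_text.find(norm_needle)


# Page text as extracted: the symbols are encoded as MICRO SIGN (U+00B5)
# and OHM SIGN (U+2126), not as Greek letters.
page_text = "10 \u00b5F, 5 \u2126"

# The user types Greek small mu (U+03BC) and capital omega (U+03A9).
raw_hit = page_text.find("\u03bcF")            # plain search, no normalization
mu_hit = normalized_find(page_text, "\u03bcF")
omega_hit = normalized_find(page_text, "\u03a9")
```

The plain search misses because the code points differ, while the normalized
search matches because both sides are folded to the same letters before
comparison.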
