[Poppler-bugs] [Bug 66693] Greek support package - some characters output as symbols not letters

bugzilla-daemon at freedesktop.org
Sun Aug 25 12:58:10 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=66693

--- Comment #19 from Albert Astals Cid <aacid at kde.org> ---
To be honest, I don't see why pdftotext should output one symbol as another,
unless it's obvious that the first symbol is there *exclusively* for
typographical reasons, like the "fl" and "fi" ligatures.

OTOH, if the code is not a lot to maintain, I would not be opposed to adding a
non-default option that does that conversion.
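
To make the idea of an opt-in conversion concrete, here is a minimal sketch in
Python. It is not poppler code: the function name, the flag name and the table
are all hypothetical, and the table deliberately covers only the Latin
presentation ligatures, since those are purely typographical.

```python
# Hypothetical table: presentation-form ligatures only. Anything not listed
# here is passed through untouched.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def convert_symbols(text, expand_ligatures=False):
    """Return text as-is unless the (non-default) option is switched on."""
    if not expand_ligatures:
        return text
    # Replace each ligature code point with its plain-letter spelling.
    return "".join(LIGATURES.get(ch, ch) for ch in text)
```

With the flag left at its default the output is exactly what is in the file;
only a caller that explicitly asks for the conversion gets "fi" instead of the
single ligature code point.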

About searching: yes, I agree that if you search for Symbol1 and what's in the
pdf is Symbol2 (but that is "technically" the same thing), it makes sense for
the search algorithm to try to match it, but I would still want the
"getPageText()" methods to give me Symbol2 (i.e. what was really in the pdf
file).
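
A rough sketch of that split, again in Python and again not poppler's
implementation: the search folds both the page text and the query before
comparing, but reports the match as offsets into the untouched page text, so
whatever returns the text still returns Symbol2. The helper names are
hypothetical, and using Unicode NFKC as the folding rule is an assumption made
for illustration (it happens to map the micro sign U+00B5 to Greek mu U+03BC
and the "fi" ligature to "fi"). Folding is done one code point at a time to
keep the offset map simple, so combining sequences are not composed.

```python
import unicodedata


def fold(text):
    """Fold text with NFKC, remembering which original index each char came from."""
    folded = []
    index_map = []
    for i, ch in enumerate(text):
        for out in unicodedata.normalize("NFKC", ch):
            folded.append(out)
            index_map.append(i)
    return "".join(folded), index_map


def find_folded(page_text, query):
    """Return (start, end) offsets into the ORIGINAL page_text, or None."""
    folded_page, index_map = fold(page_text)
    folded_query, _ = fold(query)
    if not folded_query:
        return None
    pos = folded_page.find(folded_query)
    if pos < 0:
        return None
    start = index_map[pos]
    end = index_map[pos + len(folded_query) - 1] + 1
    return start, end


# The page contains the micro sign and the "fi" ligature.
page = "10 \u00b5m \ufb01lter"
# Searching with Greek mu still matches, and the slice is the original text.
hit = find_folded(page, "\u03bcm")
original = page[hit[0]:hit[1]]
```

Here `original` is the micro sign followed by "m", i.e. what was really in the
file, even though the query used the Greek letter.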

So as far as I can see there are two things happening in this bug:
 a) pdftotext doing conversion of some symbols to others
 b) search handling symbol mappings

Am I right in the analysis?

Now my question: how much is a) related to b)? Can they be handled in different
bugs, or does it make more sense to handle them together here?
