[Poppler-bugs] [Bug 85702] New: Not able to find (only) ordinals (ª, º, ...)
bugzilla-daemon at freedesktop.org
Fri Oct 31 08:57:29 PDT 2014
https://bugs.freedesktop.org/show_bug.cgi?id=85702
Bug ID: 85702
Summary: Not able to find (only) ordinals (ª,º,...)
Product: poppler
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: general
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: frederic.moenne-loccoz at sfr.fr
Created attachment 108731
--> https://bugs.freedesktop.org/attachment.cgi?id=108731&action=edit
patch to poppler library
Hello,
I am opening this bug on behalf of GNOME Evince users who asked for the following:
text search should distinguish ordinal characters. The characters are:
0x207F=ⁿ, 0x00AA=ª, 0xBA=º, 0xB9=¹, 0xB2=², 0xB3=³,
0x2074=⁴, 0x2075=⁵, 0x2076=⁶, 0x2077=⁷, 0x2078=⁸, 0x2079=⁹, ...
Moreover, search treats these characters as their plain letter or digit
counterparts (for example º = o, ª = a, ² = 2),
which produces false search results with too many matches.
This bug was first opened in GNOME Bugzilla (see bug #429985). The issue has
been vetted there, and the conclusion of our investigation is the following:
these characters are replaced because the poppler library applies compatibility
decomposition normalization (NFKC) to character strings.
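The effect described above can be reproduced outside poppler. The following is a minimal sketch in Python using the standard `unicodedata` module, not poppler's own C++ code; the names `ORDINALS` and `nfkc_table` are made up for this illustration. It shows what NFKC turns each of the listed characters into.

```python
import unicodedata

# The characters listed in this report (superscripts and ordinal indicators).
ORDINALS = "ⁿªº¹²³⁴⁵⁶⁷⁸⁹"

def nfkc_table(chars):
    """Map each character to the string NFKC normalization replaces it with."""
    return {c: unicodedata.normalize("NFKC", c) for c in chars}

# Print one line per character: code point, character, NFKC result.
for ch, folded in nfkc_table(ORDINALS).items():
    print(f"U+{ord(ch):04X} {ch} -> {folded}")
```

Once both the document text and the search term have gone through this folding, a search for "º" can no longer be told apart from a search for "o", which is consistent with the extra matches reported by Evince users.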
Here is the solution that I propose: exclude ordinal characters from NFKC
normalization, as done in my patch.
Perhaps it would be better to create a function that excludes a character from
the normalization process, taking its Unicode code point as a parameter. That would
allow easy configuration for users of the poppler library.
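To illustrate the idea of a configurable exclusion set, here is a rough sketch in Python. It is not the attached patch and not poppler's API: `nfkc_except` and `DEFAULT_EXCLUDED` are hypothetical names, and the approach (normalizing the runs of text between excluded characters) is only one possible way to do it.

```python
import unicodedata

# Hypothetical default: the characters listed in this report.
DEFAULT_EXCLUDED = frozenset("ⁿªº¹²³⁴⁵⁶⁷⁸⁹")

def nfkc_except(text, excluded=DEFAULT_EXCLUDED):
    """Apply NFKC to text, but copy characters in `excluded` through unchanged."""
    out = []
    run = []  # characters collected since the last excluded one
    for ch in text:
        if ch in excluded:
            # Normalize the pending run, then keep the excluded character as-is.
            out.append(unicodedata.normalize("NFKC", "".join(run)))
            run.clear()
            out.append(ch)
        else:
            run.append(ch)
    out.append(unicodedata.normalize("NFKC", "".join(run)))
    return "".join(out)

print(nfkc_except("1º ﬁle"))                    # ordinal kept, ligature folded
print(nfkc_except("1º ﬁle", excluded=frozenset()))  # plain NFKC
```

One caveat of this sketch: because each run is normalized separately, a combining mark that directly follows an excluded character is not composed with it, so the result can differ from whole-string normalization in that corner case.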
I hope it helps. Thanks for your excellent work,
Frederic
Gnome Software Contributor
--
You are receiving this mail because:
You are the assignee for the bug.