[poppler] [Poppler] Bug in your text matching routine
Ed Catmur
ed at catmur.co.uk
Wed Aug 29 14:27:53 PDT 2007
On Mon, 2007-08-27 at 20:38 +0200, Albert Astals Cid wrote:
> On Monday 27 August 2007, Ed Catmur wrote:
> > On Sun, 2007-08-26 at 21:55 +0200, Albert Astals Cid wrote:
> > > The problem is, that searching "a" in the attached document (that only
> > > contains "ä") returns true but the returned container rectangle is 0
> > > pixels width.
> > Oops. Stupid error, patch attached.
> That was fast! Thanks a lot. You rock :-)
Unfortunately that's not the end of the story.
Create a pdf containing the following (OOo Writer works well):
Offler's offer of offices offended.
(Note the use of ff, ffi, ffl ligatures; this also happens with
fractions (½), etc.)
Now search for of, off, offi, offl, f, ff, ffi, ffl etc.
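To see why these searches only partially match, here is a minimal sketch in Python (standard library only; this is not poppler code, and the helper names are made up for illustration). It expands a compatibility character with NFKD and checks whether a search term covers only part of the expansion:

```python
import unicodedata

def compat_decomposition(ch: str) -> str:
    """Return the NFKD (compatibility) decomposition of one character."""
    return unicodedata.normalize("NFKD", ch)

def is_partial_match(ch: str, term: str) -> bool:
    """True if term occurs inside the decomposition of ch without covering all of it.

    Hypothetical helper: it only looks at a single character, so matches that
    start before or continue after the character are not considered here.
    """
    expanded = compat_decomposition(ch)
    return term in expanded and len(term) < len(expanded)

if __name__ == "__main__":
    # U+FB00 ff, U+FB03 ffi, U+FB04 ffl ligatures and U+00BD one-half fraction
    for ch in ("\ufb00", "\ufb03", "\ufb04", "\u00bd"):
        print(unicodedata.name(ch), "->", repr(compat_decomposition(ch)))
    for term in ("f", "ff", "ffi"):
        print(term, is_partial_match("\ufb03", term))
```

Searching for 'f' or 'ff' hits only part of the single glyph for the ffi ligature, which is the case the question below is about.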
Question: where do we want to draw the match box when a
search /partially/ matches a compatibility decomposition? The current
code evidently draws it at the start of the compatibility character;
other options I've thought of (in order of increasing complexity) are:
1. at the end of the compatibility character
2. exactly halfway through the compatibility character
3. proportionally through the compatibility character, as far as the
match reaches into its compatibility decomposition (e.g. 2/3 through
when matching 'ff' of FFI LIGATURE)
Option 3 seems the most elegant, but it could be a little complex to
implement and may not always be the right solution (RTL text,
zero-width characters, etc.)
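A rough sketch of the candidate placements, in Python for brevity rather than as poppler code. The function name, the strategy labels, and the assumption of a left-to-right glyph with a known horizontal extent are all mine, not anything in the current source:

```python
def edge_in_compat_char(x_min: float, x_max: float,
                        matched: int, total: int,
                        strategy: str) -> float:
    """Return the x position of the match-box edge inside a compatibility character.

    x_min, x_max: horizontal extent of the glyph (assumed LTR, x_min <= x_max).
    matched:      how many characters of the decomposition the match covers.
    total:        length of the full compatibility decomposition.
    strategy:     "start" (current behaviour as described above), "end" (option 1),
                  "half" (option 2) or "proportional" (option 3).
    """
    if total <= 0 or not 0 <= matched <= total:
        raise ValueError("matched must lie between 0 and total, and total must be positive")
    width = x_max - x_min
    if strategy == "start":
        return x_min
    if strategy == "end":
        return x_max
    if strategy == "half":
        return x_min + width / 2
    if strategy == "proportional":
        return x_min + width * matched / total
    raise ValueError(f"unknown strategy: {strategy}")

if __name__ == "__main__":
    # Matching 'ff' (2 of 3 characters) of an ffi ligature drawn from x=0 to x=30.
    for strategy in ("start", "end", "half", "proportional"):
        print(strategy, edge_in_compat_char(0.0, 30.0, 2, 3, strategy))
```

The proportional variant assumes every character of the decomposition takes an equal share of the glyph width, which is exactly where RTL text and zero-width characters would need separate handling.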
Thoughts?
Ed