[poppler] [Poppler] Bug in your text matching routine

James Cloos cloos+fd-poppler at jhcloos.com
Wed Aug 29 23:23:58 PDT 2007


>>>>> "Ed" == Ed Catmur <ed at catmur.co.uk> writes:

Ed> Question: where do we want to draw the match box when a
Ed> search /partially/ matches a compatibility decomposition?

Ed> 1. at the end of the compatibility character
Ed> 2. exactly halfway through the compatibility character
Ed> 3. as far through the compatibility character as the matched part
Ed> of its decomposition reaches (e.g. 2/3 through when matching 'ff'
Ed> of FFI LIGATURE)

Ed> 3. seems the most elegant, but could be a little complex to implement
Ed> and may not always be the right solution (RTL, zero-width characters, etc.)

I'd vote for getting option 1 in for now, and only then spending any
time on implementing option 3.  Option 1 may even be the better choice
overall.
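
To make the three options concrete, here is a minimal sketch in Python.
It is not poppler code: match_box_end() and its parameters are
hypothetical names, and it assumes left-to-right text, a single glyph
with a known left edge and advance width, and a match that starts at
the beginning of the glyph's compatibility decomposition.

```python
import unicodedata


def match_box_end(x0, width, ch, matched_len, option):
    """Return the x coordinate where the match box ends inside glyph `ch`.

    x0, width   -- hypothetical left edge and advance width of the glyph
    ch          -- the compatibility character, e.g. U+FB03 (ffi ligature)
    matched_len -- number of decomposed characters covered by the match
    option      -- 1, 2 or 3, numbered as in Ed's list
    """
    decomposed = unicodedata.normalize("NFKD", ch)
    if not 0 < matched_len <= len(decomposed):
        raise ValueError("matched_len must lie within the decomposition")
    # A full match covers the whole glyph, as does option 1.
    if matched_len == len(decomposed) or option == 1:
        return x0 + width
    if option == 2:
        # Exactly halfway through the glyph, whatever matched.
        return x0 + width / 2
    if option == 3:
        # Proportional to the matched share of the decomposition.
        return x0 + width * matched_len / len(decomposed)
    raise ValueError("option must be 1, 2 or 3")
```

For 'ff' matched against the ffi ligature in a glyph 9 units wide
starting at x = 0, option 1 ends the box at 9, option 2 at 4.5 and
option 3 at 6 (2/3 through).  The proportional rule assumes every
decomposed character takes an equal share of the glyph, which is
exactly where RTL and zero-width characters get awkward.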

As you say, option 3 will be quite complex for scripts which require
shaping engines, and for syllable-per-glyph scripts like Hangeul if
you allow searching for syllable components.

With some of the scripts you would even need disjoint match boxes.

Even in cases where the syllable block isn't a single glyph it might be
better to highlight the whole thing rather than just the matched pieces.
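
A rough Python sketch of the "highlight the whole thing" approach,
again not poppler code.  whole_glyph_boxes() and the (text, x0, x1)
glyph tuples are hypothetical; the sketch assumes the query is given
in conjoining jamo and uses canonical decomposition (NFD) to split
precomposed Hangeul syllables into their components.

```python
import unicodedata


def whole_glyph_boxes(glyphs, query):
    """Return the full boxes of every glyph touched by the first match.

    glyphs -- list of (text, x0, x1) tuples, one per glyph (hypothetical)
    query  -- search string; may contain bare syllable components
    """
    needle = unicodedata.normalize("NFD", query)
    haystack = []
    owner = []  # owner[i] is the index of the glyph that produced haystack[i]
    for i, (text, _x0, _x1) in enumerate(glyphs):
        for c in unicodedata.normalize("NFD", text):
            haystack.append(c)
            owner.append(i)
    start = "".join(haystack).find(needle)
    if not needle or start < 0:
        return []
    # Every glyph contributing at least one matched component is
    # highlighted in full, so no sub-glyph geometry is needed.
    hit = sorted(set(owner[start:start + len(needle)]))
    return [(glyphs[i][1], glyphs[i][2]) for i in hit]


# Two precomposed syllables, U+D55C and U+AE00, each 10 units wide.
glyphs = [("\ud55c", 0, 10), ("\uae00", 10, 20)]
# Vowel + final consonant of the first syllable: one whole box.
print(whole_glyph_boxes(glyphs, "\u1161\u11ab"))
# Final of the first syllable + initial of the second: both boxes.
print(whole_glyph_boxes(glyphs, "\u11ab\u1100"))
```

This sidesteps the disjoint-box problem entirely, at the cost of
highlighting more than was literally matched.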

-JimC
-- 
James Cloos <cloos at jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

