[poppler] RFC: whole-page search in the qt4 frontend

Adam Reichold adamreichold at myopera.com
Sat Jun 30 06:24:09 PDT 2012


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

On 30.06.2012 14:52, Albert Astals Cid wrote:
> On Friday, 29 June 2012, at 08:38:03, Adam Reichold wrote:
> 
> On 29.06.2012 08:23, Adam Reichold wrote:
>>>> On 29.06.2012 01:49, Albert Astals Cid wrote:
>>>>> On Thursday, 28 June 2012, at 17:53:45, Adam Reichold wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> If I remember correctly, some time ago someone proposed
>>>>> caching the TextOutputDev/TextPage used in
>>>>> Poppler::Page::search to improve performance. Instead, I
>>>>> would propose to add another search method to Poppler::Page
>>>>> which searches the whole page at once and returns a list of
>>>>> all occurrences.
>>>>> 
>>>>> Applications using the qt4 frontend and this method could
>>>>> then decide whether to cache this information or not. The 
>>>>> implementation of the current search method would not
>>>>> change.
>>>>> 
>>>>> The appended patch does this. But the two search methods
>>>>> share some duplicate code. I am not sure what the best way
>>>>> to avoid this is.
>>>>> 
>>>>>> First concern: QRectF uses float (instead of double) on
>>>>>> some architectures, like ARM, so you are actually losing
>>>>>> precision (that's why the double variant of search()
>>>>>> exists). I'm not sure we should worry about that, but we
>>>>>> probably should. Imagine you get the list of matches with
>>>>>> the search() that returns the list and then try to use it
>>>>>> with the ::search() that accepts a QRectF (though that
>>>>>> doesn't actually make much sense) to get the "next" item.
>>>>>> The float->double conversion will then go wrong and you
>>>>>> might always end up at the same item because of the
>>>>>> truncation.
>>>>>> 
>>>>>> On the other hand, using a list of QRectF is much more
>>>>>> convenient and probably has enough precision for
>>>>>> painting, so maybe we can just document that you should
>>>>>> not use the results of the ::search() that returns a list
>>>>>> as input for the other ::search() variants?
>>>>>> 
>>>>>> Opinions?
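
A minimal sketch of the truncation concern above, in plain C++ without
Qt. The names findNext and throughFloat are hypothetical, and the rule
"first match strictly after the start coordinate" is only an assumed
model of a "next" search, not poppler's actual code. throughFloat mimics
storing a coordinate in a QRectF on a platform where qreal is float.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a "find next" search: returns the index of the
// first match whose left edge lies strictly after `start`, or -1 if there
// is none. This models the idea only; it is not the poppler implementation.
int findNext(const std::vector<double>& matchLefts, double start)
{
    for (std::size_t i = 0; i < matchLefts.size(); ++i) {
        if (matchLefts[i] > start) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Simulates a round trip through a float-based rectangle: the value is
// rounded to float precision and then widened back to double.
double throughFloat(double value)
{
    return static_cast<double>(static_cast<float>(value));
}
```

With match left edges {100.1, 200.1, 300.1}, findNext(lefts, lefts[0])
moves on to index 1, while findNext(lefts, throughFloat(lefts[0]))
returns index 0 again, because on IEEE 754 platforms 100.1 rounded to
float is slightly smaller than the double it came from.
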
>>>> 
>>>> Personally, I think that it would be nicer doing it that
>>>> way. Especially since you will still get a deprecation
>>>> warning if you call ::search with a QRectF as an argument.
>>>> 
>>>> Regards, Adam.
> 
> Updated the patch to include the warning.
> 
>> Pushed to master with a refactor so that most of the "setup" code
>> is shared.
> 
>> To be honest, I have not tested that it works, but it should, and
>> I'll let you confirm that it really does ;-)
> 
>> Cheers, Albert

Tested it and it seems to work as expected. Thanks for doing the
refactoring instead of handing the patch back for another roundtrip.

Best regards, Adam.

>>>>>> Albert
>>>>> 
>>>>> Testing this with some sample files shows large
>>>>> improvements (above 100% as measured by runtime) for
>>>>> searching the whole document and especially for short
>>>>> phrases that occur often.
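
A rough sketch of why a whole-page search can pay off for phrases that
occur often, assuming the single-result search rebuilds the page's text
representation on every call (which the earlier caching proposal
suggests, but which is not verified here). ModelPage, searchNext,
searchAll and searchAllIncrementally are hypothetical names in plain
C++; the extraction counter stands in for the cost of building a
TextPage.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a page that counts how often its text is extracted.
struct ModelPage {
    std::string content;
    mutable int extractions;

    explicit ModelPage(const std::string& text) : content(text), extractions(0) {}

    // Each call stands for one (expensive) rebuild of the page text.
    const std::string& extractText() const
    {
        ++extractions;
        return content;
    }
};

// One result per call: extracts the text again every time it is invoked.
std::size_t searchNext(const ModelPage& page, const std::string& needle, std::size_t from)
{
    return page.extractText().find(needle, from);
}

// Whole-page search: extracts the text once and collects every occurrence.
std::vector<std::size_t> searchAll(const ModelPage& page, const std::string& needle)
{
    std::vector<std::size_t> hits;
    if (needle.empty()) {
        return hits;
    }
    const std::string& text = page.extractText();
    for (std::size_t pos = text.find(needle); pos != std::string::npos;
         pos = text.find(needle, pos + 1)) {
        hits.push_back(pos);
    }
    return hits;
}

// The same result obtained by repeatedly asking for the next occurrence.
std::vector<std::size_t> searchAllIncrementally(const ModelPage& page, const std::string& needle)
{
    std::vector<std::size_t> hits;
    if (needle.empty()) {
        return hits;
    }
    for (std::size_t pos = searchNext(page, needle, 0); pos != std::string::npos;
         pos = searchNext(page, needle, pos + 1)) {
        hits.push_back(pos);
    }
    return hits;
}
```

In this model a page with n occurrences costs one extraction with
searchAll but n + 1 extractions with searchAllIncrementally, so the gap
grows with the number of hits.
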
>>>>> 
>>>>> Thanks for any comments and advice. Best regards, Adam. 
>>>>> _______________________________________________ poppler
>>>>> mailing list poppler at lists.freedesktop.org 
>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler
>>>> 
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJP7v35AAoJEPSSjE3STU34YzEIALFpDdLitNaNBJN2GYRnhNct
3FO4Nf7PhNcJnFrsm74F4o5UdQoTqq4UzqicxMq7ymWtnk9NmBmmeOznsP4G+1Ae
qqyn9Hyq7gP8/O0QvsNdP7UL3YPk2FHdj86i+cDqp4akan3yDajSQo21pdkmf0ib
CoDCeCTpP4Sv4DtBhZQFQxVlfaeessd30dY4kQPEhd3oyi+HH8CwcHLzzBXooxh/
modpLgWSYERb656nZ7j6QMXgrgdu0hE1y0elFP/vdF90zsJcoey/066H0jBVNq10
ays9KmuDkG3Uo1vYaQi60DcAnn5VNYXasKWsDz1LlIOGD7CO7C6jgtNLnlcO99o=
=30xw
-----END PGP SIGNATURE-----

