[poppler] RFC: whole-page search in the qt4 frontend
Adam Reichold
adamreichold at myopera.com
Sat Jun 30 06:24:09 PDT 2012
Hello,
On 30.06.2012 14:52, Albert Astals Cid wrote:
> On Friday, 29 June 2012, at 08:38:03, Adam Reichold wrote:
>
> On 29.06.2012 08:23, Adam Reichold wrote:
>>>> On 29.06.2012 01:49, Albert Astals Cid wrote:
>>>>> On Thursday, 28 June 2012, at 17:53:45, Adam Reichold wrote:
>>>>> Hello,
>>>>>
>>>>> If I remember correctly, some time ago someone proposed
>>>>> caching the TextOutputDev/TextPage used in
>>>>> Poppler::Page::search to improve performance. Instead, I
>>>>> would propose to add another search method to Poppler::Page
>>>>> which searches the whole page at once and returns a list of
>>>>> all occurrences.
>>>>>
>>>>> Applications using the qt4 frontend and this method could
>>>>> then decide whether to cache this information or not. The
>>>>> implementation of the current search method would not
>>>>> change.
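
As a rough illustration of why a whole-page method can help, the sketch below models a page whose text extraction is the expensive step. It assumes that each call to a "find next" style search repeats that extraction, which the caching proposal above suggests but this thread does not confirm. The `Page` type, `extractText`, `searchNext`, and `searchAll` are hypothetical names and are not the Poppler API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical page model, not Poppler: text extraction stands in for
// building the text page, and a counter records how often it runs.
struct Page {
    std::string content;
    mutable int extractions = 0;

    explicit Page(std::string text) : content(std::move(text)) {}

    // Stand-in for the expensive step that is repeated per search call.
    std::string extractText() const {
        ++extractions;
        return content;
    }

    // "Find next" style: every call extracts the text again.
    std::size_t searchNext(const std::string& needle, std::size_t from) const {
        return extractText().find(needle, from);
    }

    // Whole-page style: extract once, then collect every occurrence.
    std::vector<std::size_t> searchAll(const std::string& needle) const {
        assert(!needle.empty());  // an empty needle would match everywhere
        std::vector<std::size_t> hits;
        const std::string text = extractText();
        for (std::size_t pos = text.find(needle);
             pos != std::string::npos;
             pos = text.find(needle, pos + 1)) {
            hits.push_back(pos);
        }
        return hits;
    }
};
```

In this model, collecting all matches with `searchAll` costs one extraction, while walking the same matches with `searchNext` costs one extraction per call, including the final call that finds nothing. The caller can keep the returned list for as long as it likes, which is the caching decision left to the application.
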
>>>>>
>>>>> The appended patch does this, but the two search methods
>>>>> duplicate some code, and I am not sure how best to avoid
>>>>> that.
>>>>>
>>>>>> First concern: QRectF uses float (instead of double) on
>>>>>> some architectures, like ARM, so you are actually losing
>>>>>> precision (that's why the double variant of search()
>>>>>> exists). I'm not sure we should worry about that, but we
>>>>>> probably should. Imagine you get the list of matches with
>>>>>> the search() that returns the list and then try to use it
>>>>>> with the ::search() that accepts a QRectF (though that
>>>>>> doesn't actually make much sense) to get the "next" item.
>>>>>> The float->double conversion will go wrong and you might
>>>>>> always end up at the same item because of the truncation.
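
The concern above can be sketched with a deliberately simplified model. Each match is reduced to the x coordinate of its left edge, and "find next" returns the first match strictly to the right of the position passed in. This is an assumption made for illustration and not how Poppler compares rectangles. `findNext` and `throughFloat` are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model, not the Poppler API: each match is represented by the
// double-precision x coordinate of its left edge.

// Returns the first match strictly to the right of `after`, or -1.0 if there
// is none. `matches` must be sorted in ascending order.
double findNext(const std::vector<double>& matches, double after) {
    assert(std::is_sorted(matches.begin(), matches.end()));
    for (double m : matches) {
        if (m > after) {
            return m;
        }
    }
    return -1.0;
}

// Simulates storing a coordinate in a float-based rectangle and reading it
// back as a double, as would happen where QRectF uses float.
double throughFloat(double value) {
    return static_cast<double>(static_cast<float>(value));
}
```

With matches at 0.7 and 1.5, passing the double value 0.7 back into `findNext` advances to 1.5. Passing `throughFloat(0.7)` instead returns 0.7 again, because the nearest float to 0.7 is slightly below it, so the search never moves past the first match.
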
>>>>>>
>>>>>> On the other hand, using a list of QRectF is much more
>>>>>> convenient and probably has enough precision for
>>>>>> painting, so maybe we can just document that you should
>>>>>> not use the results of the ::search() that returns a list
>>>>>> as input for the other ::search() variants?
>>>>>>
>>>>>> Opinions?
>>>>
>>>> Personally, I think that it would be nicer doing it that
>>>> way, especially since you will still get a deprecation
>>>> warning if you call ::search with a QRectF as an argument.
>>>>
>>>> Regards, Adam.
>
> Updated the patch to include the warning.
>
>> Pushed to master with a refactor so that most of the "setup" code
>> is shared.
>
>> To be honest, I have not tested that it works, but it should,
>> and I'll let you confirm that it really does ;-)
>
>> Cheers, Albert
Tested it and it seems to work as expected. Thanks for doing the
refactoring instead of handing the patch back for another roundtrip.
Best regards, Adam.
>>>>>> Albert
>>>>>
>>>>> Testing this with some sample files shows large
>>>>> improvements (above 100% as measured by runtime) for
>>>>> searching the whole document and especially for short
>>>>> phrases that occur often.
>>>>>
>>>>> Thanks for any comments and advice. Best regards, Adam.
>>>>> _______________________________________________ poppler
>>>>> mailing list poppler at lists.freedesktop.org
>>>>> http://lists.freedesktop.org/mailman/listinfo/poppler