[poppler] Regarding page search

amit aggarwal amitcs06 at gmail.com
Mon Mar 22 20:00:18 PDT 2010


Thanks for your support.

On Tue, Mar 23, 2010 at 2:00 AM, Albert Astals Cid <aacid at kde.org> wrote:
> I've pushed an overload that accepts doubles instead of a QRectF and that
> should fix your problem.
>
> Albert
>
> On Tuesday, 16 March 2010, you wrote:
>> Hello Albert,
>>
>> Thanks for your reply. How are you?
>>
>> I have one question regarding the word search. This algorithm is
>> O(n^3+K), and if I want to find all the hits on one page it becomes
>> O(n^4+K), which makes the application very slow. Also, every call
>> starts from the top and skips up to the previous hit, which means a
>> lot of comparisons are wasted.
>>
>> So if I write an algorithm that returns all the search hits of one
>> page, will it be acceptable? And can we also improve the performance
>> of this search algorithm?
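A minimal sketch of the "return all hits in one pass" idea, assuming the text of a line is available as a plain std::string. This is not poppler's findText: findAllHits is a hypothetical helper, and a real version would still have to map each offset back to a bounding box.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the start offset of every case-insensitive
// occurrence of `needle` in `line`, scanning the line once from left to
// right instead of restarting from the top for each hit.
std::vector<std::size_t> findAllHits(const std::string &line,
                                     const std::string &needle) {
    assert(!needle.empty());  // an empty search string has no meaningful hits
    std::vector<std::size_t> hits;
    auto lower = [](char c) {
        return std::tolower(static_cast<unsigned char>(c));
    };
    for (std::size_t i = 0; i + needle.size() <= line.size(); ++i) {
        std::size_t k = 0;
        while (k < needle.size() && lower(line[i + k]) == lower(needle[k])) {
            ++k;
        }
        if (k == needle.size()) {
            hits.push_back(i);  // record the hit and keep scanning
        }
    }
    return hits;
}
```

Because every hit is collected during a single scan, the caller does not need to feed the previous hit back in as a start position.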
>>
>> On Tue, Mar 16, 2010 at 5:01 AM, Albert Astals Cid <aacid at kde.org> wrote:
>> > On Monday, 15 March 2010, amit aggarwal wrote:
>> >> Yes, it is a problem with the floating-point comparison in the
>> >> findText algorithm. In my use case it shows up in
>> >>
>> >>       // check: is the line above the top limit?
>> >>       if (!startAtTop &&
>> >>           (backward ? line->yMin > yStart : line->yMin < yStart)) {
>> >>         continue;
>> >>       }
>> >> in the line->yMin < yStart comparison, but the same problem may also
>> >> occur in other comparisons, such as
>> >>
>> >>     // check: is the block above the top limit?
>> >>     if (!startAtTop &&
>> >>         (backward ? blk->yMin > yStart : blk->yMax < yStart)) {
>> >>       continue;
>> >>     }
>> >>
>> >> in blk->yMax < yStart. The same applies to the bottom-limit check.
>> >>
>> >> Please let me know your comments on this.
>> >
>> > Yeah, there is an ugly double<->float conversion hitting you there. Pino
>> > and I are looking for the fix that makes the most sense; give us a few days.
>> >
>> > Albert
>> >
>> >> On Mon, Mar 15, 2010 at 3:52 PM, amit aggarwal <amitcs06 at gmail.com> wrote:
>> >> > Hello Albert,
>> >> >
>> >> > I have analysed this and found that the problem is in the
>> >> > floating-point comparison.
>> >> >
>> >> > The log shows yStart: 226.279999 and line->yMin: 226.280000, where
>> >> > yStart is the previous searchHit result and line->yMin is the new
>> >> > value, 226.28. The effect is that the check
>> >> >
>> >> >      if (!startAtTop &&
>> >> >          (backward ? line->yMin > yStart : line->yMin < yStart)) {
>> >> >        continue;
>> >> >      }
>> >> > in findText in TextOutputDev.cc is bypassed, so it returns the same
>> >> > co-ordinates every time.
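A minimal, self-contained sketch of the rounding described above. It assumes the previous hit passes through a float on its way back into the search (Qt 4 documents qreal, the type QRectF stores, as float on ARM), and it uses a hypothetical isAfterPrevious() rule as a stand-in for the real findText logic. All helper names are illustrative only.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a forward "accept this hit?" rule: a hit is
// taken only if it lies strictly after the previous one in reading order.
bool isAfterPrevious(double hitX, double hitY, double prevX, double prevY) {
    return hitY > prevY || (hitY == prevY && hitX > prevX);
}

// Simulates storing a coordinate in a float and reading it back as a double.
double throughFloat(double v) {
    return static_cast<double>(static_cast<float>(v));
}

// Replays the coordinates of the first hit from the log above.
void demonstrate() {
    const double x = 391.9, y = 226.28;

    // With exact doubles the previous hit is not "after" itself,
    // so the search has to move on to the next occurrence.
    assert(!isAfterPrevious(x, y, x, y));

    // After a float round trip the start point is slightly smaller than
    // the stored double, so the very same hit is accepted again.
    assert(isAfterPrevious(x, y, throughFloat(x), throughFloat(y)));

    std::printf("xStart %f yStart %f\n", throughFloat(x), throughFloat(y));
}
```

Passing the coordinates around as doubles end to end avoids the round trip, so the equality branch of such a rule can behave as intended.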
>> >> >
>> >> > Looking forward to your comments.
>> >> >
>> >> >
>> >> > I am pasting some more logs; maybe they will make the problem
>> >> > clearer.
>> >> >
>> >> > Debug: startSearch**
>> >> > Debug:
>> >> > ***************************searchPageForward***********************
>> >> >
>> >> > Debug: "PdfPageWidget:: Load time: 0.023000 s."
>> >> >
>> >> >  coming here constructor TextPage to make haveLastFind zero
>> >> > startAtLast:0 haveLastFind:0 *xMin: 0.000000,*yMin: 0.000000
>> >> > blk->yMax: 82.285000 yStart :0.000000line->yMin: 69.492000blk->yMax:
>> >> > 272.800000 yStart :0.000000line->yMin: 226.280000
>> >> > Making found true xMin1:391.900000 yMin1:226.280000 blk->yMax:
>> >> > 316.570000 yStart :0.000000line->yMin: 290.984000blk->yMax: 334.285000
>> >> > yStart :0.000000line->yMin: 321.492000blk->yMax: 359.685000 yStart
>> >> >
>> >> > :0.000000line->yMin: 346.892000blk->yMax: 377.685000 yStart
>> >> > :0.000000line->yMin: 364.892000blk->yMax: 425.154000 yStart
>> >> > :0.000000line->yMin: 409.530000blk->yMax: 542.485000 yStart
>> >> > :0.000000line->yMin: 516.892000line->yMin: 529.692000
>> >> >
>> >> > lastFindXMin: 391.900000 lastFindYMin: 226.280000 haveLastFind: 1
>> >> > *xMin 391.900000 *yMin 226.280000
>> >> > Debug: From Top
>> >> > Debug: **********TopPage: 1 qrect: QRectF(391.9,226.28 85.88x46.52)
>> >> > searchText: "Offic"
>> >> >
>> >> >  coming to here constructor TextPage to make haveLastFind zero
>> >> >
>> >> > startAtLast:1 haveLastFind:0 *xMin: 391.899994,*yMin: 226.279999
>> >> > blk->yMax: 82.285000 yStart :226.279999blk->yMax: 272.800000 yStart
>> >> >
>> >> > :226.279999line->yMin: 226.280000
>> >> >
>> >> > Making found true xMin1:391.900000 yMin1:226.280000 blk->yMax:
>> >> > 316.570000 yStart :226.279999line->yMin: 290.984000blk->yMax:
>> >> > 334.285000 yStart :226.279999line->yMin: 321.492000blk->yMax:
>> >> > 359.685000 yStart :226.279999line->yMin: 346.892000blk->yMax:
>> >> > 377.685000 yStart :226.279999line->yMin: 364.892000blk->yMax:
>> >> > 425.154000 yStart :226.279999line->yMin: 409.530000blk->yMax:
>> >> > 542.485000 yStart :226.279999line->yMin: 516.892000line->yMin:
>> >> > 529.692000
>> >> > lastFindXMin: 391.900000 lastFindYMin: 226.280000 haveLastFind: 1
>> >> > *xMin 391.900000 *yMin 226.280000
>> >> > Debug: Poppler1 Next sLeft 391.9 sTop 226.28 sRight 477.78 sBottom
>> >> > 272.8 Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> > 85.88x46.52) searchText: "Offic"
>> >> >
>> >> > On Fri, Mar 12, 2010 at 2:21 PM, Albert Astals Cid <aacid at kde.org> wrote:
>> >> >> On Friday, 12 March 2010, amit aggarwal wrote:
>> >> >>> Hello Albert,
>> >> >>> Thanks for your reply. Please let me know if I can enable logging
>> >> >>> and send you the logs so that you can analyse them. In the
>> >> >>> meantime I am also looking into how to fix the issue.
>> >> >>>
>> >> >>> Please share any knowledge or help you can, because that will
>> >> >>> help me fix this issue quickly.
>> >> >>
>> >> >> I don't have any suggestion besides "debug the same code in the two
>> >> >> machines at the same time and see why it works in one and not in the
>> >> >> other".
>> >> >>
>> >> >> Albert
>> >> >>
>> >> >>> On Fri, Mar 12, 2010 at 1:05 AM, Albert Astals Cid <aacid at kde.org> wrote:
>> >> >>> > On Thursday, 11 March 2010, amit aggarwal wrote:
>> >> >>> >> Hello All,
>> >> >>> >>
>> >> >>> >> According to my analysis, the search always starts from the top,
>> >> >>> >> so every time it gets the first hit and returns the same
>> >> >>> >> co-ordinates.
>> >> >>> >>
>> >> >>> >>     while(mDocument->page(pageindex)->search(
>> >> >>> >>                searchText,
>> >> >>> >>                searchHit,
>> >> >>> >>                Poppler::Page::NextResult,
>> >> >>> >>                Poppler::Page::CaseInsensitive)) {
>> >> >>> >>
>> >> >>> >> So my question is: how can I make it move to the next hit, or, put
>> >> >>> >> another way, how can I stop it from always starting at the top of
>> >> >>> >> the page?
>> >> >>> >>
>> >> >>> >> Looking forward to your help.
>> >> >>> >
>> >> >>> > If it works on x86 and does not work on ARM, it probably means
>> >> >>> > something is overflowing. I can't help you since I don't have any
>> >> >>> > ARM gadget to play with.
>> >> >>> >
>> >> >>> > Sorry,
>> >> >>> >  Albert
>> >> >>> >
>> >> >>> >> On Thu, Mar 11, 2010 at 7:49 AM, amit aggarwal <amitcs06 at gmail.com> wrote:
>> >> >>> >> > Hello Albert,
>> >> >>> >> >
>> >> >>> >> > Thanks for your reply. Yes, that is what I observed: it works
>> >> >>> >> > fine on a normal PC, but when I use the same code on an
>> >> >>> >> > ARM-based machine it returns the same co-ordinates every time.
>> >> >>> >> > It does not move to the next hit even though the page has one.
>> >> >>> >> >
>> >> >>> >> > On Thu, Mar 11, 2010 at 4:15 AM, Albert Astals Cid <aacid at kde.org> wrote:
>> >> >>> >> >> On Wednesday, 10 March 2010, amit aggarwal wrote:
>> >> >>> >> >>> Hello,
>> >> >>> >> >>>
>> >> >>> >> >>> One thing I noticed is that, for the same page and the same
>> >> >>> >> >>> searchText, the algorithm below works fine on x86 but
>> >> >>> >> >>> returns the same co-ordinates on an ARM processor.
>> >> >>> >> >>>
>> >> >>> >> >>> Please help: is there a problem in the page search API, or
>> >> >>> >> >>> am I not using it correctly?
>> >> >>> >> >>
>> >> >>> >> >> You mean the code works on a regular PC but fails on an
>> >> >>> >> >> ARM-based machine?
>> >> >>> >> >>
>> >> >>> >> >> Albert
>> >> >>> >> >>
>> >> >>> >> >>> On Wed, Mar 10, 2010 at 6:03 PM, amit aggarwal <amitcs06 at gmail.com> wrote:
>> >> >>> >> >>> > Hi All,
>> >> >>> >> >>> >
>> >> >>> >> >>> > I am using the page search API in a separate thread to search
>> >> >>> >> >>> > for a word on a particular page. My problem is that it
>> >> >>> >> >>> > returns the same co-ordinates every time, even though the
>> >> >>> >> >>> > page contains more than two search hits, so the while loop
>> >> >>> >> >>> > never ends. Please help me with this and let me know if I
>> >> >>> >> >>> > am doing something wrong.
>> >> >>> >> >>> >
>> >> >>> >> >>> > I am attaching my code snippet and the log as well.
>> >> >>> >> >>> >
>> >> >>> >> >>> >     QRectF searchHit;
>> >> >>> >> >>> >     while(mDocument->page(pageindex)->search(
>> >> >>> >> >>> >                searchText,
>> >> >>> >> >>> >                searchHit,
>> >> >>> >> >>> >                Poppler::Page::NextResult,
>> >> >>> >> >>> >                Poppler::Page::CaseInsensitive)) {
>> >> >>> >> >>> >
>> >> >>> >> >>> >         qDebug() << "**********Page:" << pageIndex+1
>> >> >>> >> >>> >                  << "qrect:" << searchHit
>> >> >>> >> >>> >                  << "searchText:" << searchText;
>> >> >>> >> >>> >         (mResults[page]).append(searchHit);
>> >> >>> >> >>> >     }
>> >> >>> >> >>> >
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
>> >> >>> >> >>> > Debug: **********Page: 1 qrect: QRectF(391.9,226.28
>> >> >>> >> >>> > 110.48x46.52) searchText: "Office"
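Independently of the root cause, a caller can protect itself from the endless loop shown in this log. The sketch below is one possible guard, not part of the poppler API: Hit is a hypothetical stand-in for QRectF, and searchNext is a hypothetical callback wrapping whatever "find next" call is in use.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for QRectF, so the sketch builds without Qt.
struct Hit {
    double left, top, right, bottom;
};

bool sameHit(const Hit &a, const Hit &b) {
    return a.left == b.left && a.top == b.top &&
           a.right == b.right && a.bottom == b.bottom;
}

// Collects hits from `searchNext`, a hypothetical callback that updates the
// hit in place and returns false when nothing more is found. The loop stops
// early if the search reports the same rectangle twice in a row, and never
// stores more than `maxHits` results.
std::vector<Hit> collectHits(const std::function<bool(Hit &)> &searchNext,
                             std::size_t maxHits) {
    assert(maxHits > 0);
    std::vector<Hit> hits;
    Hit hit{0.0, 0.0, 0.0, 0.0};
    while (hits.size() < maxHits && searchNext(hit)) {
        if (!hits.empty() && sameHit(hits.back(), hit)) {
            break;  // no progress: the search returned the previous hit again
        }
        hits.push_back(hit);
    }
    return hits;
}
```

With a search that keeps returning the same rectangle, as in the log above, this returns a single hit instead of looping forever.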
>> >> >>> >> >>> >
>> >> >>> >> >>> >
>> >> >>> >> >>> >
>> >> >>> >> >>> > --
>> >> >>> >> >>> > Thanks
>> >> >>> >> >>> > Amit Aggarwal
>> >> >>> >> >>
>> >> >>> >> >> _______________________________________________
>> >> >>> >> >> poppler mailing list
>> >> >>> >> >> poppler at lists.freedesktop.org
>> >> >>> >> >> http://lists.freedesktop.org/mailman/listinfo/poppler
>> >> >>> >> >
>> >> >>> >> > --
>> >> >>> >> > Thanks
>> >> >>> >> > Amit Aggarwal
>> >> >
>> >> > --
>> >> > Thanks
>> >> > Amit Aggarwal



-- 
Thanks
Amit Aggarwal

