[poppler] How to read textbox positions?

Albert Astals Cid aacid at kde.org
Mon Feb 26 18:46:23 UTC 2018


El dilluns, 26 de febrer de 2018, a les 6:47:41 CET, suzuki toshiya va 
escriure:
> Dear Albert,
> 
> Sorry for keeping you troubled about this.
> Have I overlooked any homework assigned to me?

No, i'm just being lazy/busy. I'll process the patch at some point.

Sorry for being late.

Cheers,
  Albert

> Please tell me what made this patch stopped...
> 
> Regards,
> mpsuzuki
> 
> suzuki toshiya wrote:
> > Dear Albert,
> > 
> >> Can you please try adding some documentation in the cpp file like the
> >> other
> >> functions have?
> > 
> > Oh, sorry for overlooking that task. Here it is...
> > 
> > Regards,
> > mpsuzuki
> > 
> > Albert Astals Cid wrote:
> >> El diumenge, 28 de gener de 2018, a les 13:24:45 CET, suzuki toshiya va
> >> 
> >> escriure:
> >>> Dear Albert,
> >>> 
> >>>> As far as i can see we don't use exceptions anywhere in the poppler
> >>>> code,
> >>>> so i would not introduce them "just for this".
> >>> 
> >>> Oh, I'm sorry for my overlooking that. Here is updated patch which
> >>> checks
> >>> out-of-bound value to char_bbox() and return rect(0,0,0,0) if bad index
> >>> is passed. Also I changed the type of the index from int to size_t.
> >> 
> >> Can you please try adding some documentation in the cpp file like the
> >> other
> >> functions have?
> >> 
> >> Cheers,
> >> 
> >>   Albert
> >>> 
> >>> Regards,
> >>> mpsuzuki
> >>> 
> >>> Albert Astals Cid wrote:
> >>>> El dissabte, 27 de gener de 2018, a les 17:17:37 CET, suzuki toshiya va
> >>>> 
> >>>> escriure:
> >>>>> Dear Albert,
> >>>>> 
> >>>>> Oops, I apologize that I overlooked your response to
> >>>>> improve the patch.
> >>>>> 
> >>>>>> You have one text_box where you have 17 char bboxes but the
> >>>>>> ustr.to_utf8().size() is 27, so if you iterate it's easy to get a
> >>>>>> crash
> >>>>>> because you get an out of bounds.
> >>>>> 
> >>>>> Indeed, it might be common scenarios for the
> >>>>> programmers assuming the input as ASCII data but
> >>>>> something different is given.
> >>>>> 
> >>>>>> So it would be nice to make
> >>>>>> rectf     char_bbox(int i) const;
> >>>>>> not crash on an invalid i, i guess return a 0x0 rect
> >>>>>> and then probably add some documentation about it?
> >>>>> 
> >>>>> Checking the features of C++ std::vector, I think
> >>>>> we have 2 options:
> >>>>> 
> >>>>> a) use (as current patch) m_data->char_bbox[i] and
> >>>>> sanitize the i-index before using it (as you proposed),
> >>>>> and the bad values cause 0x0 rectf.
> >>>>> 
> >>>>> b) use m_data->char_bbox.at(i) and let stdlib raise
> >>>>> std::out_of_range error.
> >>>>> 
> >>>>> which is better for C++ programmers?
> >>>> 
> >>>> As far as i can see we don't use exceptions anywhere in the poppler
> >>>> code,
> >>>> so i would not introduce them "just for this".
> >>>> 
> >>>> Cheers,
> >>>> 
> >>>>   Albert
> >>>>   
> >>>>>> FWIW this was a real problem in Okular and i fixed it with
> >>>>>> https://cgit.kde.org/okular.git/commit/generators/poppler/generator_p
> >>>>>> df.
> >>>>>> cp
> >>>>>> p?id=dcf0e78227959a52300d8f253c4b1058b3e81567
> >>>>>> 
> >>>>>> Am i making any sense here?
> >>>>> 
> >>>>> Thank you very much for the important suggestion and
> >>>>> sorry for delayed reponse!
> >>>>> 
> >>>>> Regards,
> >>>>> mpsuzuki
> >>>>> 
> >>>>> Albert Astals Cid wrote:
> >>>>>> El dimecres, 3 de gener de 2018, a les 21:10:06 CET, Adam Reichold va
> >>>> 
> >>>> escriure:
> >>>>>>> Hello mpsuzuki,
> >>>>>>> 
> >>>>>>> I do not think you need the two indirections: Since you want to keep
> >>>>>>> that m_data indirection for ABI compatibility, just return
> >>>>>>> std::vector<text_box> as text_box basically "is" a unique_ptr to its
> >>>>>>> data with the additional getter methods. (You just need to make sure
> >>>>>>> to
> >>>>>>> default the move constructor and assignment operator since no
> >>>>>>> copying
> >>>>>>> implies no moving by default.)
> >>>>>>> 
> >>>>>>> Attached is an update patch along those lines. (I also took the
> >>>>>>> liberty
> >>>>>>> to replace some bare deletes by use of unique_ptr in the
> >>>>>>> implementation
> >>>>>>> of page::text_list as well.)
> >>>>>> 
> >>>>>> One thing that i wanted to improve in the Qt5 frontend but never go
> >>>>>> to,
> >>>>>> maybe we can start here is the fact that the number of boxes is "a
> >>>>>> bit
> >>>>>> random", for example take
> >>>>>> https://bugs.kde.org/attachment.cgi?id=66189
> >>>>>> 
> >>>>>> You have one text_box where you have 17 char bboxes but the
> >>>>>> ustr.to_utf8().size() is 27, so if you iterate it's easy to get a
> >>>>>> crash
> >>>>>> because you get an out of bounds.
> >>>>>> 
> >>>>>> So it would be nice to make
> >>>>>> rectf     char_bbox(int i) const;
> >>>>>> not crash on an invalid i, i guess return a 0x0 rect
> >>>>>> and then probably add some documentation about it?
> >>>>>> 
> >>>>>> FWIW this was a real problem in Okular and i fixed it with
> >>>>>> https://cgit.kde.org/okular.git/commit/generators/poppler/generator_p
> >>>>>> df.
> >>>>>> cp
> >>>>>> p?id=dcf0e78227959a52300d8f253c4b1058b3e81567
> >>>>>> 
> >>>>>> Am i making any sense here?
> >>>>>> 
> >>>>>> Cheers,
> >>>>>> 
> >>>>>>   Albert
> >>>>>>> 
> >>>>>>> Best regards, Adam.
> >>>>>>> 
> >>>>>>> Am 03.01.2018 um 07:38 schrieb suzuki toshiya:
> >>>>>>>> Dear Adam,
> >>>>>>>> 
> >>>>>>>> Thank you for the technical information on the difference of
> >>>>>>>> std::shared_ptr and std::unique_ptr. Just I've tried to replace
> >>>>>>>> the vector of the pointers to that of the unique_ptrs, and
> >>>>>>>> remove the next_xxx methods.
> >>>>>>>> 
> >>>>>>>> I attached the revised patch (gzipped) to the master, and
> >>>>>>>> here I quote the revised part from my last revision, for
> >>>>>>>> convenience to take a look on it.
> >>>>>>>> 
> >>>>>>>> Regards,
> >>>>>>>> mpsuzuki
> >>>>>>>> 
> >>>>>>>> diff --git a/cpp/poppler-private.h b/cpp/poppler-private.h
> >>>>>>>> index 726068a..2cedd1e 100644
> >>>>>>>> --- a/cpp/poppler-private.h
> >>>>>>>> +++ b/cpp/poppler-private.h
> >>>>>>>> @@ -72,12 +72,11 @@ class text_box_data
> >>>>>>>> 
> >>>>>>>>  {
> >>>>>>>>  
> >>>>>>>>  public:
> >>>>>>>>      text_box_data()
> >>>>>>>> 
> >>>>>>>> -    : next_text_box(0), has_space_after(false)
> >>>>>>>> +    : has_space_after(false)
> >>>>>>>> 
> >>>>>>>>      {
> >>>>>>>>      }
> >>>>>>>>      ustring text;
> >>>>>>>>      rectf bbox;
> >>>>>>>> 
> >>>>>>>> -    text_box *next_text_box;
> >>>>>>>> 
> >>>>>>>>      std::vector<rectf> char_bboxes;
> >>>>>>>>      bool has_space_after;
> >>>>>>>>  
> >>>>>>>>  };
> >>>>>>>> 
> >>>>>>>> diff --git a/cpp/poppler-page.h b/cpp/poppler-page.h
> >>>>>>>> index 18af213..3013997 100644
> >>>>>>>> --- a/cpp/poppler-page.h
> >>>>>>>> +++ b/cpp/poppler-page.h
> >>>>>>>> @@ -22,6 +22,8 @@
> >>>>>>>> 
> >>>>>>>>  #include "poppler-global.h"
> >>>>>>>>  #include "poppler-rectangle.h"
> >>>>>>>> 
> >>>>>>>> +#include <memory>
> >>>>>>>> +
> >>>>>>>> 
> >>>>>>>>  namespace poppler
> >>>>>>>>  {
> >>>>>>>> 
> >>>>>>>> @@ -33,12 +35,10 @@ class POPPLER_CPP_EXPORT text_box {
> >>>>>>>> 
> >>>>>>>>  	~text_box();
> >>>>>>>>  	ustring   text() const;
> >>>>>>>>  	rectf     bbox() const;
> >>>>>>>> 
> >>>>>>>> -	text_box *next_text_box() const;
> >>>>>>>> -	text_box *next_word() { return this->next_text_box(); };
> >>>>>>>> 
> >>>>>>>>  	rectf     char_bbox(int i) const;
> >>>>>>>>  	bool      has_space_after() const;
> >>>>>>>>  	
> >>>>>>>>      private:
> >>>>>>>> -	text_box_data* m_data;
> >>>>>>>> +	std::unique_ptr<text_box_data> m_data;
> >>>>>>>> 
> >>>>>>>>  };
> >>>>>>>>  
> >>>>>>>>  class document;
> >>>>>>>> 
> >>>>>>>> @@ -79,7 +79,7 @@ public:
> >>>>>>>>      ustring text(const rectf &rect = rectf()) const;
> >>>>>>>>      ustring text(const rectf &rect, text_layout_enum layout_mode)
> >>>>>>>>      const;
> >>>>>>>> 
> >>>>>>>> -    std::vector<text_box*> text_list(rotation_enum rotation)
> >>>>>>>> const;
> >>>>>>>> +    std::vector<std::unique_ptr<text_box>> text_list(rotation_enum
> >>>>>>>> rotation) const;>
> >>>>>>>> 
> >>>>>>>>  private:
> >>>>>>>>      page(document_private *doc, int index);
> >>>>>>>> 
> >>>>>>>> diff --git a/cpp/poppler-page.cpp b/cpp/poppler-page.cpp
> >>>>>>>> index 9461ab9..5da2e1d 100644
> >>>>>>>> --- a/cpp/poppler-page.cpp
> >>>>>>>> +++ b/cpp/poppler-page.cpp
> >>>>>>>> @@ -291,14 +291,13 @@ ustring page::text(const rectf &r,
> >>>>>>>> text_layout_enum
> >>>>>>>> layout_mode) const
> >>>>>>>> 
> >>>>>>>>   */
> >>>>>>>>  
> >>>>>>>>  text_box::text_box(const ustring& text, const rectf &bbox)
> >>>>>>>>  {
> >>>>>>>> 
> >>>>>>>> -    m_data = new text_box_data();
> >>>>>>>> +    m_data = std::unique_ptr<text_box_data>(new text_box_data());
> >>>>>>>> 
> >>>>>>>>      m_data->text = text;
> >>>>>>>>      m_data->bbox = bbox;
> >>>>>>>>  
> >>>>>>>>  }
> >>>>>>>>  
> >>>>>>>>  text_box::~text_box()
> >>>>>>>>  {
> >>>>>>>> 
> >>>>>>>> -    delete m_data;
> >>>>>>>> 
> >>>>>>>>  }
> >>>>>>>>  
> >>>>>>>>  ustring text_box::text() const
> >>>>>>>> 
> >>>>>>>> @@ -311,11 +310,6 @@ rectf text_box::bbox() const
> >>>>>>>> 
> >>>>>>>>      return m_data->bbox;
> >>>>>>>>  
> >>>>>>>>  }
> >>>>>>>> 
> >>>>>>>> -text_box* text_box::next_text_box() const
> >>>>>>>> -{
> >>>>>>>> -    return m_data->next_text_box;
> >>>>>>>> -}
> >>>>>>>> -
> >>>>>>>> 
> >>>>>>>>  rectf text_box::char_bbox(int i) const
> >>>>>>>>  {
> >>>>>>>>  
> >>>>>>>>      return m_data->char_bboxes[i];
> >>>>>>>> 
> >>>>>>>> @@ -326,10 +320,10 @@ bool text_box::has_space_after() const
> >>>>>>>> 
> >>>>>>>>      return m_data->has_space_after;
> >>>>>>>>  
> >>>>>>>>  }
> >>>>>>>> 
> >>>>>>>> -std::vector<text_box*> page::text_list(rotation_enum rotate) const
> >>>>>>>> +std::vector<std::unique_ptr<text_box>>
> >>>>>>>> page::text_list(rotation_enum
> >>>>>>>> rotate) const>
> >>>>>>>> 
> >>>>>>>>  {
> >>>>>>>>  
> >>>>>>>>      TextOutputDev *output_dev;
> >>>>>>>> 
> >>>>>>>> -    std::vector<text_box*>  output_list;
> >>>>>>>> +    std::vector<std::unique_ptr<text_box>>  output_list;
> >>>>>>>> 
> >>>>>>>>      const int rotation_value = (int)rotate * 90;
> >>>>>>>>      
> >>>>>>>>      /* config values are same with Qt5 Page::TextList() */
> >>>>>>>> 
> >>>>>>>> @@ -366,7 +360,7 @@ std::vector<text_box*>
> >>>>>>>> page::text_list(rotation_enum
> >>>>>>>> rotate) const
> >>>>>>>> 
> >>>>>>>>  	double xMin, yMin, xMax, yMax;
> >>>>>>>>  	word->getBBox(&xMin, &yMin, &xMax, &yMax);
> >>>>>>>> 
> >>>>>>>> -	text_box* tb = new text_box(ustr, rectf(xMin, yMin, xMax-xMin,
> >>>>>>>> yMax-yMin)); +	std::unique_ptr<text_box> tb =
> >>>>>>>> std::unique_ptr<text_box>(new text_box(ustr, rectf(xMin, yMin,
> >>>>>>>> xMax-xMin,
> >>>>>>>> yMax-yMin)));
> >>>>>>>> 
> >>>>>>>>  	tb->m_data->has_space_after = (word->hasSpaceAfter() == gTrue);
> >>>>>>>>  	
> >>>>>>>>  	tb->m_data->char_bboxes.reserve(word->getLength());
> >>>>>>>> 
> >>>>>>>> @@ -375,10 +369,7 @@ std::vector<text_box*>
> >>>>>>>> page::text_list(rotation_enum
> >>>>>>>> rotate) const
> >>>>>>>> 
> >>>>>>>>  	    tb->m_data->char_bboxes.push_back(rectf(xMin, yMin, xMax-
> >> 
> >> xMin,
> >> 
> >>>>>>>>  	    yMax-yMin));
> >>>>>>>>  	
> >>>>>>>>  	}
> >>>>>>>> 
> >>>>>>>> -	if (output_list.size() > 0)
> >>>>>>>> -	    output_list.back()->m_data->next_text_box = tb;
> >>>>>>>> -
> >>>>>>>> -	output_list.push_back(tb);
> >>>>>>>> +	output_list.push_back(std::move(tb));
> >>>>>>>> 
> >>>>>>>>      }
> >>>>>>>>      delete word_list;
> >>>>>>>>      delete output_dev;
> >>>>>>>> 
> >>>>>>>> diff --git a/cpp/tests/poppler-dump.cpp
> >>>>>>>> b/cpp/tests/poppler-dump.cpp
> >>>>>>>> index 09b180d..d26a0ef 100644
> >>>>>>>> --- a/cpp/tests/poppler-dump.cpp
> >>>>>>>> +++ b/cpp/tests/poppler-dump.cpp
> >>>>>>>> @@ -333,7 +333,7 @@ static void print_page_text_list(poppler::page
> >>>>>>>> *p)
> >>>>>>>> 
> >>>>>>>>          std::cout << std::endl;
> >>>>>>>>          return;
> >>>>>>>>      
> >>>>>>>>      }
> >>>>>>>> 
> >>>>>>>> -    std::vector<poppler::text_box*> text_list =
> >>>>>>>> p->text_list(poppler::rotate_0); +
> >>>>>>>> std::vector<std::unique_ptr<poppler::text_box>> text_list =
> >>>>>>>> p->text_list(poppler::rotate_0);
> >>>>>>>> 
> >>>>>>>>      std::cout << "---" << std::endl;
> >>>>>>>>      for (size_t i = 0; i < text_list.size(); i ++) {
> >>>>>>>> 
> >>>>>>>> @@ -343,7 +343,6 @@ static void print_page_text_list(poppler::page
> >>>>>>>> *p)
> >>>>>>>> 
> >>>>>>>>          std::cout << "( x=" << bbox.x() << " y=" << bbox.y() << "
> >>>>>>>>          w="
> >>>>>>>>          <<
> >>>>>>>> 
> >>>>>>>> bbox.width() << " h=" << bbox.height() << " )";
> >>>>>>>> 
> >>>>>>>>          std::cout << std::endl;
> >>>>>>>> 
> >>>>>>>> -        delete text_list[i];
> >>>>>>>> 
> >>>>>>>>      }
> >>>>>>>>      std::cout << "---" << std::endl;
> >>>>>>>>  
> >>>>>>>>  }
> >>>>>>>> 
> >>>>>>>> Adam Reichold wrote:
> >>>>>>>>> Hello mpsuzuki,
> >>>>>>>>> 
> >>>>>>>>> std::shared_ptr was introduced in C++11 and Poppler does use C++11
> >>>>>>>>> only
> >>>>>>>>> since a relatively short time, so I guess there just was no time
> >>>>>>>>> to
> >>>>>>>>> use
> >>>>>>>>> it anywhere.
> >>>>>>>>> 
> >>>>>>>>> In any case it is a pretty heavy-handed solution due to the atomic
> >>>>>>>>> memory accesses on the reference counter which will slow down even
> >>>>>>>>> single-threaded programs which do not need it. Hence I would
> >>>>>>>>> suggest
> >>>>>>>>> starting with std::unique_ptr to get a move-only type.
> >>>>>>>>> 
> >>>>>>>>> Best regards, Adam.
> >>>>>>>>> 
> >>>>>>>>> Am 02.01.2018 um 08:19 schrieb suzuki toshiya:
> >>>>>>>>>> Dear Albert, Adam,
> >>>>>>>>>> 
> >>>>>>>>>> Thank you for the helpful review. I should have more
> >>>>>>>>>> training on C++... Please let me spend to improve the
> >>>>>>>>>> patch.
> >>>>>>>>>> 
> >>>>>>>>>>> (Another easy option albeit with some
> >>>>>>>>>>> overhead to get automatic destruction would be to wrap the
> >>>>>>>>>>> actual
> >>>>>>>>>>> data
> >>>>>>>>>>> inside a std::shared_ptr and hence use reference counting.)
> >>>>>>>>>> 
> >>>>>>>>>> Yeah, from some posts in stackoverflow, I found about
> >>>>>>>>>> shared_ptr... But it seems that current poppler does
> >>>>>>>>>> not use shared_ptr at all. Is this a coding policy?
> >>>>>>>>>> 
> >>>>>>>>>> Regards,
> >>>>>>>>>> mpsuzuki
> >>>>>>>>>> 
> >>>>>>>>>> Adam Reichold wrote:
> >>>>>>>>>>> Hello
> >>>>>>>>>>> 
> >>>>>>>>>>> Am 02.01.2018 um 01:06 schrieb Albert Astals Cid:
> >>>>>>>>>>>> El diumenge, 31 de desembre de 2017, a les 0:45:16 CET, suzuki
> >>>>>>>>>>>> toshiya
> >>>>>>>>>>>> va
> >>>>>>>>>>>> 
> >>>>>>>>>>>> escriure:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> I've tried to implement the suggestion, I attached my current
> >>>>>>>>>>>>> patch.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> As suggested, the most part is just copied from Qt frontend
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>> renamed,
> >>>>>>>>>>>>> except of one point: TextBox.nextWord() looks slightly
> >>>>>>>>>>>>> confusing,
> >>>>>>>>>>>>> because the returned object is a pointer to TextBox. I wrote
> >>>>>>>>>>>>> text_box.next_text_box() and a macro text_box.next_word()
> >>>>>>>>>>>>> which
> >>>>>>>>>>>>> calls next_text_box() internally.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> Another point I want to discuss is the design of the list give
> >>>>>>>>>>>>> by
> >>>>>>>>>>>>> poppler::page::text_list(). In Qt frontend, Page::textList()
> >>>>>>>>>>>>> returns
> >>>>>>>>>>>>> QList<TextBox*>. For similarity, current patch returns
> >>>>>>>>>>>>> std::vector<text_box*> for similarity to Qt frontend.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> But, if we return the vector of pointers, the client should
> >>>>>>>>>>>>> destruct
> >>>>>>>>>>>>> the objects pointed by the vector, before destructing vector
> >>>>>>>>>>>>> itself.
> >>>>>>>>>>>>> Using a vector of text_box (not the pointer but the object
> >>>>>>>>>>>>> itself),
> >>>>>>>>>>>>> like std::vector<text_box>, could be better, because the
> >>>>>>>>>>>>> destructor
> >>>>>>>>>>>>> of the vector would internally call the destructor for
> >>>>>>>>>>>>> text_box
> >>>>>>>>>>>>> object.
> >>>>>>>>>>>>> (Qt has qDeleteAll(), but I think std::vector does not have
> >>>>>>>>>>>>> such).
> >>>>>>>>>>>>> If I'm misunderstanding about C++, please correct.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> First you need to fix the text_box class, you either need to
> >>>>>>>>>>>> make
> >>>>>>>>>>>> it
> >>>>>>>>>>>> uncopiable (and thus making your suggestion not possible) or
> >>>>>>>>>>>> you
> >>>>>>>>>>>> need
> >>>>>>>>>>>> to make the copy and assignment operators not break your class
> >>>>>>>>>>>> since
> >>>>>>>>>>>> at the moment the default operators will just copy the pointer
> >>>>>>>>>>>> and
> >>>>>>>>>>>> everything will break because you will end up with a dangling
> >>>>>>>>>>>> pointer.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> If you go for the second option you'll want the move
> >>>>>>>>>>>> constructor
> >>>>>>>>>>>> and
> >>>>>>>>>>>> move
> >>>>>>>>>>>> operator too.
> >>>>>>>>>>> 
> >>>>>>>>>>> Should the first option also be viable if text_box becomes a
> >>>>>>>>>>> move-only
> >>>>>>>>>>> type, i.e. no copying but move construction and assignment? I
> >>>>>>>>>>> think
> >>>>>>>>>>> std::vector can handle it. (Another easy option albeit with some
> >>>>>>>>>>> overhead to get automatic destruction would be to wrap the
> >>>>>>>>>>> actual
> >>>>>>>>>>> data
> >>>>>>>>>>> inside a std::shared_ptr and hence use reference counting.)
> >>>>>>>>>>> 
> >>>>>>>>>>> Best regards, Adam.
> >>>>>>>>>>> 
> >>>>>>>>>>>> Cheers,
> >>>>>>>>>>>> 
> >>>>>>>>>>>>   Albert
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> mpsuzuki
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> Albert Astals Cid wrote:
> >>>>>>>>>>>>>> El dimecres, 27 de desembre de 2017, a les 12:26:25 CET,
> >>>>>>>>>>>>>> Jeroen
> >>>>>>>>>>>>>> Ooms
> >>>>>>>>>>>>>> va
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> escriure:
> >>>>>>>>>>>>>>> Is there a method in poppler-cpp to extract text from a pdf
> >>>>>>>>>>>>>>> document,
> >>>>>>>>>>>>>>> including the position of each text box? Currently we use
> >>>>>>>>>>>>>>> page->text()
> >>>>>>>>>>>>>>> with page::physical_layout which gives all text per page,
> >>>>>>>>>>>>>>> but I
> >>>>>>>>>>>>>>> need
> >>>>>>>>>>>>>>> more detailed information about each text box per page.
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> You want to code the variant of qt5 frontend
> >>>>>>>>>>>>>> Poppler::Page::textList() for
> >>>>>>>>>>>>>> cpp frontend, it shouldn't be that hard getting inspiration
> >>>>>>>>>>>>>> (i.e.
> >>>>>>>>>>>>>> almost-copying) the code, do you have time for it?
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>   Albert
> >>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>>> poppler mailing list
> >>>>>>>>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/poppler
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>>> poppler mailing list
> >>>>>>>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/poppler
> >>>>>>>>>>>> 
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> poppler mailing list
> >>>>>>>>>>>> poppler at lists.freedesktop.org
> >>>>>>>>>>>> https://lists.freedesktop.org/mailman/listinfo/poppler
> >> 
> >> _______________________________________________
> >> poppler mailing list
> >> poppler at lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/poppler






More information about the poppler mailing list