[Poppler-bugs] [Bug 62266] [PATCH] try to detect line breaks in the PDF and insert them in raw mode for pdftotext

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Mar 25 14:16:53 PDT 2013


https://bugs.freedesktop.org/show_bug.cgi?id=62266

--- Comment #10 from Albert Astals Cid <aacid at kde.org> ---
(In reply to comment #9)
> > it is just an assumption that if two characters are separated enough one from the other, there is a space in the middle
> 
> It is more than that. As I said:
> 
> > For example, the current code inserts a new line whenever the next word is detected to not be in the same line as the current word
> 
> The raw text isn't just having spaces added, but it is also getting new
> lines added whenever the vertical space between the current word and the
> next word exceeds the `maxIntraLineDelta` constant.
> 
> My patch is a very small extension of this sort of logic: add an additional
> new line when the vertical space between the current word and next word
> exceeds the `maxLineSpacingDelta` constant.
> 
> I don't think my patch makes any additional assumptions beyond the
> assumptions already made by the code.
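
For readers following along, the rule described in the quoted text can be sketched roughly as below. This is only an illustration, not the patch itself or Poppler's actual code: the `Word` struct, the `joinRaw` function, the constant values, and the scaling of both thresholds by font size are all assumptions made for the example. Only the names `maxIntraLineDelta` and `maxLineSpacingDelta` come from the comment above.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a word in raw (content stream) order.
struct Word {
    std::string text;
    double base;      // baseline y coordinate
    double fontSize;  // font size, used here to scale the thresholds (assumption)
};

// Illustrative values only; the real constants may differ.
constexpr double maxIntraLineDelta = 0.5;
constexpr double maxLineSpacingDelta = 1.5;

// Joins words in the order given, choosing the separator from the
// vertical distance between consecutive baselines.
std::string joinRaw(const std::vector<Word>& words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        out += words[i].text;
        if (i + 1 == words.size()) {
            out += '\n';  // terminate the last line
            break;
        }
        const double dy = std::fabs(words[i + 1].base - words[i].base);
        const double size = words[i].fontSize;
        if (dy < maxIntraLineDelta * size) {
            out += ' ';   // existing behaviour: next word is on the same line
        } else {
            out += '\n';  // existing behaviour: next word starts a new line
            if (dy > maxLineSpacingDelta * size) {
                out += '\n';  // proposed extension: larger gap, extra blank line
            }
        }
    }
    return out;
}
```

With these made-up thresholds and a font size of 10, a baseline jump of 12 units produces a single line break, while a jump of 28 units produces a line break followed by a blank line.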

It may not, but I don't see the need for your patch (you haven't made a case
for it), and more code means more code I need to maintain for the rest of my
life. In my opinion you are trying to use raw order for something that raw order
is not supposed to do. Why are you using raw order instead of the real physical
order?
