[Poppler-bugs] [Bug 3188] Pasting tables cells in strange order

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Fri Feb 21 17:01:56 PST 2014


https://bugs.freedesktop.org/show_bug.cgi?id=3188

--- Comment #71 from Brian Ewins <Brian.Ewins at gmail.com> ---
(In reply to comment #70)

(I originally replied on launchpad, which is supposed to copy it through to
here, but it hasn't.)

Carlos: it isn't a regression that lines outside the rectangle formed by the
start and end points are included; it's the intent.

Consider selecting in a document with two columns, starting in the 1st column
2/3 down the page, ending in the 2nd column 1/3 down the page. In this case,
the correct selection consists entirely of lines that lie outside the rectangle
formed by the start and end points (i.e., the bottom 1/3 of the 1st column and the
top 1/3 of the 2nd column).

You get situations like this even for single column text; just choose start and
end points vertically above each other.

The motivation for this patch was that text selection by rectangles is
fundamentally wrong. The correct approach is to reconstruct the reading order
of text; then from two points on the page, find the nearest insertion points
(where an edit cursor would go); swap the insertion points if necessary; then
return the characters between them. The difficulties lie in inferring the
reading order, and determining what 'nearest insertion point' means.
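To make the steps above concrete, here is a minimal sketch in Python. It is not
poppler code: the Glyph type, the select() function and the naive
squared-distance "nearest insertion point" are all assumptions made up for
illustration, and the reading order is simply taken as given (the list order).

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    char: str
    x: float            # left edge of the glyph
    y: float            # vertical position of its line (grows downwards)
    width: float = 1.0  # assumption: every glyph is one unit wide

def nearest_insertion_point(glyphs, px, py):
    """Index in 0..len(glyphs) of the character boundary closest to (px, py).

    Deliberately naive (plain squared distance); choosing a better notion
    of 'nearest' is exactly the hard part discussed in this comment.
    """
    best, best_d = 0, float("inf")
    for i, g in enumerate(glyphs):
        # Each glyph offers two boundaries: before it (i) and after it (i + 1).
        for idx, bx in ((i, g.x), (i + 1, g.x + g.width)):
            d = (bx - px) ** 2 + (g.y - py) ** 2
            if d < best_d:
                best, best_d = idx, d
    return best

def select(glyphs, start, end):
    """Text between the insertion points nearest to two page points."""
    a = nearest_insertion_point(glyphs, *start)
    b = nearest_insertion_point(glyphs, *end)
    if a > b:  # swap so the selection always runs forward in reading order
        a, b = b, a
    return "".join(g.char for g in glyphs[a:b])

def make_column(lines, x0):
    """Hypothetical helper: lay out one column of text, one row per line."""
    return [Glyph(c, x0 + i, float(row))
            for row, text in enumerate(lines)
            for i, c in enumerate(text)]

# Two columns, already in reading order: all of column 1, then column 2.
page = make_column(["ab", "cd", "ef"], 0.0) + make_column(["gh", "ij", "kl"], 10.0)

# Start at the last line of column 1, end at the first line of column 2.
selected = select(page, (0.0, 2.0), (12.0, 0.0))
```

With this toy page the selection is the tail of the first column followed by
the head of the second, and it comes out the same if the two points are given
in the opposite order.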

Clicking inside a word, the nearest insertion point is obvious; it's the
nearest character boundary. Click in a blank area, and it's less clear. In
Breuel's algorithms that I used for determining reading order, there is
something that helps here. There, line width is determined by expanding the
line left and right to fit the column that contains it. So the line 'box'
contains the initial indent if it is the first line of a paragraph, or the
trailing space in the last line; or the ragged space for left- or
right-justified text.
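A small sketch of the effect of that widening, in Python. This is not Breuel's
algorithm itself, only the end result described above; the Box type and
expand_to_column() are hypothetical names, and the column extent is assumed to
be known already.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box:
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x, y):
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1

def expand_to_column(line, column):
    """Widen a tight line box to the column's left and right edges.

    The vertical extent is left untouched, so the widened box now covers
    any indent, trailing space or ragged margin on that line.
    """
    return replace(line, x0=column.x0, x1=column.x1)

column = Box(0.0, 0.0, 100.0, 30.0)
first_line = Box(20.0, 0.0, 100.0, 10.0)  # indented first line of a paragraph
last_line = Box(0.0, 20.0, 40.0, 30.0)    # short last line with trailing space

click_in_indent = (5.0, 5.0)
click_in_trailing_space = (80.0, 25.0)

wide_first = expand_to_column(first_line, column)
wide_last = expand_to_column(last_line, column)
```

Both clicks miss the tight boxes but fall inside the widened ones, which is
what makes a click in blank space resolve to the line beside it.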

Poppler doesn't have columns as such, but blocks instead, and as I recall the
line boxes are the tight bounding box of the words contained in the line. So we
can try to determine insertion point by looking for the nearest block
(horizontally and vertically), then the nearest line (vertically ONLY, so that
we ignore indents/ragged space), then the nearest character (horizontally). I mean
these to be three separate comparisons, discarding block, line and character
candidates at each stage, not some single distance you sum up. The upshot would
be that clicking in blank areas of a line that lie within its block's bounding
box - or even nearby - will choose that line, not the one above or below.
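The three-stage comparison might look roughly like this in Python. Again this
is a sketch and not what poppler does: Block, Line, gap() and
pick_insertion_point() are invented names, glyphs are assumed to be fixed
width, and measuring the block distance as a squared distance to the block's
bounding box is my own assumption about what "nearest block" should mean.

```python
from dataclasses import dataclass

CHAR_WIDTH = 1.0  # assumption: fixed-width glyphs, to keep the sketch short

@dataclass
class Line:
    x0: float   # left edge of the tight line box
    y0: float   # top
    y1: float   # bottom
    text: str

@dataclass
class Block:
    x0: float
    y0: float
    x1: float
    y1: float
    lines: list

def gap(lo, hi, v):
    """Distance from v to the interval [lo, hi]; zero when v lies inside."""
    return max(lo - v, 0.0, v - hi)

def pick_insertion_point(blocks, px, py):
    """Return (block index, line index, character offset) for a click."""
    # Stage 1: nearest block, using both axes.
    bi = min(range(len(blocks)),
             key=lambda i: gap(blocks[i].x0, blocks[i].x1, px) ** 2
                         + gap(blocks[i].y0, blocks[i].y1, py) ** 2)
    lines = blocks[bi].lines
    # Stage 2: nearest line in that block, vertically ONLY, so indents
    # and ragged space do not push the click onto a neighbouring line.
    li = min(range(len(lines)), key=lambda i: gap(lines[i].y0, lines[i].y1, py))
    line = lines[li]
    # Stage 3: nearest character boundary on that line, horizontally only.
    ci = min(range(len(line.text) + 1),
             key=lambda i: abs(line.x0 + i * CHAR_WIDTH - px))
    return bi, li, ci

left = Block(0.0, 0.0, 10.0, 20.0, [
    Line(4.0, 0.0, 10.0, "hello"),        # indented first line
    Line(0.0, 10.0, 20.0, "worldwide!"),
])
right = Block(20.0, 0.0, 30.0, 20.0, [
    Line(20.0, 0.0, 10.0, "right"),
    Line(20.0, 10.0, 20.0, "block"),
])

# A click in the indent of the first line, close to the line below it.
indent_click = pick_insertion_point([left, right], 1.0, 9.0)
```

The click at (1, 9) is closer to the tight box of the second line than to the
first, but because the line stage only looks at the vertical position it still
lands at the start of the indented first line.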

(It's been ages since I looked at the poppler code, so I can't remember if this
heuristic is what the patches already do.)
