[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Mar 16 01:02:15 PDT 2015


https://bugs.freedesktop.org/show_bug.cgi?id=89586

            Bug ID: 89586
           Summary: Drivers/DRI/swrast
           Product: Mesa
           Version: git
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev at lists.freedesktop.org
          Reporter: daniel.sebald at ieee.org
        QA Contact: mesa-dev at lists.freedesktop.org

Created attachment 114338
  --> https://bugs.freedesktop.org/attachment.cgi?id=114338&action=edit
Example image with dropped pixels during zoom, creating vertical lines in the image

When plotting large images (images that typically span more than
GL_MAX_TEXTURE_SIZE pixels in the x-direction) with an application using
Mesa/OpenGL, several developers with quite varying versions of Mesa were
seeing vertical white lines spaced at GL_MAX_TEXTURE_SIZE.  I assume that Mesa
displays images in chunks of GL_MAX_TEXTURE_SIZE x GL_MAX_TEXTURE_SIZE or
less, and it seems that, at least in the x-dimension, either the first pixel
or the last pixel of each chunk is being dropped.

I've attached an example gradient image, run through a palette, called
Screenshot-Mesa-VerticalLines.png.  The various colors should make the
vertical lines easily visible.  This sample image has an input buffer which is
40,000 pixels wide, enough to span a couple of GL_MAX_TEXTURE_SIZE (16384)
boundaries.  The image has a pixel scaling factor which is quite small,
approximately 1/100, but that doesn't seem unreasonably small, meaning it's
not in the realm where it should run into numerical issues.

After getting to know the Mesa code a bit and compiling from scratch locally
using PKG_CONFIG_PATH, I eventually tracked down the problem to this hunk of
code (which appears at several different locations in the code, based upon the
type of data being worked on) in the file s_zoom.c:

         for (i = 0; i < zoomedWidth; i++) {
            GLint j = unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - span->x;
            assert(j >= 0);
            assert(j < (GLint) span->end);
            COPY_4V(zoomed.array->attribs[VARYING_SLOT_COL0][i], rgba[j]);
         }

Printing out these variable values on the fly shows that j can be less than
zero (violating one of the assert conditions) when x0 + i = x0 (i.e., i = 0,
the "first" pixel in the zoomed image chunk).  For example,

j=-71, imgX=73, x0=250, i=0, zoom=0.010850
j=-50, imgX=73, x0=428, i=0, zoom=0.010850

But it strikes me that the j < 0 and j >= span->end cases cannot simply be
tossed out via the assert(), because these correspond to an i value of the
mapped output image which is visible.  There has to be something there.
Basically, my thinking is that, approximately, -1/xfactor < j < 0 corresponds
to a valid pixel; it's just that the pixel crosses the memory chunk/block
boundary in the transformation (the code uses the word "chunks", so I will use
chunk).  That is, the pixel for j < 0 should come from the previous
GL_MAX_TEXTURE_SIZE/SWRAST_MAX_WIDTH chunk of data.

I'm going to attach two git changesets to address this issue: a minimal
changeset and a more thorough one.  The first changeset adds just a few lines
to make the current code work and shows where the main bug is.  The second
changeset has a few extra changes, mostly straightforward, but they might seem
confusing when taken as a whole, so I'll attach it in a separate post to keep
things organized.

So I'll explain how I went about this and the thought process behind the
changes.  First, let's verify that this formula is correct:

            GLint j = unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - span->x;

I will refer to 'j' as the "chunk" unzoomed x.  Elsewhere I'll use the OpenGL
nomenclature of 'n' as the pixel-array unzoomed x.  By the OpenGL definition,
when x0 + i = imgX, the value of j (and n, because it is the first chunk)
should be exactly zero.  That condition is the scenario of looking at the
first unzoomed pixel.  I will print out the value as follows:

            GLint j = unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - span->x;
            if (x0 + i == imgX)
               fprintf(stderr, "x0+i = imgX: j=%d\n", j);

Here is the result:

x0+i = imgX: j=0

That's a good result, and it suggests that those values of j < 0 aren't some
bogus offset or some numerical problem.

Now, let's verify that j < 0 is a valid value corresponding to a pixel, by
testing whether it correctly fits the OpenGL standard for zooming a pixel
array, described here:

https://www.opengl.org/sdk/docs/man2/xhtml/glPixelZoom.xml

  xr + n * xfactor <= xz < xr + (n+1) * xfactor  (1)
  yr + m * yfactor <= yz < yr + (m+1) * yfactor  (2)

Rather than list the commands for printing, I'll just state what I'm printing
from this point forward.  I will run the same application and this time print
out the value of j when j < 0, the zoomed pixel value, and the formula shown
in the link above:

j=-71:  spanX=16457, imgX=73, spanX-imgX=16384
          (j+spanX-imgX)*ZoomX <= x0+i-imgX < (j+spanX-imgX+1)*ZoomX
                  176.996048    <=    177    <    177.006897
j=-50:  spanX=32841, imgX=73, spanX-imgX=32768
          (j+spanX-imgX)*ZoomX <= x0+i-imgX < (j+spanX-imgX+1)*ZoomX
                  354.990295    <=    355    <    355.001160

OK, that looks correct to me: spanX-imgX is a multiple of SWRAST_MAX_WIDTH
(check), so we are effectively adding a multiple of SWRAST_MAX_WIDTH to j
(check); I've made sure to subtract imgX from all values because xr is treated
as the origin of the scaling formula (check); and when we apply the
glPixelZoom formulas the center of the pixel in the zoomed image lies right
between the range limits (check).  So I conclude that those are valid pixels,
and when j < 0 the algorithm needs to effectively refer to the previous chunk
of data to get the data value to be displayed in the zoomed pixel array.

There's at least one bug then.

After trying various ideas and not being happy with the cumbersome programming
I was producing (e.g., calling the routine twice to pick up the overlapping
pixels, etc.), I found that the solution is actually rather straightforward.
All that is needed is to compute the bounds for the span-x for-loop more
accurately.  The proper formula for computing the span limits involves a
ceiling, but the current code is doing something akin to a floor, so the loop
limits are off by one, hence the dropped pixel.  To see that ceiling is
correct, I've included an illustration in glpixelzoom_illustration.png
covering a couple of cases: when the zoom rectangle falls between pixel
centers and when it falls on pixel centers.  Remember that the definition
states that pixel centers on the bottom or left edge qualify for inclusion.
(If there is a subtlety here I'm missing, please respond; I'm sure developers
have more experience with this than I have.)

There is another advantage of exactly computing the bounds: the sanity checks
within all the loops are no longer necessary.  Because the zoom/unzoom
operations are essentially linear, or let's say monotonic (which is
sufficient), if the end-point indices are known to be a valid transformation,
then all the indices in between are valid transforms as well.  In the patch,
I've removed all the assert(j >= 0) and assert(j < Xmax) checks.  Works fine.

After making the change above to readjust the x-index limits, the vertical
lines are gone.  See the attached figure
Screenshot-Mesa-VerticalLinesFixed.png.  The right-most edge of the image also
no longer appears to fall short of the black vertical boundary.  Now, it could
be that the border lines aren't always exactly correct; if so, I'm not
concerned with that right now, as it might simply be the application not
programming OpenGL correctly.

Notice that the white artifact along the bottom border is still present.  The
misalignment on the right edge always seems to be either none or too wide.
The misalignment on the bottom edge always seems to be either none or too
short.  I think that has to do with the fact that the x-dimension is using
xfactor > 0 while the y-dimension is using yfactor < 0.

I've tried different input image sizes: 0 < xfactor < 1 (large number of
x-pixels), xfactor > 1 (few x-pixels), -1 < yfactor < 0, and yfactor < -1.
That has run the algorithm through its paces, and I've seen no problems.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

