[Mesa-dev] [Bug 99817] [softpipe] piglit glsl-fs-tan-1 regression

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Feb 15 23:18:37 UTC 2017


Vinson Lee <vlee at freedesktop.org> changed:

           What    |Removed                     |Added
           Keywords|                            |bisected
                 CC|                            |idr at freedesktop.org

--- Comment #3 from Vinson Lee <vlee at freedesktop.org> ---
e9ffd12827ac11a2d2002a42fa8eb1df847153ba is the first bad commit
commit e9ffd12827ac11a2d2002a42fa8eb1df847153ba
Author: Francisco Jerez <currojerez at riseup.net>
Date:   Sat Jan 21 13:41:08 2017 -0800

    glsl: Rewrite atan2 implementation to fix accuracy and handling of

    This addresses several issues of the current atan2 implementation:

     - Negative zero (and negative denorms which end up getting flushed to
       zero) isn't handled correctly by the current implementation.  The
       reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
       on which side of the branch cut the argument is, which causes us to
       return incorrect results (off by up to 2π) for very small negative

     - There is a serious precision problem for x values of large enough
       magnitude introduced by the floating point division operation being
       implemented as a mul+rcp sequence.  This can lead to the quotient
       getting flushed to zero in some cases, introducing an error of over
       8e6 ULP in the result, or in the most catastrophic case causing us
       to return NaN instead of the correct value ±π/2 for y=±∞ and x
       very large.  We can fix this easily by scaling down both
       arguments when the absolute value of the denominator goes above a
       certain threshold.  The error of this atan2 implementation remains
       below 25 ULP in most of its domain except for a neighborhood of y=0
       where it reaches a maximum error of about 180 ULP.

     - It emits a bunch of instructions including no fewer than three
       if-else branches per scalar component that don't seem to get
       optimized out later on.  This implementation uses about 13% fewer
       instructions on Intel SKL hardware and doesn't emit any control
       flow instructions.
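
Editorial illustration (not part of the commit): the negative-zero issue in
the first item can be reproduced outside of GLSL. The sketch below is plain
Python, not Mesa code; the function names are hypothetical and only mimic
the 'y >= 0' and 'x < 0' tests mentioned above. It assumes IEEE 754
semantics, where -0.0 compares equal to 0.0 but still carries a sign bit.

```python
import math

def naive_side_of_cut(y, x):
    """Choose the branch-cut offset with ordinary comparisons.

    Hypothetical helper: it mimics the 'y >= 0' / 'x < 0' tests described
    in the commit message, it is not the actual Mesa implementation.
    """
    if x < 0:
        return math.pi if y >= 0 else -math.pi
    return 0.0

def signbit_side_of_cut(y, x):
    """Same decision, but taken from the sign bit, so -0.0 counts as negative."""
    if x < 0:
        return math.copysign(math.pi, y)
    return 0.0

y, x = -0.0, -1.0
print(naive_side_of_cut(y, x))    # comparison treats -0.0 as non-negative
print(signbit_side_of_cut(y, x))  # sign bit keeps the argument below the cut
print(math.atan2(y, x))           # reference value from the math library
```

For y = -0.0 and x = -1.0 the comparison-based variant lands on the wrong
side of the branch cut, so it differs from the reference by 2π.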

    v2: Fix up argument scaling to take into account the range and
        precision of exotic FP24 hardware.  Flip coordinate system for
        arguments along the vertical line as if they were on the left
        half-plane in order to avoid division by zero which may give
        unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
        some more comments.

    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

:040000 040000 acb11a161d3a4b78c246efd2d3720e8d66c8772a
89658b9bde1aa03b542528f4cae464516e8db300 M      src
bisect run success
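
For anyone following along, the mul+rcp precision problem described in the
commit message can be sketched in a few lines. This is a rough Python
simulation, not the Mesa code: `f32`, `div_mul_rcp`, `div_scaled`,
`THRESHOLD` and `SCALE` are hypothetical names and values chosen for
illustration, and the sketch assumes hardware that flushes float32
denormals to zero.

```python
import math
import struct

FLT_MIN = 2.0 ** -126   # smallest normal float32
THRESHOLD = 2.0 ** 64   # hypothetical cut-off, not the value used in Mesa
SCALE = 2.0 ** -64      # hypothetical scale factor, not the value used in Mesa

def f32(v):
    """Round to float32 and flush denormals to zero (simulated FTZ hardware)."""
    if math.isnan(v) or math.isinf(v):
        return v
    try:
        r = struct.unpack("<f", struct.pack("<f", v))[0]
    except OverflowError:
        return math.copysign(math.inf, v)
    if abs(r) < FLT_MIN:
        return math.copysign(0.0, r)
    return r

def div_mul_rcp(y, x):
    """Division implemented as y * (1 / x), as in a mul+rcp sequence."""
    return f32(y * f32(1.0 / x))

def div_scaled(y, x):
    """Scale both arguments down first when |x| is large, then divide."""
    if abs(x) >= THRESHOLD:
        y, x = f32(y * SCALE), f32(x * SCALE)
    return div_mul_rcp(y, x)

big = 2.0 ** 127
print(div_mul_rcp(big, big))                 # reciprocal is flushed, quotient lost
print(div_scaled(big, big))                  # quotient survives after scaling
print(math.atan(div_mul_rcp(math.inf, big))) # inf * 0 poisons the result
print(math.atan(div_scaled(math.inf, big)))  # scaled version stays finite in x
```

With x = 2^127 the reciprocal is a float32 denormal, so the unscaled
quotient collapses to zero, and with y = ∞ the product ∞ * 0 yields NaN;
scaling both arguments first keeps the reciprocal in the normal range.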

You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.