[Mesa-dev] FPE Invalid Operation

Bram Stolk b.stolk at gmail.com
Mon Sep 5 18:32:00 UTC 2016


>If you're able to, make a local build, adding -O0 to cflags, and don't
>strip the debug info. That should show you exactly where the issue
>lies.

I built mesa 12.0.2 using:
~/src/mesa-12.0.2$ CFLAGS=-O0 ./configure --prefix=$HOME --disable-dri3

If I LD_PRELOAD the resulting libGL.so, the crash goes away and the
version is reported correctly:
GL_VERSION  3.3 (Core Profile) Mesa 12.0.2

So maybe it is an old bug that affects 11.2.0 but not 12.0.2?
I then built the same version that ships with Ubuntu (11.2.0).

When LD_PRELOAD-ing that library, I get:

Thread 1 "game" received signal SIGFPE, Arithmetic exception.
0x00007fffeecb48e4 in gen7_upload_urb (brw=0x27419b8) at gen7_urb.c:301
301             round(hs_wants * (((double) remaining_space) / total_wants));
(gdb) where
#0  0x00007fffeecb48e4 in gen7_upload_urb (brw=0x27419b8) at gen7_urb.c:301
#1  0x00007fffeec95c8c in check_and_emit_atom (brw=0x27419b8, state=0x7ffffffe5500, atom=0x2767860) at brw_state_upload.c:771
#2  0x00007fffeec961ea in brw_upload_pipeline_state (brw=0x27419b8, pipeline=BRW_RENDER_PIPELINE) at brw_state_upload.c:882
#3  0x00007fffeec962e4 in brw_upload_render_state (brw=0x27419b8) at brw_state_upload.c:904
#4  0x00007fffeec76362 in brw_try_draw_prims (ctx=0x27419b8, arrays=0x270d010, prims=0x7ffffffe56c0, nr_prims=1, ib=0x0, min_index=0, max_index=2, indirect=0x0) at brw_draw.c:560
#5  0x00007fffeec76785 in brw_draw_prims (ctx=0x27419b8, prims=0x7ffffffe56c0, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=2, unused_tfb_object=0x0, stream=0, indirect=0x0) at brw_draw.c:650
#6  0x00007fffeec7fcf5 in brw_draw_rectlist (brw=0x27419b8, rect=0x7ffffffe57b0, num_instances=1) at brw_meta_fast_clear.c:219
#7  0x00007fffeec80e01 in brw_meta_fast_clear (brw=0x27419b8, fb=0x2722e00, buffers=34, partial_clear=false) at brw_meta_fast_clear.c:777
#8  0x00007fffeec63e23 in brw_clear (ctx=0x27419b8, mask=34) at brw_clear.c:244
#9  0x00007fffee842f66 in _mesa_Clear (mask=17664) at main/clear.c:224
#10 0x00007ffff7ba43fa in glClear (mask=17664) at glapi/glapi_mapi_tmp.h:3068
#11 0x0000000000452d4d in Gfx::TargetClear (r=0.0100228256, g=0.0100228256, b=0.039681904, a=1, depth=1, stencil=0) at ../graphics/graphics.cpp:588
#12 0x000000000040a739 in Vis::Update (this=0x7ffffffe5d58, elapsed=1.70000007e-07, numsteps=1, dt=0.00416666688, wld=0x7ffffffe91b0) at vis.cpp:511
#13 0x0000000000428462 in main (argc=1, argv=0x7fffffffdd78) at main.cpp:106
(gdb) l
296          vs_chunks += vs_additional;
297          remaining_space -= vs_additional;
298          total_wants -= vs_wants;
299
300          unsigned hs_additional = (unsigned)
301             round(hs_wants * (((double) remaining_space) / total_wants));
302          hs_chunks += hs_additional;
303          remaining_space -= hs_additional;
304          total_wants -= hs_wants;
305
(gdb) print total_wants
$1 = 0
(gdb)

Even though 12.0.2 seems unaffected, Ubuntu ships 11.2.0, so maybe
fixing the 11.x tree still makes sense?

I'm also puzzled that this division by zero depends on the values I pass to
glClearColor before the glClear command, though.

Bram

PS: Just a side note: when I tried re-configuring (after building) my
11.2.0 tree, it choked on Python Mako. I had to delete the entire source
tree and unpack it again before I could re-configure.


On Mon, Sep 5, 2016 at 10:19 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:

> On Mon, Sep 5, 2016 at 12:55 PM, Bram Stolk <b.stolk at gmail.com> wrote:
> > Hey,
> >
> > Do you guys test with Floating-Point-Exceptions enabled?
>
> I think most people just test with whatever is the default.
>
> > Because on Intel-Haswell, I can trigger a FPE with a simple
> > glClearColor and glClear.
> > I think glClear is doing bad FP math on Haswell.
> >
> > https://software.intel.com/en-us/forums/graphics-driver-bug-reporting/topic/681580
>
> Unfortunately you don't have any debug info for your mesa build.
>
> >
> > I recommend doing a:
> >
> > feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );
> >
> > In your test-suite.
>
> I did that but was unable to trigger the error on a i7-920 CPU and an
> NVIDIA GPU (driven by nouveau) in some simple tests. Perhaps a change
> in how HSW reports errors, or could be a HSW- (or gen7+) specific code
> path.
>
> If you're able to, make a local build, adding -O0 to cflags, and don't
> strip the debug info. That should show you exactly where the issue
> lies.
>
> Separately, if you can make a small program that reproduces the issue
> for you (e.g. with glut + epoxy), that would help others test and
> figure out what's going on.
>
>   -ilia
>
> P.S. for reference, this is the (whitespace-damaged) patch I applied
> to piglit for my testing:
>
> diff --git a/tests/util/piglit-framework-gl.h b/tests/util/piglit-framework-gl.h
> index 81c1a5e..1510fc1 100644
> --- a/tests/util/piglit-framework-gl.h
> +++ b/tests/util/piglit-framework-gl.h
> @@ -235,6 +235,7 @@ void
>  piglit_gl_test_run(int argc, char *argv[],
>                    const struct piglit_gl_test_config *config);
>
> +#include <fenv.h>
>  #ifdef __cplusplus
>  #  define PIGLIT_EXTERN_C_BEGIN extern "C" {
>  #  define PIGLIT_EXTERN_C_END   }
> @@ -263,6 +264,7 @@ piglit_gl_test_run(int argc, char *argv[],
>                  piglit_disable_error_message_boxes();                      \
>                                                                             \
>                  piglit_gl_test_config_init(&config);                       \
> +               feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW ); \
>                                                                             \
>                  config.init = piglit_init;                                 \
>                  config.display = piglit_display;                           \
>



-- 
Owner/Director of Game Studio Abraham Stolk Inc.
Vancouver BC, Canada
b.stolk at gmail.com