[Bug 98602] Data races when rendering from multiple threads

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Nov 5 19:12:25 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=98602

            Bug ID: 98602
           Summary: Data races when rendering from multiple threads
           Product: Mesa
           Version: 13.0
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/DRI/i965
          Assignee: intel-3d-bugs at lists.freedesktop.org
          Reporter: sgunderson at bigfoot.com
        QA Contact: intel-3d-bugs at lists.freedesktop.org

Hi,

I have an application that renders from multiple threads; each thread uses its
own context, but the contexts share data. I believe I stay clear of illegal
behavior (such as rendering from a texture in thread A while rendering into
the same texture in thread B), but the application still crashes or gets
spurious GL errors. Helgrind confirms that there are indeed races within Mesa.

I've already filed patches for two of those races (in Mesa core), but some
remain that are too deep in the details of i965 for me to understand. I
haven't actually seen these remaining ones turn into crashes, but it'd be nice
to have them fixed nevertheless. I don't know whether they stem from the same
root issue, so I'm filing only one bug.

The first is a race between rendering and setting a fence:

==14794== Possible data race during read of size 8 at 0x34552520 by thread #1
==14794== Locks held: none
==14794==    at 0x1F1752F3: brw_emit_surface_state (brw_wm_surface_state.c:162)
==14794==    by 0x1F176EFB: brw_update_texture_surface
(brw_wm_surface_state.c:629)
==14794==    by 0x1F1770D7: update_stage_texture_surfaces
(brw_wm_surface_state.c:1258)
==14794==    by 0x1F1771DB: brw_update_texture_surfaces
(brw_wm_surface_state.c:1289)
==14794==    by 0x1F16C5C8: check_and_emit_atom (brw_state_upload.c:763)
==14794==    by 0x1F16C5C8: brw_upload_pipeline_state (brw_state_upload.c:876)
==14794==    by 0x1F16C5C8: brw_upload_render_state (brw_state_upload.c:898)
==14794==    by 0x1F14E9F8: brw_try_draw_prims (brw_draw.c:584)
==14794==    by 0x1F14E9F8: brw_draw_prims (brw_draw.c:675)
==14794==    by 0x1EF24A79: vbo_draw_arrays (vbo_exec_array.c:467)
==14794==    by 0x60FBC35: movit::EffectChain::execute_phase(movit::Phase*,
bool, std::set<int, std::less<int>, std::allocator<int> >*,
std::map<movit::Phase*, unsigned int, std::less<movit::Phase*>,
std::allocator<std::pair<movit::Phase* const, unsigned int> > >*,
std::set<movit::Phase*, std::less<movit::Phase*>, std::allocator<movit::Phase*>
>*) (effect_chain.cpp:1956)
==14794==    by 0x60FC37B: movit::EffectChain::render_to_fbo(unsigned int,
unsigned int, unsigned int) (effect_chain.cpp:1785)
==14794==    by 0x12EBA7: render_to_screen (effect_chain.h:346)
==14794==    by 0x12EBA7: GLWidget::paintGL() (glwidget.cpp:109)
==14794==    by 0x4070C53: QGLWidget::glDraw() (in
/usr/lib/x86_64-linux-gnu/libQt5OpenGL.so.5.7.1)
==14794==    by 0x40705FC: QGLWidget::paintEvent(QPaintEvent*) (in
/usr/lib/x86_64-linux-gnu/libQt5OpenGL.so.5.7.1)
==14794==
==14794== This conflicts with a previous write of size 8 by thread #20
==14794== Locks held: 3, at addresses 0x3E7BE0 0x1ACC42A8 0x2DE3E538
==14794==    at 0x1F6CF490: drm_intel_update_buffer_offsets2
(intel_bufmgr_gem.c:2254)
==14794==    by 0x1F6CF490: do_exec2 (intel_bufmgr_gem.c:2411)
==14794==    by 0x1F6D1839: drm_intel_gem_bo_context_exec
(intel_bufmgr_gem.c:2454)
==14794==    by 0x1F184B9C: do_flush_locked (intel_batchbuffer.c:359)
==14794==    by 0x1F184B9C: _intel_batchbuffer_flush.part.2
(intel_batchbuffer.c:422)
==14794==    by 0x1EEC89C8: _mesa_FenceSync (syncobj.c:295)
==14794==    by 0x19E1E9: locked_glFenceSync (ref_counted_gl_sync.h:27)
==14794==    by 0x19E1E9: RefCountedGLsync (ref_counted_gl_sync.h:20)
==14794==    by 0x19E1E9: QuickSyncEncoderImpl::end_frame(long, long,
std::vector<RefCountedFrame, std::allocator<RefCountedFrame> > const&)
(quicksync_encoder.cpp:1899)
==14794==    by 0x19E92B: QuickSyncEncoder::end_frame(long, long,
std::vector<RefCountedFrame, std::allocator<RefCountedFrame> > const&)
(quicksync_encoder.cpp:2183)
==14794==    by 0x1A3766: VideoEncoder::end_frame(long, long,
std::vector<RefCountedFrame, std::allocator<RefCountedFrame> > const&)
(video_encoder.cpp:133)
==14794==    by 0x16E0FD: Mixer::render_one_frame(long) (mixer.cpp:819)
==14794==  Address 0x34552520 is 48 bytes inside a block of size 256 alloc'd
==14794==    at 0x4C2DFE5: calloc (vg_replace_malloc.c:711)
==14794==    by 0x1F6D0188: drm_intel_gem_bo_alloc_internal
(intel_bufmgr_gem.c:805)
==14794==    by 0x1F18C66D: miptree_create (intel_mipmap_tree.c:715)
==14794==    by 0x1F18BD19: intel_miptree_create (intel_mipmap_tree.c:739)
==14794==    by 0x1F1950F7: intel_miptree_create_for_teximage
(intel_tex_image.c:88)
==14794==    by 0x1F1940DC: intel_alloc_texture_image_buffer (intel_tex.c:95)
==14794==    by 0x1F194DF5: intelTexImage (intel_tex_image.c:119)
==14794==    by 0x1EEDE3D6: teximage (teximage.c:3066)
==14794==    by 0x1EEDF1CF: _mesa_TexImage2D (teximage.c:3105)
==14794==    by 0x61078BC: movit::ResourcePool::create_2d_texture(int, int,
int) (resource_pool.cpp:379)
==14794==    by 0x60FBDC9: movit::EffectChain::execute_phase(movit::Phase*,
bool, std::set<int, std::less<int>, std::allocator<int> >*,
std::map<movit::Phase*, unsigned int, std::less<movit::Phase*>,
std::allocator<std::pair<movit::Phase* const, unsigned int> > >*,
std::set<movit::Phase*, std::less<movit::Phase*>, std::allocator<movit::Phase*>
>*) (effect_chain.cpp:1881)
==14794==    by 0x60FC37B: movit::EffectChain::render_to_fbo(unsigned int,
unsigned int, unsigned int) (effect_chain.cpp:1785)
==14794==  Block was alloc'd by thread #20

Here's a race between drawing and uploading a texture:

==14794== Possible data race during write of size 1 at 0x3314FF69 by thread #1
==14794== Locks held: none
==14794==    at 0x1F6CE387: do_bo_emit_reloc (intel_bufmgr_gem.c:1984)
==14794==    by 0x1F6CE62C: drm_intel_gem_bo_emit_reloc
(intel_bufmgr_gem.c:2066)
==14794==    by 0x1F17539E: brw_emit_surface_state (brw_wm_surface_state.c:169)
==14794==    by 0x1F176EFB: brw_update_texture_surface
(brw_wm_surface_state.c:629)
==14794==    by 0x1F1770D7: update_stage_texture_surfaces
(brw_wm_surface_state.c:1258)
==14794==    by 0x1F1771DB: brw_update_texture_surfaces
(brw_wm_surface_state.c:1289)
==14794==    by 0x1F16C5C8: check_and_emit_atom (brw_state_upload.c:763)
==14794==    by 0x1F16C5C8: brw_upload_pipeline_state (brw_state_upload.c:876)
==14794==    by 0x1F16C5C8: brw_upload_render_state (brw_state_upload.c:898)
==14794==    by 0x1F14E9F8: brw_try_draw_prims (brw_draw.c:584)
==14794==    by 0x1F14E9F8: brw_draw_prims (brw_draw.c:675)
==14794==    by 0x1EF24A79: vbo_draw_arrays (vbo_exec_array.c:467)
==14794==    by 0x60FBC35: movit::EffectChain::execute_phase(movit::Phase*,
bool, std::set<int, std::less<int>, std::allocator<int> >*,
std::map<movit::Phase*, unsigned int, std::less<movit::Phase*>,
std::allocator<std::pair<movit::Phase* const, unsigned int> > >*,
std::set<movit::Phase*, std::less<movit::Phase*>, std::allocator<movit::Phase*>
>*) (effect_chain.cpp:1956)
==14794==    by 0x60FC37B: movit::EffectChain::render_to_fbo(unsigned int,
unsigned int, unsigned int) (effect_chain.cpp:1785)
==14794==    by 0x12EBA7: render_to_screen (effect_chain.h:346)
==14794==    by 0x12EBA7: GLWidget::paintGL() (glwidget.cpp:109)
==14794==
==14794== This conflicts with a previous write of size 1 by thread #20
==14794== Locks held: none
==14794==    at 0x1F6CE387: do_bo_emit_reloc (intel_bufmgr_gem.c:1984)
==14794==    by 0x1F6CE62C: drm_intel_gem_bo_emit_reloc
(intel_bufmgr_gem.c:2066)
==14794==    by 0x1F17539E: brw_emit_surface_state (brw_wm_surface_state.c:169)
==14794==    by 0x1F176EFB: brw_update_texture_surface
(brw_wm_surface_state.c:629)
==14794==    by 0x1F1770D7: update_stage_texture_surfaces
(brw_wm_surface_state.c:1258)
==14794==    by 0x1F1771DB: brw_update_texture_surfaces
(brw_wm_surface_state.c:1289)
==14794==    by 0x1F16C5C8: check_and_emit_atom (brw_state_upload.c:763)
==14794==    by 0x1F16C5C8: brw_upload_pipeline_state (brw_state_upload.c:876)
==14794==    by 0x1F16C5C8: brw_upload_render_state (brw_state_upload.c:898)
==14794==    by 0x1F14E9F8: brw_try_draw_prims (brw_draw.c:584)
==14794==    by 0x1F14E9F8: brw_draw_prims (brw_draw.c:675)
==14794==  Address 0x3314ff69 is 233 bytes inside a block of size 256 alloc'd
==14794==    at 0x4C2DFE5: calloc (vg_replace_malloc.c:711)
==14794==    by 0x1F6D0188: drm_intel_gem_bo_alloc_internal
(intel_bufmgr_gem.c:805)
==14794==    by 0x1F18C66D: miptree_create (intel_mipmap_tree.c:715)
==14794==    by 0x1F18BD19: intel_miptree_create (intel_mipmap_tree.c:739)
==14794==    by 0x1F1950F7: intel_miptree_create_for_teximage
(intel_tex_image.c:88)
==14794==    by 0x1F1940DC: intel_alloc_texture_image_buffer (intel_tex.c:95)
==14794==    by 0x1F194DF5: intelTexImage (intel_tex_image.c:119)
==14794==    by 0x1EEDE3D6: teximage (teximage.c:3066)
==14794==    by 0x1EEDF1CF: _mesa_TexImage2D (teximage.c:3105)
==14794==    by 0x166B3E: operator() (mixer.cpp:458)
==14794==    by 0x166B3E: std::_Function_handler<void (),
Mixer::bm_frame(unsigned int, unsigned short, bmusb::FrameAllocator::Frame,
unsigned long, bmusb::VideoFormat, bmusb::FrameAllocator::Frame, unsigned long,
bmusb::AudioFormat)::{lambda()#1}>::_M_invoke(std::_Any_data const&)
(functional:1740)
==14794==    by 0x16F824: operator() (functional:2136)
==14794==    by 0x16F824: Mixer::thread_func() (mixer.cpp:592)
==14794==    by 0xA95590E: ??? (in
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22)
==14794==  Block was alloc'd by thread #20

And here's a race between rendering to an FBO and rendering to the screen (FBO
0):

==14794== ----------------------------------------------------------------
==14794==
==14794== Possible data race during write of size 4 at 0x2B206718 by thread #20
==14794== Locks held: none
==14794==    at 0x1F19624A: intel_update_max_level (intel_tex_validate.c:55)
==14794==    by 0x1F19624A: intel_finalize_mipmap_tree
(intel_tex_validate.c:88)
==14794==    by 0x1F196583: brw_validate_textures (intel_tex_validate.c:199)
==14794==    by 0x1F14E46B: brw_try_draw_prims (brw_draw.c:448)
==14794==    by 0x1F14E46B: brw_draw_prims (brw_draw.c:675)
==14794==    by 0x1EF24A79: vbo_draw_arrays (vbo_exec_array.c:467)
==14794==    by 0x60FBC35: movit::EffectChain::execute_phase(movit::Phase*,
bool, std::set<int, std::less<int>, std::allocator<int> >*,
std::map<movit::Phase*, unsigned int, std::less<movit::Phase*>,
std::allocator<std::pair<movit::Phase* const, unsigned int> > >*,
std::set<movit::Phase*, std::less<movit::Phase*>, std::allocator<movit::Phase*>
>*) (effect_chain.cpp:1956)
==14794==    by 0x60FC37B: movit::EffectChain::render_to_fbo(unsigned int,
unsigned int, unsigned int) (effect_chain.cpp:1785)
==14794==    by 0x16E037: Mixer::render_one_frame(long) (mixer.cpp:805)
==14794==    by 0x16F88F: Mixer::thread_func() (mixer.cpp:598)
==14794==    by 0xA95590E: ??? (in
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22)
==14794==    by 0x4C31D06: mythread_wrapper (hg_intercepts.c:389)
==14794==    by 0x72BC463: start_thread (pthread_create.c:333)
==14794==    by 0xB2209DE: clone (clone.S:105)
==14794==
==14794== This conflicts with a previous write of size 4 by thread #1
==14794== Locks held: none
==14794==    at 0x1F19624A: intel_update_max_level (intel_tex_validate.c:55)
==14794==    by 0x1F19624A: intel_finalize_mipmap_tree
(intel_tex_validate.c:88)
==14794==    by 0x1F196583: brw_validate_textures (intel_tex_validate.c:199)
==14794==    by 0x1F14E46B: brw_try_draw_prims (brw_draw.c:448)
==14794==    by 0x1F14E46B: brw_draw_prims (brw_draw.c:675)
==14794==    by 0x1EF24A79: vbo_draw_arrays (vbo_exec_array.c:467)
==14794==    by 0x60FBC35: movit::EffectChain::execute_phase(movit::Phase*,
bool, std::set<int, std::less<int>, std::allocator<int> >*,
std::map<movit::Phase*, unsigned int, std::less<movit::Phase*>,
std::allocator<std::pair<movit::Phase* const, unsigned int> > >*,
std::set<movit::Phase*, std::less<movit::Phase*>, std::allocator<movit::Phase*>
>*) (effect_chain.cpp:1956)
==14794==    by 0x60FC37B: movit::EffectChain::render_to_fbo(unsigned int,
unsigned int, unsigned int) (effect_chain.cpp:1785)
==14794==    by 0x12EBA7: render_to_screen (effect_chain.h:346)
==14794==    by 0x12EBA7: GLWidget::paintGL() (glwidget.cpp:109)
==14794==    by 0x4070C53: QGLWidget::glDraw() (in
/usr/lib/x86_64-linux-gnu/libQt5OpenGL.so.5.7.1)
==14794==  Address 0x2b206718 is 1,048 bytes inside a block of size 1,088 alloc'd
==14794==    at 0x4C2DFE5: calloc (vg_replace_malloc.c:711)
==14794==    by 0x1F194291: intelNewTextureObject (intel_tex.c:35)
==14794==    by 0x1EEE3A81: create_textures (texobj.c:1227)
==14794==    by 0x173504: PBOFrameAllocator::PBOFrameAllocator(unsigned long,
unsigned int, unsigned int, unsigned long, unsigned int, unsigned int, unsigned
int) (pbo_frame_allocator.cpp:38)
==14794==    by 0x16A11B: Mixer::configure_card(unsigned int,
bmusb::CaptureInterface*, bool) (mixer.cpp:273)
==14794==    by 0x16BD4E: Mixer::Mixer(QSurfaceFormat const&, unsigned int)
(mixer.cpp:179)
==14794==    by 0x12E52B: operator() (glwidget.cpp:54)
==14794==    by 0x12E52B: _M_invoke<> (functional:1400)
==14794==    by 0x12E52B: operator() (functional:1389)
==14794==    by 0x12E52B: void
std::__once_call_impl<std::_Bind_simple<GLWidget::initializeGL()::{lambda()#1}
()> >() (mutex:587)
==14794==    by 0x72C3778: __pthread_once_slow (pthread_once.c:116)
==14794==    by 0x12EEE8: __gthread_once (gthr-default.h:699)
==14794==    by 0x12EEE8: call_once<GLWidget::initializeGL()::<lambda()> >
(mutex:619)
==14794==    by 0x12EEE8: GLWidget::initializeGL() (glwidget.cpp:58)
==14794==    by 0x407067C: QGLWidget::glInit() (in
/usr/lib/x86_64-linux-gnu/libQt5OpenGL.so.5.7.1)
==14794==    by 0x40762DB: QGLWidget::resizeEvent(QResizeEvent*) (in
/usr/lib/x86_64-linux-gnu/libQt5OpenGL.so.5.7.1)
==14794==    by 0x4FDE62D: QWidget::event(QEvent*) (in
/usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5.7.1)
==14794==  Block was alloc'd by thread #1
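
For readers less familiar with this kind of report, the pattern in the trace
above (two threads with no locks held, both writing the same 4-byte field of an
object allocated once and shared) can be sketched in isolation. This is only an
illustration under stated assumptions: struct shared_tex, update_max_level()
and run_shared_tex_demo() are hypothetical names, this is not Mesa's code, and
the mutex is just one conventional way to order the writes, not a proposed fix
for i965.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a texture object shared between two contexts. */
struct shared_tex {
    pthread_mutex_t lock;
    int max_level;          /* cached value that each drawing thread refreshes */
};

struct worker_arg {
    struct shared_tex *tex;
    int level;
    int iterations;
};

/* Refresh the cached field under the lock, so concurrent writes are ordered. */
static void update_max_level(struct shared_tex *tex, int level)
{
    pthread_mutex_lock(&tex->lock);
    if (tex->max_level != level)
        tex->max_level = level;
    pthread_mutex_unlock(&tex->lock);
}

/* Each "drawing" thread revalidates the shared object many times. */
static void *draw_thread(void *p)
{
    struct worker_arg *arg = p;
    for (int i = 0; i < arg->iterations; i++)
        update_max_level(arg->tex, arg->level);
    return NULL;
}

/* Runs two threads against one shared object and returns the final value. */
static int run_shared_tex_demo(void)
{
    struct shared_tex tex = { .max_level = -1 };
    struct worker_arg a = { &tex, 0, 1000 };
    struct worker_arg b = { &tex, 0, 1000 };
    pthread_t ta, tb;
    int rc;

    rc = pthread_mutex_init(&tex.lock, NULL);
    assert(rc == 0);
    rc = pthread_create(&ta, NULL, draw_thread, &a);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, draw_thread, &b);
    assert(rc == 0);
    (void)rc;

    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    pthread_mutex_destroy(&tex.lock);
    return tex.max_level;
}
```

Without the lock/unlock pair in update_max_level(), the two threads would
perform unsynchronized writes to max_level, which is the shape of access that
the report above describes.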

There are more, but they seem similar, and possibly harmless (like
drm_intel_gem_bo_busy() racing against itself to set what's basically just a
cached flag).
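
To make the "cached flag" case concrete, here is a minimal sketch of how such
a flag can be shared between threads with C11 atomics, so that concurrent
readers and writers have defined behavior. Everything in it is an assumption
for illustration: struct fake_bo, query_busy() and fake_bo_busy() are
hypothetical and do not reflect how drm_intel_gem_bo_busy() is implemented.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical buffer object carrying a cached "known idle" flag. */
struct fake_bo {
    atomic_bool idle;
};

/* Stand-in for asking the kernel; here the buffer is never busy. */
static bool query_busy(void)
{
    return false;
}

/* Returns whether the buffer is busy, caching the idle result atomically. */
static bool fake_bo_busy(struct fake_bo *bo)
{
    if (atomic_load_explicit(&bo->idle, memory_order_relaxed))
        return false;                   /* cached: already known to be idle */

    bool busy = query_busy();
    /* Several threads may store here; they all store the same kind of hint. */
    atomic_store_explicit(&bo->idle, !busy, memory_order_relaxed);
    return busy;
}

static void *poll_thread(void *p)
{
    struct fake_bo *bo = p;
    for (int i = 0; i < 1000; i++)
        (void)fake_bo_busy(bo);
    return NULL;
}

/* Polls one buffer from two threads and returns the final cached flag. */
static bool run_cached_flag_demo(void)
{
    struct fake_bo bo;
    pthread_t ta, tb;
    int rc;

    atomic_init(&bo.idle, false);
    rc = pthread_create(&ta, NULL, poll_thread, &bo);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, poll_thread, &bo);
    assert(rc == 0);
    (void)rc;

    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return atomic_load_explicit(&bo.idle, memory_order_relaxed);
}
```

Relaxed ordering is used because the flag is only a hint in this sketch; code
that relied on the flag to publish other data would need stronger ordering or
a lock.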
