[Mesa-dev] [PATCH 00/25] Asynchronous flushes and ddebug core rewrite
Nicolai Hähnle
nhaehnle at gmail.com
Sun Oct 22 19:07:43 UTC 2017
Hi all,
I was chasing an elusive bug that went away with GALLIUM_THREAD=0, so I wanted
to use ddebug with Gallium threads. That required some fixes to how radeonsi
compiles shaders. However, with that fixed, ddebug *also* made the bug go away.
This series does a lot of things, but the overarching goal is to rewrite
ddebug in a way that can be used with Gallium threading in a minimally
intrusive way to reduce the chance of Heisenbugs.
Patches 1-4: Clean up some time handling and add a
util_queue_fence_wait_timeout. We'll use that later since (queue) fences will
be embedded inside (Gallium) fences, and for ddebug hang detection, we really
need waits with timeouts.
Patches 5-14: Add asynchronous flushing and fine-grained fences to Gallium and
radeonsi:
1. pipe_context::flush has always provided a stronger guarantee than what
may be expected by glFlush() [0]. Specifically, pipe_context::flush
establishes a happens-before relationship where all operations in all
contexts of the screen that are called after flush() returns will happen
after all operations that happened in the flushed context before the
flush().
glFlush() doesn't actually make this stronger guarantee as I understand the
spec, at least the OpenGL one. I suspect that the stronger guarantee may be
implied by GLX and other WSI, but I'm not sure. And certainly, I wouldn't be
surprised if there's software out there that assumes it.
Anyway, this series adds a new PIPE_FLUSH_ASYNC flag which can be used to
tell the driver that a weaker guarantee suffices.
This flag is handled by the Gallium threaded context: it will execute the
flush asynchronously, in the separate driver thread. The same behavior is
enabled for PIPE_FLUSH_DEFERRED flushes. Both require a new, special protocol
for drivers with Gallium threading enabled (currently only radeonsi).
2. ddebug hang detection needs a way of adding fences to individual draw calls,
in order to pinpoint exactly which draw call causes a hang.
This was previously done by inserting clear_buffer() calls, relying on the
precise implementation of those by the driver. It so happened that radeonsi
could decide to send those to SDMA, which stopped them from working. In
general, this approach was a terrible abuse of the interface and a layering
violation.
This series adds PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE flags which can be used
to specifically request a fine-grained, per-draw call fence. Waiting on
those fences will mostly work like waiting on a deferred fence, but when a
timeout is used (especially timeout == 0), the fence can be signaled earlier
based on a value written by the GPU inside the command stream.
Patches 15-23: Rewrite the ddebug core to always use pipelined mode, and
streamline the GALLIUM_DDEBUG parsing. See the detailed comment on patch #20,
which is the main chunk of code.
Also adds the option of treating transfers as draw calls.
Patches 24 & 25: Turn on Gallium threading for debug contexts with radeonsi.
Please review!
Thanks,
Nicolai
[0] This is true for Radeon. But since AFAIK amdgpu is the only kernel driver
with a scheduler, I suspect the same is true for other drivers. If your
driver *doesn't* provide this stronger guarantee, please speak up!
--
include/c11/threads.h | 11 -
include/c11/threads_posix.h | 39 +-
include/c11/threads_win32.h | 37 +-
src/egl/drivers/dri2/egl_dri2.c | 24 +-
src/gallium/auxiliary/Makefile.sources | 3 -
src/gallium/auxiliary/gallivm/lp_bld_init.c | 2 +-
src/gallium/auxiliary/hud/hud_cpu.c | 2 +-
src/gallium/auxiliary/hud/hud_cpufreq.c | 2 +-
src/gallium/auxiliary/hud/hud_diskstat.c | 2 +-
src/gallium/auxiliary/hud/hud_driver_query.c | 2 +-
src/gallium/auxiliary/hud/hud_fps.c | 2 +-
src/gallium/auxiliary/hud/hud_nic.c | 2 +-
src/gallium/auxiliary/hud/hud_sensors_temp.c | 2 +-
src/gallium/auxiliary/meson.build | 3 -
.../auxiliary/pipebuffer/pb_bufmgr_cache.c | 1 -
.../auxiliary/pipebuffer/pb_bufmgr_debug.c | 1 -
.../auxiliary/pipebuffer/pb_bufmgr_slab.c | 1 -
src/gallium/auxiliary/pipebuffer/pb_cache.c | 2 +-
src/gallium/auxiliary/util/u_debug.c | 19 +-
src/gallium/auxiliary/util/u_dump.h | 9 +
src/gallium/auxiliary/util/u_dump_defines.c | 53 +
src/gallium/auxiliary/util/u_dump_state.c | 16 +-
.../auxiliary/util/u_threaded_context.c | 212 +++-
.../auxiliary/util/u_threaded_context.h | 58 +-
.../util/u_threaded_context_calls.h | 2 +
src/gallium/auxiliary/util/u_time.h | 150 ---
src/gallium/docs/source/context.rst | 23 +
src/gallium/drivers/ddebug/dd_context.c | 130 +-
src/gallium/drivers/ddebug/dd_draw.c | 1049 ++++++++++------
src/gallium/drivers/ddebug/dd_pipe.h | 93 +-
src/gallium/drivers/ddebug/dd_screen.c | 168 ++-
src/gallium/drivers/ddebug/dd_util.h | 32 +-
.../drivers/etnaviv/etnaviv_query_sw.c | 2 +-
src/gallium/drivers/etnaviv/etnaviv_screen.c | 2 +-
.../drivers/freedreno/freedreno_query_sw.c | 2 +-
.../drivers/freedreno/freedreno_screen.c | 2 +-
src/gallium/drivers/llvmpipe/lp_query.c | 2 +-
src/gallium/drivers/llvmpipe/lp_rast.c | 2 +-
src/gallium/drivers/llvmpipe/lp_screen.c | 2 +-
src/gallium/drivers/llvmpipe/lp_setup.c | 2 +-
src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +-
.../drivers/llvmpipe/lp_state_setup.c | 2 +-
src/gallium/drivers/nouveau/nouveau_fence.c | 2 +-
src/gallium/drivers/nouveau/nouveau_screen.c | 2 +-
src/gallium/drivers/r300/r300_context.c | 2 +-
src/gallium/drivers/r300/r300_flush.c | 2 +-
src/gallium/drivers/r300/r300_screen.c | 2 +-
src/gallium/drivers/r600/r600_gpu_load.c | 2 +-
src/gallium/drivers/r600/r600_pipe.c | 2 +-
src/gallium/drivers/r600/r600_pipe_common.c | 2 +-
src/gallium/drivers/r600/r600_query.c | 2 +-
src/gallium/drivers/r600/r600_texture.c | 2 +-
src/gallium/drivers/r600/sb/sb_core.cpp | 2 +-
src/gallium/drivers/radeon/r600_gpu_load.c | 2 +-
.../drivers/radeon/r600_pipe_common.c | 269 +---
src/gallium/drivers/radeon/r600_query.c | 2 +-
src/gallium/drivers/radeon/r600_texture.c | 2 +-
.../drivers/radeonsi/Makefile.sources | 1 +
src/gallium/drivers/radeonsi/meson.build | 1 +
src/gallium/drivers/radeonsi/si_debug.c | 5 +-
src/gallium/drivers/radeonsi/si_fence.c | 482 +++++++
src/gallium/drivers/radeonsi/si_hw_context.c | 3 +
src/gallium/drivers/radeonsi/si_pipe.c | 14 +-
src/gallium/drivers/radeonsi/si_pipe.h | 7 +
src/gallium/drivers/rbug/rbug_core.c | 2 +-
src/gallium/drivers/softpipe/sp_query.c | 2 +-
src/gallium/drivers/softpipe/sp_screen.c | 2 +-
src/gallium/drivers/svga/svga_context.h | 2 +-
src/gallium/drivers/svga/svga_pipe_draw.c | 1 -
src/gallium/drivers/swr/swr_fence.cpp | 2 +-
src/gallium/drivers/swr/swr_query.cpp | 2 +-
src/gallium/drivers/trace/tr_dump.c | 2 +-
src/gallium/drivers/virgl/virgl_screen.c | 2 +-
src/gallium/include/pipe/p_context.h | 19 +-
src/gallium/include/pipe/p_defines.h | 4 +
.../state_trackers/wgl/stw_framebuffer.c | 2 +-
src/gallium/tests/unit/pipe_barrier_test.c | 3 +-
src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 2 +-
src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 3 +-
.../winsys/radeon/drm/radeon_drm_bo.c | 2 +-
.../winsys/radeon/drm/radeon_drm_cs.c | 2 +-
.../winsys/virgl/drm/virgl_drm_winsys.c | 2 +-
.../winsys/virgl/vtest/virgl_vtest_winsys.c | 2 +-
src/util/Makefile.sources | 2 +
src/util/futex.h | 9 +-
src/util/meson.build | 2 +
src/{gallium/auxiliary/os => util}/os_time.c | 19 +-
src/{gallium/auxiliary/os => util}/os_time.h | 23 +-
src/util/simple_mtx.h | 2 +-
src/util/u_queue.c | 77 +-
src/util/u_queue.h | 51 +-
91 files changed, 1990 insertions(+), 1235 deletions(-)