[Mesa-dev] [PATCH 00/25] Asynchronous flushes and ddebug core rewrite

Nicolai Hähnle nhaehnle at gmail.com
Sun Oct 22 19:07:43 UTC 2017


Hi all,

I was chasing an elusive bug that went away with GALLIUM_THREAD=0, so I wanted
to use ddebug with Gallium threads. That required some fixes to how radeonsi
compiles shaders. However, with that fixed, ddebug *also* made the bug go away.

This series does a lot of things, but the overarching goal is to rewrite
ddebug so that it can be used with Gallium threading in a minimally
intrusive way, to reduce the chance of Heisenbugs.

Patches 1-4: Clean up some time handling and add
util_queue_fence_wait_timeout. We'll use that later, since (queue) fences will
be embedded inside (Gallium) fences, and ddebug hang detection really needs
waits with timeouts.

Patches 5-14: Add asynchronous flushing and fine-grained fences to Gallium and
radeonsi:

1. pipe_context::flush has always provided a stronger guarantee than what
   may be expected by glFlush() [0]. Specifically, pipe_context::flush
   establishes a happens-before relationship where all operations in all
   contexts of the screen that are called after flush() returns will happen
   after all operations that happened in the flushed context before the
   flush().

   glFlush() doesn't actually make this stronger guarantee as I understand the
   spec, at least the OpenGL one. I suspect that the stronger guarantee may be
   implied by GLX and other WSI, but I'm not sure. And certainly, I wouldn't be
   surprised if there's software out there that assumes it.

   Anyway, this series adds a new PIPE_FLUSH_ASYNC flag which can be used to
   tell the driver that a weaker guarantee suffices.

   This flag is handled by the Gallium threaded context: it will execute the
   flush asynchronously, in the separate driver thread. The same behavior is
   enabled for PIPE_FLUSH_DEFERRED flushes. Both require a new special protocol
   for drivers that enable Gallium threading (currently only radeonsi).

2. ddebug hang detection needs a way of adding fences to individual draw calls,
   in order to pinpoint exactly which draw call causes a hang.

   This was previously done by inserting clear_buffer() calls, relying on the
   precise implementation of those by the driver. It so happened that radeonsi
   could decide to send those to SDMA, which stopped them from working. In
   general, this approach was a terrible abuse of the interface and a layering
   violation.

   This series adds PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE flags which can be used
   to specifically request a fine-grained, per-draw-call fence. Waiting on
   those fences will mostly work like waiting on a deferred fence, but when a
   timeout is used (especially timeout == 0), the fence can be signaled earlier
   based on a value written by the GPU inside the command stream.

Patches 15-23: Rewrite the ddebug core to always use pipelined mode, and
streamline the GALLIUM_DDEBUG parsing. See the detailed comment on patch #20,
which is the main chunk of code.

Also adds the option of treating transfers as draw calls.

Patches 24 & 25: Turn on Gallium threading for debug contexts with radeonsi.

Please review!
Thanks,
Nicolai

[0] This is true for Radeon. But since AFAIK amdgpu is the only kernel driver
    with a scheduler, I suspect the same is true for other drivers. If your
    driver *doesn't* provide this stronger guarantee, please speak up!
--
 include/c11/threads.h                        |   11 -
 include/c11/threads_posix.h                  |   39 +-
 include/c11/threads_win32.h                  |   37 +-
 src/egl/drivers/dri2/egl_dri2.c              |   24 +-
 src/gallium/auxiliary/Makefile.sources       |    3 -
 src/gallium/auxiliary/gallivm/lp_bld_init.c  |    2 +-
 src/gallium/auxiliary/hud/hud_cpu.c          |    2 +-
 src/gallium/auxiliary/hud/hud_cpufreq.c      |    2 +-
 src/gallium/auxiliary/hud/hud_diskstat.c     |    2 +-
 src/gallium/auxiliary/hud/hud_driver_query.c |    2 +-
 src/gallium/auxiliary/hud/hud_fps.c          |    2 +-
 src/gallium/auxiliary/hud/hud_nic.c          |    2 +-
 src/gallium/auxiliary/hud/hud_sensors_temp.c |    2 +-
 src/gallium/auxiliary/meson.build            |    3 -
 .../auxiliary/pipebuffer/pb_bufmgr_cache.c   |    1 -
 .../auxiliary/pipebuffer/pb_bufmgr_debug.c   |    1 -
 .../auxiliary/pipebuffer/pb_bufmgr_slab.c    |    1 -
 src/gallium/auxiliary/pipebuffer/pb_cache.c  |    2 +-
 src/gallium/auxiliary/util/u_debug.c         |   19 +-
 src/gallium/auxiliary/util/u_dump.h          |    9 +
 src/gallium/auxiliary/util/u_dump_defines.c  |   53 +
 src/gallium/auxiliary/util/u_dump_state.c    |   16 +-
 .../auxiliary/util/u_threaded_context.c      |  212 +++-
 .../auxiliary/util/u_threaded_context.h      |   58 +-
 .../util/u_threaded_context_calls.h          |    2 +
 src/gallium/auxiliary/util/u_time.h          |  150 ---
 src/gallium/docs/source/context.rst          |   23 +
 src/gallium/drivers/ddebug/dd_context.c      |  130 +-
 src/gallium/drivers/ddebug/dd_draw.c         | 1049 ++++++++++------
 src/gallium/drivers/ddebug/dd_pipe.h         |   93 +-
 src/gallium/drivers/ddebug/dd_screen.c       |  168 ++-
 src/gallium/drivers/ddebug/dd_util.h         |   32 +-
 .../drivers/etnaviv/etnaviv_query_sw.c       |    2 +-
 src/gallium/drivers/etnaviv/etnaviv_screen.c |    2 +-
 .../drivers/freedreno/freedreno_query_sw.c   |    2 +-
 .../drivers/freedreno/freedreno_screen.c     |    2 +-
 src/gallium/drivers/llvmpipe/lp_query.c      |    2 +-
 src/gallium/drivers/llvmpipe/lp_rast.c       |    2 +-
 src/gallium/drivers/llvmpipe/lp_screen.c     |    2 +-
 src/gallium/drivers/llvmpipe/lp_setup.c      |    2 +-
 src/gallium/drivers/llvmpipe/lp_state_fs.c   |    2 +-
 .../drivers/llvmpipe/lp_state_setup.c        |    2 +-
 src/gallium/drivers/nouveau/nouveau_fence.c  |    2 +-
 src/gallium/drivers/nouveau/nouveau_screen.c |    2 +-
 src/gallium/drivers/r300/r300_context.c      |    2 +-
 src/gallium/drivers/r300/r300_flush.c        |    2 +-
 src/gallium/drivers/r300/r300_screen.c       |    2 +-
 src/gallium/drivers/r600/r600_gpu_load.c     |    2 +-
 src/gallium/drivers/r600/r600_pipe.c         |    2 +-
 src/gallium/drivers/r600/r600_pipe_common.c  |    2 +-
 src/gallium/drivers/r600/r600_query.c        |    2 +-
 src/gallium/drivers/r600/r600_texture.c      |    2 +-
 src/gallium/drivers/r600/sb/sb_core.cpp      |    2 +-
 src/gallium/drivers/radeon/r600_gpu_load.c   |    2 +-
 .../drivers/radeon/r600_pipe_common.c        |  269 +---
 src/gallium/drivers/radeon/r600_query.c      |    2 +-
 src/gallium/drivers/radeon/r600_texture.c    |    2 +-
 .../drivers/radeonsi/Makefile.sources        |    1 +
 src/gallium/drivers/radeonsi/meson.build     |    1 +
 src/gallium/drivers/radeonsi/si_debug.c      |    5 +-
 src/gallium/drivers/radeonsi/si_fence.c      |  482 +++++++
 src/gallium/drivers/radeonsi/si_hw_context.c |    3 +
 src/gallium/drivers/radeonsi/si_pipe.c       |   14 +-
 src/gallium/drivers/radeonsi/si_pipe.h       |    7 +
 src/gallium/drivers/rbug/rbug_core.c         |    2 +-
 src/gallium/drivers/softpipe/sp_query.c      |    2 +-
 src/gallium/drivers/softpipe/sp_screen.c     |    2 +-
 src/gallium/drivers/svga/svga_context.h      |    2 +-
 src/gallium/drivers/svga/svga_pipe_draw.c    |    1 -
 src/gallium/drivers/swr/swr_fence.cpp        |    2 +-
 src/gallium/drivers/swr/swr_query.cpp        |    2 +-
 src/gallium/drivers/trace/tr_dump.c          |    2 +-
 src/gallium/drivers/virgl/virgl_screen.c     |    2 +-
 src/gallium/include/pipe/p_context.h         |   19 +-
 src/gallium/include/pipe/p_defines.h         |    4 +
 .../state_trackers/wgl/stw_framebuffer.c     |    2 +-
 src/gallium/tests/unit/pipe_barrier_test.c   |    3 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c    |    2 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c    |    3 +-
 .../winsys/radeon/drm/radeon_drm_bo.c        |    2 +-
 .../winsys/radeon/drm/radeon_drm_cs.c        |    2 +-
 .../winsys/virgl/drm/virgl_drm_winsys.c      |    2 +-
 .../winsys/virgl/vtest/virgl_vtest_winsys.c  |    2 +-
 src/util/Makefile.sources                    |    2 +
 src/util/futex.h                             |    9 +-
 src/util/meson.build                         |    2 +
 src/{gallium/auxiliary/os => util}/os_time.c |   19 +-
 src/{gallium/auxiliary/os => util}/os_time.h |   23 +-
 src/util/simple_mtx.h                        |    2 +-
 src/util/u_queue.c                           |   77 +-
 src/util/u_queue.h                           |   51 +-
 91 files changed, 1990 insertions(+), 1235 deletions(-)


