[Mesa-dev] [PATCH 00/10] gallium: resequencer layer
Rob Clark
robdclark at gmail.com
Tue Jun 14 16:07:40 UTC 2016
Bleh, it seems the max-CC limit on mesa-dev is still too low, and some
of the patches didn't get through. You can also find them here:
https://github.com/freedreno/mesa/commits/wip-rsq
BR,
-R
On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark <robdclark at gmail.com> wrote:
> From: Rob Clark <robclark at freedesktop.org>
>
> So, I know there were a couple concerns voiced over the idea of
> re-ordering rendering in a gallium shim pipe driver layer. For
> me, the main concern was whether the overhead of an extra layer,
> queueing and replaying state updates, draws, etc, would be
> prohibitive. So I implemented it enough that I could do some
> benchmarking ;-)
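
A rough sketch of the queue-and-replay idea described above, for illustration only. It is not code from the patches: every name here (cmd_type, batch, backend, batch_record, batch_replay) is hypothetical, and the real layer would cover the full pipe_context interface rather than two command kinds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command kinds, standing in for state updates and draws. */
enum cmd_type { CMD_SET_STATE, CMD_DRAW };

struct cmd {
    enum cmd_type type;
    unsigned arg;            /* state value, or vertex count for a draw */
};

#define MAX_CMDS 64

/* A batch queues commands instead of forwarding them immediately. */
struct batch {
    struct cmd cmds[MAX_CMDS];
    size_t count;
};

/* Stand-in for the wrapped driver; it only records what it was told. */
struct backend {
    unsigned current_state;
    unsigned draws;
    unsigned vertices;
};

/* Queue one command; nothing reaches the backend yet. */
static void batch_record(struct batch *b, enum cmd_type type, unsigned arg)
{
    assert(b->count < MAX_CMDS);
    b->cmds[b->count].type = type;
    b->cmds[b->count].arg = arg;
    b->count++;
}

/* Replay the queued commands in order against the backend, then reset. */
static void batch_replay(struct batch *b, struct backend *be)
{
    for (size_t i = 0; i < b->count; i++) {
        const struct cmd *c = &b->cmds[i];

        switch (c->type) {
        case CMD_SET_STATE:
            be->current_state = c->arg;
            break;
        case CMD_DRAW:
            be->draws++;
            be->vertices += c->arg;
            break;
        }
    }
    b->count = 0;
}
```

Because the backend sees nothing until batch_replay() runs, whole batches could in principle be flushed in a different order than they were recorded. The overhead question above is about the cost of that extra record step.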
>
> The first 9 patches are just some general API cleanups, which I
> found to be convenient (since the resequencer layer is generating
> most of the state handling with python + mako, so the cleanups to
> improve consistency help minimize the state which required special
> handling). But regardless of the outcome of the resequencer
> layer, I think these patches make sense on their own.
>
> (Note: auto-generating some of the other wrapper layers might be
> an interesting future cleanup.. at least it should be trivial
> for noop ;-))
>
> As far as overhead, I've been benchmarking (mostly glmark2 + stk +
> gfxbench), and in the current state (without actually having the
> dependency tracking implemented) it doesn't seem to cause more
> than a couple percent overhead. From here on out, the remaining
> overhead added to implement the dependency tracking and re-
> ordering would be the same as the additional overhead required
> to implement it in the driver backend.
>
> And a couple percent overhead is small compared to the expected
> gains for games which benefit.. ie. 8MiB for 1080p rgb frame,
> avoiding copying that from tile to memory and back once or twice
> quickly dwarfs an extra copy of some 10's of kb of state.. and
> even more so for (for ex.) f32f32f32f32 intermediate buffers.
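
The arithmetic behind the 8MiB figure can be sketched as below. This assumes 4 bytes per pixel for the "rgb" frame (an RGBA8/RGBX8-style layout, which the text does not state explicitly) and 16 bytes per pixel for f32f32f32f32. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one full frame at the given resolution and pixel size. */
static uint64_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel;
}

/*
 * Memory traffic for copying a frame from tile to memory and back:
 * each round trip is one store plus one restore of the whole frame.
 */
static uint64_t round_trip_bytes(uint64_t frame, unsigned round_trips)
{
    return frame * 2 * round_trips;
}
```

With these assumptions, frame_bytes(1920, 1080, 4) is 8294400 bytes (about 7.9 MiB) and frame_bytes(1920, 1080, 16) is 33177600 bytes (about 31.6 MiB). Either is orders of magnitude more than the tens of kilobytes of queued state mentioned above.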
>
> Queries are still missing, but I expect what would be required
> to implement it is the same as the logic that would be needed in
> the driver backend otherwise.
>
> Basically, the only concern I have, compared to the approach of
> implementing the dependency tracking in each driver backend, is
> pipe_constant_buffer::user_buffer. Currently both freedreno and
> vc4 want non-UBO constant buffers to be emitted in cmdstream.
> In the adreno case, it looks like a3xx/a4xx should also support
> the non-user_buffer case, although in fact this appears to be
> broken (at least on a4xx) and I've never seen blob driver use
> this. At the moment I'm doing a hack in freedreno to map the
> backing fd_bo and then memcpy it into cmdstream. Which is a
> bit silly (since it is a write-combine buffer I'm copying from).
> But in glmark I had trouble even measuring the overhead of this
> extra copy. Although possibly I need to find something to
> measure which emits more non-UBO constant state.
>
> btw, if someone has some requests for benchmarks to try (provided
> they are available for arm/linux) I'd be happy to try some other
> things.
>
> The plus side of doing this in a separate layer is that we only
> implement the dependency tracking and resource shadowing once,
> instead of both in vc4 and freedreno (and who knows, maybe
> someday someone gets around to writing a lima gallium driver).
> Plus, I envision this to be something that mesa/st wraps the
> pipe_screen with if driconf tells it to, and pscreen->rsq_funcs
> is populated (we at least need a callback to know if resource
> is still busy). This way we can turn it on for games/apps that
> are known to benefit, and leave it off with zero additional
> overhead for better written things (or rather, things written
> with tilers in mind).
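
A minimal sketch of the opt-in wrapping described above, using stand-in types rather than the real gallium structs. Only the rsq_funcs name and the "is the resource still busy" callback come from the text. The struct layouts, maybe_wrap_screen(), and the boolean standing in for the driconf option are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table a driver fills in to opt in to the layer. */
struct rsq_funcs_sketch {
    bool (*resource_busy)(const void *resource);
};

/* Stand-in for pipe_screen; the real struct has many more members. */
struct screen_sketch {
    const struct rsq_funcs_sketch *rsq_funcs; /* NULL: driver did not opt in */
    struct screen_sketch *wrapped;            /* set only on the wrapper */
};

/* Example callback for a driver whose resources are never busy. */
static bool never_busy(const void *resource)
{
    (void)resource;
    return false;
}

/*
 * Return the wrapper only when the (hypothetical) driconf option is set
 * and the driver populated rsq_funcs. Otherwise return the driver's
 * screen untouched, so the layer adds no overhead when it is off.
 */
static struct screen_sketch *
maybe_wrap_screen(struct screen_sketch *pscreen,
                  struct screen_sketch *wrapper_storage,
                  bool driconf_enabled)
{
    assert(pscreen != NULL);
    assert(wrapper_storage != NULL);

    if (!driconf_enabled || pscreen->rsq_funcs == NULL)
        return pscreen;

    wrapper_storage->rsq_funcs = NULL;
    wrapper_storage->wrapped = pscreen;
    return wrapper_storage;
}
```

The point of the design is that both conditions must hold: the user or a driconf profile asks for the layer, and the driver provides the callbacks the layer needs.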
>
>
> Rob Clark (10):
> gallium: cleanup set_tess_state
> gallium: make shader_buffers const
> gallium: make constant_buffer const
> gallium: make image_view const
> gallium: change end_query() to return boolean
> gallium/util: add util_copy_index_buffer() helper
> gallium/util: add util_copy_shader_buffer() helper
> gallium/util: add util_copy_vertex_buffer helper
> gallium/util: make util_copy_framebuffer_state(src=NULL) work
> RFC: gallium: add resequencer driver (INCOMPLETE)
>
> configure.ac | 1 +
> src/gallium/auxiliary/util/u_framebuffer.c | 37 +-
> src/gallium/auxiliary/util/u_helpers.c | 15 -
> src/gallium/auxiliary/util/u_helpers.h | 3 -
> src/gallium/auxiliary/util/u_inlines.h | 49 ++
> src/gallium/drivers/ddebug/dd_context.c | 15 +-
> src/gallium/drivers/freedreno/freedreno_query.c | 2 +-
> src/gallium/drivers/freedreno/freedreno_state.c | 13 +-
> src/gallium/drivers/i915/i915_query.c | 2 +-
> src/gallium/drivers/i915/i915_state.c | 8 +-
> src/gallium/drivers/ilo/ilo_query.c | 2 +-
> src/gallium/drivers/ilo/ilo_state.c | 14 +-
> src/gallium/drivers/llvmpipe/lp_query.c | 2 +-
> src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +-
> src/gallium/drivers/llvmpipe/lp_state_vertex.c | 6 +-
> src/gallium/drivers/noop/noop_pipe.c | 2 +-
> src/gallium/drivers/noop/noop_state.c | 2 +-
> src/gallium/drivers/nouveau/nv30/nv30_query.c | 2 +-
> src/gallium/drivers/nouveau/nv30/nv30_state.c | 13 +-
> src/gallium/drivers/nouveau/nv50/nv50_query.c | 2 +-
> src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +-
> src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 25 +-
> src/gallium/drivers/r300/r300_query.c | 4 +-
> src/gallium/drivers/r300/r300_state.c | 10 +-
> src/gallium/drivers/r600/evergreen_state.c | 7 +-
> src/gallium/drivers/r600/r600_state_common.c | 7 +-
> src/gallium/drivers/radeon/r600_query.c | 2 +-
> src/gallium/drivers/radeonsi/si_descriptors.c | 16 +-
> src/gallium/drivers/radeonsi/si_state.c | 13 +-
> src/gallium/drivers/radeonsi/si_state.h | 3 +-
> src/gallium/drivers/rbug/rbug_context.c | 6 +-
> src/gallium/drivers/resequencer/.gitignore | 2 +
> src/gallium/drivers/resequencer/Makefile.am | 44 ++
> src/gallium/drivers/resequencer/Makefile.sources | 23 +
> src/gallium/drivers/resequencer/rsq_batch.c | 144 +++++
> src/gallium/drivers/resequencer/rsq_batch.h | 71 +++
> src/gallium/drivers/resequencer/rsq_context.c | 457 ++++++++++++++++
> src/gallium/drivers/resequencer/rsq_context.h | 84 +++
> src/gallium/drivers/resequencer/rsq_draw.c | 230 ++++++++
> src/gallium/drivers/resequencer/rsq_draw.h | 40 ++
> src/gallium/drivers/resequencer/rsq_fence.c | 48 ++
> src/gallium/drivers/resequencer/rsq_fence.h | 43 ++
> src/gallium/drivers/resequencer/rsq_public.h | 68 +++
> src/gallium/drivers/resequencer/rsq_query.c | 148 +++++
> src/gallium/drivers/resequencer/rsq_query.h | 32 ++
> src/gallium/drivers/resequencer/rsq_resource.c | 222 ++++++++
> src/gallium/drivers/resequencer/rsq_resource.h | 60 ++
> src/gallium/drivers/resequencer/rsq_screen.c | 186 +++++++
> src/gallium/drivers/resequencer/rsq_screen.h | 50 ++
> src/gallium/drivers/resequencer/rsq_state.py | 607 +++++++++++++++++++++
> .../drivers/resequencer/rsq_state_helpers.h | 219 ++++++++
> src/gallium/drivers/resequencer/rsq_surface.c | 107 ++++
> src/gallium/drivers/resequencer/rsq_surface.h | 72 +++
> src/gallium/drivers/softpipe/sp_query.c | 2 +-
> src/gallium/drivers/softpipe/sp_state_image.c | 10 +-
> src/gallium/drivers/softpipe/sp_state_shader.c | 2 +-
> src/gallium/drivers/softpipe/sp_state_vertex.c | 6 +-
> src/gallium/drivers/svga/svga_pipe_constants.c | 2 +-
> src/gallium/drivers/svga/svga_pipe_query.c | 2 +-
> src/gallium/drivers/svga/svga_pipe_vertex.c | 2 +-
> src/gallium/drivers/swr/swr_query.cpp | 2 +-
> src/gallium/drivers/swr/swr_state.cpp | 9 +-
> src/gallium/drivers/trace/tr_context.c | 15 +-
> src/gallium/drivers/vc4/vc4_query.c | 2 +-
> src/gallium/drivers/vc4/vc4_state.c | 13 +-
> src/gallium/drivers/virgl/virgl_context.c | 10 +-
> src/gallium/drivers/virgl/virgl_query.c | 4 +-
> src/gallium/include/pipe/p_context.h | 12 +-
> src/gallium/include/pipe/p_state.h | 8 +
> src/mesa/state_tracker/st_atom_tess.c | 13 +-
> 70 files changed, 3148 insertions(+), 210 deletions(-)
> create mode 100644 src/gallium/drivers/resequencer/.gitignore
> create mode 100644 src/gallium/drivers/resequencer/Makefile.am
> create mode 100644 src/gallium/drivers/resequencer/Makefile.sources
> create mode 100644 src/gallium/drivers/resequencer/rsq_batch.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_batch.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_context.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_context.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_draw.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_draw.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_fence.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_fence.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_public.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_query.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_query.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_resource.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_resource.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_screen.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_screen.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_state.py
> create mode 100644 src/gallium/drivers/resequencer/rsq_state_helpers.h
> create mode 100644 src/gallium/drivers/resequencer/rsq_surface.c
> create mode 100644 src/gallium/drivers/resequencer/rsq_surface.h
>
> --
> 2.5.5
>