[Mesa-dev] [PATCH] anv/gpu_memcpy: CS Stall before a MI memcpy on gen7
Jason Ekstrand
jason at jlekstrand.net
Sat Feb 17 09:42:50 UTC 2018
On Sat, Feb 17, 2018 at 1:21 AM, Kenneth Graunke <kenneth at whitecape.org>
wrote:
> On Saturday, February 17, 2018 1:09:10 AM PST Jason Ekstrand wrote:
> > This fixes a pile of hangs caused by the recent shuffling of resolves
> > and transitions. The particularly problematic case is when you have at
> > least three attachments with load ops of CLEAR, LOAD, CLEAR. In this
> > case, we execute the first CLEAR followed by a MI memcpy to copy the
> > clear values over for the LOAD followed by a second CLEAR. The MI
> > commands cause the first CLEAR to hang, which causes us to get stuck on
> > the 3DSTATE_MULTISAMPLE in the second CLEAR.
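
[Editor's illustrative sketch, not part of the patch or the dEQP tests: a render
pass whose color attachments end up with load ops CLEAR, LOAD, CLEAR, as
described above. The formats, layouts, and everything else below are arbitrary
assumptions; the middle LOAD attachment is the one whose indirect clear color
would be copied with the MI memcpy, which is what lands the MI commands between
the two CLEARs.]

#include <vulkan/vulkan.h>

/* Hypothetical attachment setup: three color attachments whose load ops
 * come out as CLEAR, LOAD, CLEAR.  Omitted fields (flags, stencil ops)
 * default to zero via designated initializers.
 */
static const VkAttachmentDescription attachments[3] = {
   {
      .format        = VK_FORMAT_B8G8R8A8_UNORM,
      .samples       = VK_SAMPLE_COUNT_1_BIT,
      .loadOp        = VK_ATTACHMENT_LOAD_OP_CLEAR,   /* first CLEAR */
      .storeOp       = VK_ATTACHMENT_STORE_OP_STORE,
      .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
      .finalLayout   = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
   },
   {
      .format        = VK_FORMAT_B8G8R8A8_UNORM,
      .samples       = VK_SAMPLE_COUNT_1_BIT,
      .loadOp        = VK_ATTACHMENT_LOAD_OP_LOAD,    /* LOAD -> MI memcpy of clear values */
      .storeOp       = VK_ATTACHMENT_STORE_OP_STORE,
      .initialLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
      .finalLayout   = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
   },
   {
      .format        = VK_FORMAT_B8G8R8A8_UNORM,
      .samples       = VK_SAMPLE_COUNT_1_BIT,
      .loadOp        = VK_ATTACHMENT_LOAD_OP_CLEAR,   /* second CLEAR -- where the hang shows up */
      .storeOp       = VK_ATTACHMENT_STORE_OP_STORE,
      .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
      .finalLayout   = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
   },
};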
> >
> > We also add guards for BLORP to fix the same issue. These shouldn't
> > actually do anything right now because the only use of indirect clears
> > in BLORP today is for resolves which are already guarded by a render
> > cache flush and CS stall. However, this will guard us against potential
> > issues in the future.
> >
> > Cc: Kenneth Graunke <kenneth at whitecape.org>
> > Cc: Nanley Chery <nanley.g.chery at intel.com>
> > ---
> > src/intel/vulkan/genX_blorp_exec.c | 10 ++++++++++
> > src/intel/vulkan/genX_gpu_memcpy.c | 22 ++++++++++++++++++++++
> > 2 files changed, 32 insertions(+)
> >
> > diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
> > index 04f7675..f956715 100644
> > --- a/src/intel/vulkan/genX_blorp_exec.c
> > +++ b/src/intel/vulkan/genX_blorp_exec.c
> > @@ -200,6 +200,16 @@ genX(blorp_exec)(struct blorp_batch *batch,
> >        genX(cmd_buffer_config_l3)(cmd_buffer, cfg);
> >     }
> >
> > +#if GEN_GEN == 7
> > +   /* The MI_LOAD/STORE_REGISTER_MEM commands which BLORP uses to implement
> > +    * indirect fast-clear colors can cause GPU hangs if we don't stall first.
> > +    * See genX(cmd_buffer_mi_memcpy) for more details.
> > +    */
> > +   assert(params->src.clear_color_addr.buffer == NULL);
> > +   if (params->dst.clear_color_addr.buffer)
> > +      cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
> > +#endif
> > +
> >     genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
> >
> >     genX(flush_pipeline_select_3d)(cmd_buffer);
> > diff --git a/src/intel/vulkan/genX_gpu_memcpy.c b/src/intel/vulkan/genX_gpu_memcpy.c
> > index f3ada93..1d527fb 100644
> > --- a/src/intel/vulkan/genX_gpu_memcpy.c
> > +++ b/src/intel/vulkan/genX_gpu_memcpy.c
> > @@ -62,6 +62,28 @@ genX(cmd_buffer_mi_memcpy)(struct anv_cmd_buffer *cmd_buffer,
> >     assert(dst_offset % 4 == 0);
> >     assert(src_offset % 4 == 0);
> >
> > +#if GEN_GEN == 7
> > +   /* On gen7, the combination of commands used here(MI_LOAD_REGISTER_MEM
> > +    * and MI_STORE_REGISTER_MEM) can cause GPU hangs if any rendering is
> > +    * in-flight when they are issued even if the memory touched is not
> > +    * currently active for rendering. The weird bit is that it is not the
> > +    * MI_LOAD/STORE_REGISTER_MEM commands which hang but rather the in-flight
> > +    * rendering hangs such that the next stalling command after the
> > +    * MI_LOAD/STORE_REGISTER_MEM commands will catch the hang.
> > +    *
> > +    * It is unclear exactly why this hang occurs. Both MI commands come with
> > +    * warnings about the 3D pipeline but that doesn't seem to fully explain
> > +    * it. My (Jason's) best theory is that it has something to do with the
> > +    * fact that we're using a GPU state register as our temporary and that
> > +    * something with reading/writing it is causing problems.
> > +    *
> > +    * In order to work around this issue, we emit a PIPE_CONTROL with the
> > +    * command streamer stall bit set.
> > +    */
> > +   cmd_buffer->state.pending_pipe_bits |= ANV_PIPE_CS_STALL_BIT;
> > +   genX(cmd_buffer_apply_pipe_flushes)(cmd_buffer);
> > +#endif
> > +
> >     for (uint32_t i = 0; i < size; i += 4) {
> >        const struct anv_address src_addr =
> >           (struct anv_address) { src, src_offset + i};
> >
>
> This seems reasonable enough. I can't say I understand why this is
> necessary,
That makes two of us and, when Nanley gets around to looking at it, I'm
sure it'll make 3. :-)
> but if it fixes the problem...it's probably best to just
> do it. We can always change it later if we find new information.
>
I'm fairly sure that this is more or less the correct fix, and 10 Jenkins
runs along with 50 (soon to be 100) runs of
dEQP-VK.renderpass.dedicated_allocation.attachment.* agree.
> Acked-by: Kenneth Graunke <kenneth at whitecape.org>
>
Thanks!
> I'll let Nanley review as well.
>
I can wait. I've only been fighting this for a week and a half...
--Jason