[Mesa-dev] [PATCH 00/16] anv: Handle out of memory situations better

Pohjolainen, Topi topi.pohjolainen at gmail.com
Thu Mar 9 07:12:21 UTC 2017


On Thu, Mar 09, 2017 at 08:04:29AM +0100, Iago Toral wrote:
> On Wed, 2017-03-08 at 15:00 -0800, Jason Ekstrand wrote:
> > On Wed, Mar 8, 2017 at 12:24 AM, Iago Toral <itoral at igalia.com>
> > wrote:
> > > On Tue, 2017-03-07 at 10:31 +0200, Pohjolainen, Topi wrote:
> > > > On Tue, Mar 07, 2017 at 08:15:45AM +0100, Iago Toral Quiroga
> > > wrote:
> > > > >
> > > > > There are a number of work-in-progress CTS tests that check OOM
> > > > > handling
> > > > > and raised a number of issues that this series addresses.
> > > > >
> > > > > Particularly noteworthy is the case of command buffer
> > > recording:
> > > > > since the
> > > > > various vkCmd*() functions do not return errors, it is expected
> > > > > that drivers survive until vkEndCommanBuffer() and report any
> > > > > errors
> > > > > at that point. This requires that we are particularly careful
> > > with
> > > > > out of
> > > > > memory scenarios, since we need to make sure that vkCmd*()
> > > commands
> > > > > do not
> > > > > attempt to access memory that we have failed to allocate in a
> > > > > previous step.
> > > > > We achieve this by tracking errors generated during command
> > > buffer
> > > > > recording
> > > > > into the command buffer object and, generally, avoiding to
> > > execute
> > > > > new
> > > > > vkCmd*() functions if the command buffer recording has
> > > generated an
> > > > > error
> > > > > before. In general, I tried to  only guard execution of
> > > vkCmd*()
> > > > > commands that
> > > > > the tests showed needed to be guarded since most of the
> > > vkCmd*()
> > > > > commands are
> > > > > safe to execute in any scenario.
> > > > >
> > > > > This series fixes all the issues raised with the new tests and
> > > a
> > > > > few more that
> > > > > I found by inspection but I am sure we could do better in more
> > > > > places.
> > > > > It is a first step though and the tests are still in
> > > development.
> > > > >
> > > > > Iago Toral Quiroga (16):
> > > > >   blorp: handle out of memory in blorp_emitn()
> > > > >   blorp: handle out of memory without crashing in various batch
> > > > >     emissions
> > > > These two go sort of half way - blorp_exec() still ignores the
> > > return
> > > > values of blorp_emit_vertex_buffers() and
> > > > blorp_emit_vertex_elements().
> > > > Neither does blorp_emit_depth_stencil_state() returning zero make
> > > > blorp_exec() to take any further action. 
> > > 
> > > Right, I focused on avoiding the crashes only because there are a
> > > lot
> > > of other places where we do the same thing. Specifically, the
> > > blorp_emit() macro that we use all over the place, suffers from
> > > this
> > > same issue: it prevents the crash by avoiding the loop body to
> > > execute
> > > if we failed to allocate memory, but it does nothing to report the
> > > allocation failure back to its callers, so I figured it would not
> > > make
> > > sense to do something different for blorp_emitn().
> > > 
> > > I am fine with being more robust here and try to address the
> > > problem
> > > more thoroughly if you think it makes sense, but if we are going to
> > > do
> > > that, then I think we would need to look beyond this macro and do a
> > > lot
> > > more changes. Do you want me to work on that?
> > I'm not sure what you mean.  If blorp_emit calls blorp_emit_dwords
> > which calls anv_batch_emit_dwords which handles OOM by setting
> > anv_batch::status to VK_OUT_OF_DEVICE_MEMORY and returning NULL, then
> > that should take care of almost everything.  We would also need
> > anv_cmd_buffer_alloc_dynamic_state and
> > anv_cmd_buffer_alloc_blorp_binding_table to do the right thing.  I
> > really don't think this will require massive BLORP changes.
> 
> Ok, I see what you mean. I understood that Topi wanted to have
> functions that called blorp_emit macros to handle and return errors to
> their callers explicitly, which would've meant to adds checks every
> time we called any of these macros. If it is only about making sure
> that we check the command buffer status after calling functions that
> can emit batches I agree it should be trivial, I'll do that.

Yeah, I really didn't like the idea of propagating the error but leaving blorp
(and its caller actually) thinking everything was okay wasn't good either.
Jason came up with a nice plan so lets do that :)

> 
> Iago
> 
> > > In any case, even if we agree on doing this, since these patches
> > > put
> > > blorp_emitn() at the same level as blorp_emit() and they solve some
> > > problems, would it be okay to land this series first and tackle
> > > that in
> > > a second series?
> > > 
> > > Iago
> > > 
> > > > >
> > > > >   anv: don't crash if we fail to grow the reloc list
> > > > >   anv: handle out of memory situations in
> > > anv_batch_emit_dwords()
> > > > >   anv: do not try to ref/unref NULL shaders
> > > > >   anv/blorp: return early if we failed to create the shader
> > > binary
> > > > >   anv/cmd_buffer: return a VkResult in
> > > cmd_buffer_setup_attachments
> > > > >   anv/cmd_buffer: track error scenarios during command buffer
> > > > > recording
> > > > >   anv: handle out of memory scenarios in anv_batch_emit_batch()
> > > > >   anv/cmd_buffer: handle out of memory during
> > > vkCmdPushConstants
> > > > >   anv: handle out of memory situations during queue submissions
> > > > >   anv: ensure that we don't ever try to adjust relocations more
> > > > > than
> > > > >     once
> > > > >   anv/cmd_buffer: skip vkCmdDraw*() on broken command buffers
> > > > >   anv/cmd_buffer: skip vkCmdDispatch() on broken command
> > > buffers
> > > > >   anv/cmd_buffer: skip vkCmdExecuteCommands() on broken command
> > > > > buffers
> > > > >   anv/cmd_buffer: handle out of device memory during binding
> > > table
> > > > >     emission
> > > > >
> > > > >  src/intel/blorp/blorp_genX_exec.h  | 25 ++++++++++-----
> > > > >  src/intel/vulkan/anv_batch_chain.c | 45 +++++++++++++++++++---
> > > ----
> > > > > -
> > > > >  src/intel/vulkan/anv_blorp.c       |  3 ++
> > > > >  src/intel/vulkan/anv_cmd_buffer.c  | 21 +++++++++++--
> > > > >  src/intel/vulkan/anv_private.h     | 19 ++++++++++++
> > > > >  src/intel/vulkan/genX_cmd_buffer.c | 63
> > > > > ++++++++++++++++++++++++++++++++------
> > > > >  6 files changed, 142 insertions(+), 34 deletions(-)
> > > > >
> > > 


More information about the mesa-dev mailing list