[Mesa-dev] [PATCH] anv: Add an option to abort on device loss

Lionel Landwerlin lionel.g.landwerlin at intel.com
Thu May 18 21:25:20 UTC 2017


Cool, thanks!

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>

On 18/05/17 22:10, Jason Ekstrand wrote:
> I just sent a patch to address that.  Thanks!
>
> On Thu, May 18, 2017 at 2:04 PM, Lionel Landwerlin
> <lionel.g.landwerlin at intel.com> wrote:
>
>     This looks good, but I wonder whether we're missing a vk_errorf()
>     in anv_QueueSubmit() when we get an error from
>     anv_cmd_buffer_execbuf().
>     In that case it looks like we won't abort.
>
>
>     On 18/05/17 21:51, Jason Ekstrand wrote:
>
>         This is mostly for running in our CI system to prevent dEQP from
>         continuing on to the next test if we get a GPU hang. As it currently
>         stands, dEQP uses the same VkDevice for almost all tests and if one of
>         the tests hangs, we set the anv_device::device_lost flag and report
>         VK_ERROR_DEVICE_LOST for all queue operations from that point forward
>         without sending anything to the GPU. dEQP will happily continue trying
>         to run tests and reporting failures until it eventually gets a crash
>         that forces the test runner to start over. This circumvents the problem
>         by just aborting the process if we ever get a GPU hang. Since this is
>         not the recommended behavior most of the time, we hide it behind an
>         environment variable.
>
>         Cc: Mark Janes <mark.a.janes at intel.com>
>         ---
>           src/intel/vulkan/anv_util.c | 5 +++++
>           1 file changed, 5 insertions(+)
>
>         diff --git a/src/intel/vulkan/anv_util.c b/src/intel/vulkan/anv_util.c
>         index ba91733..4b916e2 100644
>         --- a/src/intel/vulkan/anv_util.c
>         +++ b/src/intel/vulkan/anv_util.c
>         @@ -30,6 +30,7 @@
>          
>          #include "anv_private.h"
>          #include "vk_enum_to_str.h"
>         +#include "util/debug.h"
>          
>          /** Log an error message.  */
>          void anv_printflike(1, 2)
>         @@ -95,5 +96,9 @@ __vk_errorf(VkResult error, const char *file, int line, const char *format, ...)
>                fprintf(stderr, "%s:%d: %s\n", file, line, error_str);
>             }
>          
>         +   if (error == VK_ERROR_DEVICE_LOST &&
>         +       env_var_as_boolean("ANV_ABORT_ON_DEVICE_LOSS", false))
>         +      abort();
>         +
>             return error;
>          }
>
>
>
>
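
The pattern discussed in this thread can be sketched outside the driver as
below. This is only an illustration, not the anv code: every name prefixed
with sketch_ is hypothetical, the result codes are stand-ins for VkResult, and
sketch_parse_boolean() is a simplified assumption about how a boolean
environment variable might be parsed (it is not Mesa's env_var_as_boolean()).
It also shows why the review comment matters: the abort only triggers for
errors that are routed through the central reporting helper.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for VkResult values (illustrative only). */
typedef enum {
   SKETCH_SUCCESS = 0,
   SKETCH_ERROR_DEVICE_LOST = -4,
} sketch_result;

/* Simplified boolean parser: an assumption, not Mesa's implementation.
 * Unset or unrecognized values fall back to default_value. */
bool sketch_parse_boolean(const char *str, bool default_value)
{
   if (str == NULL)
      return default_value;
   if (strcmp(str, "1") == 0 || strcmp(str, "true") == 0 ||
       strcmp(str, "yes") == 0)
      return true;
   if (strcmp(str, "0") == 0 || strcmp(str, "false") == 0 ||
       strcmp(str, "no") == 0)
      return false;
   return default_value;
}

/* Abort only on device loss, and only if the user opted in. */
bool sketch_should_abort(sketch_result error)
{
   return error == SKETCH_ERROR_DEVICE_LOST &&
          sketch_parse_boolean(getenv("ANV_ABORT_ON_DEVICE_LOSS"), false);
}

/* Central error-reporting helper: log, optionally abort, pass the error on.
 * A call site that returns SKETCH_ERROR_DEVICE_LOST directly, without going
 * through this helper, never reaches the abort() check. */
sketch_result sketch_report_error(sketch_result error,
                                  const char *file, int line)
{
   fprintf(stderr, "%s:%d: error %d\n", file, line, (int)error);
   if (sketch_should_abort(error))
      abort();
   return error;
}
```

With ANV_ABORT_ON_DEVICE_LOSS unset, sketch_report_error() just logs and
returns the error, which corresponds to the default behavior described in the
commit message; setting the variable to 1 turns a reported device loss into a
process abort.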
