[Mesa-dev] [PATCH RFC] intel/tools: new intel_sanitize_gpu tool

Wed Feb 7 08:33:13 UTC 2018

HI Scott,

 I like the idea of making this a pre-loadable .so as it lets it get reused for anything using the kernel interface using GEM's. However, there are a few bits, that if addressed, would make this perfect IMO:

  1. It needs a way of being able to label the GEMs (brw_bufmgr.c provides this). The current code below will just tell us which ioctl had a bad thing happen to a GEM BO, but it won't tell us which GEM easily. Admittedly a debug session will tell us the handle of the GEM, but then it is quite icky to figure what the GEM BO was for (OpenGL BO, scratch BO, etc.)

 2. In an ideal world, the .so will work perfectly with other pre-loadable .so's that intercept ioctl as well (for example aub or the batchbuffer logger I made); I freely admit that I am a touch fuzzy on the LD_PRELOAD rules, but it looks like one cannot use this tool at the same time as any other tool that tries to intercept ioctl's (though to be fair, those tools as well don't work with others either I think).

However, if one moves away from requiring at as an .so, but instead use it as something to allocate (and string-label) GEM BO's, then it could work fine; the interface would be something like:

uint32_t //allocate with padding
intel_allocate_gem(int fd, uint64_t size, const char *gem_label);

uint32_t //delete gem created with intel_allocate_gem()
intel_delete_gem_bo(int fd, uint32_t gem_handle);

void //needed when GEM BO's are reused for different purposes
intel_change_gem_label(int fd, uint32_t gem_handle, const char *new_gem_label);

bool //returns true if gem padding is good OR if gem was not created with intel_allocate_gem()
intel_gem_padding_good(int fd, uint32_t gem_handle);

void
intel_execbuffer2(int fd, struct drm_i915_gem_execbuffer2 *p); 

void
intel_execbuffer2_wr(int fd, struct drm_i915_gem_execbuffer2 *p);

The implementation would have a map keyed by the pair (fd, gem_handle) with value of size of GEM and padding size. If the debug flag for checking is down, then all the lookups and necessary locking is skipped so the thing would have (almost) no overhead when the option was not active.

One question that pops into my mind: is it possible to use brw_bufmgr of i965 in anv? If so, then we get reuse with anv for free.

-Kevin

-----Original Message-----
From: Phillips, Scott D 
Sent: Wednesday, February 7, 2018 2:35 AM
To: mesa-dev at lists.freedesktop.org; Rogovin, Kevin <kevin.rogovin at intel.com>
Subject: [PATCH RFC] intel/tools: new intel_sanitize_gpu tool

From: Kevin Rogovin <kevin.rogovin at intel.com>

Adds a new debug tool to pad each GEM BO allocated with (weak) pseudo-random noise values which are then checked after each batchbuffer dispatch to the kernel. This can be quite valuable to find diffucult to track down heisenberg style bugs.

[scott.d.phillips at intel.com: split to separate tool]
---
Hi Kevin, here's another take on your out-of-bounds write detection code, packaged as a separate tool instead of in the driver code. Doing it this way could make it easier to use across i965 and anv (although USERPTR is needed for anv support). The build bits of this are less than half baked at the moment. What do you think?

 src/intel/tools/intel_sanitize_gpu.c   | 306 +++++++++++++++++++++++++++++++++
 src/intel/tools/meson.build            |  10 ++
 src/intel/tools/run_intel_sanitize_gpu |   5 +
 3 files changed, 321 insertions(+)
 create mode 100644 src/intel/tools/intel_sanitize_gpu.c
 create mode 100755 src/intel/tools/run_intel_sanitize_gpu

diff --git a/src/intel/tools/intel_sanitize_gpu.c b/src/intel/tools/intel_sanitize_gpu.c
new file mode 100644
index 00000000000..d74492bb02e
--- /dev/null
+++ b/src/intel/tools/intel_sanitize_gpu.c
@@ -0,0 +1,306 @@
+/*
+ * Copyright © 2015-2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person 
+obtaining a
+ * copy of this software and associated documentation files (the 
+"Software"),
+ * to deal in the Software without restriction, including without 
+limitation
+ * the rights to use, copy, modify, merge, publish, distribute, 
+sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom 
+the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the 
+next
+ * paragraph) shall be included in all copies or substantial portions 
+of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
+EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
+MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT 
+SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR 
+OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
+ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
+DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#undef _FILE_OFFSET_BITS /* prevent #define open open64 */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdarg.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/mman.h>
+#include <sys/sysmacros.h>
+#include <dlfcn.h>
+#include <i915_drm.h>
+
+#include "util/hash_table.h"
+#include "util/set.h"
+
+#define INTEL_LOG_TAG "INTEL-SANITIZE-GPU"
+#include "common/intel_log.h"
+#include "common/gen_clflush.h"
+
+static int (*libc_open)(const char *pathname, int flags, mode_t mode); 
+static int (*libc_close)(int fd); static int (*libc_ioctl)(int fd, 
+unsigned long request, void *argp); static int (*libc_fcntl)(int fd, 
+int cmd, int param);
+
+#define DRM_MAJOR 226
+
+/* TODO: we want to make sure that the padding forces
+ * the BO to take another page on the (PP)GTT; 4KB
+ * may or may not be the page size for the BO. Indeed,
+ * depending on GPU, kernel version and GEM size, the
+ * page size can be one of 4KB, 64KB or 2M.
+ */
+#define PADDING_SIZE 4096
+
+static struct set *drm_fds = NULL;
+static struct hash_table *bo_sizes = NULL;
+
+#define IS_DRM_FD(fd) (_mesa_set_search(drm_fds,                               \
+                                        (void*)(uintptr_t)(fd)) != NULL)
+#define BO_SIZE(handle) ({                                                     \
+   struct hash_entry *e = _mesa_hash_table_search(bo_sizes,                    \
+                                                  (void*)(uintptr_t)(handle)); \
+   e ? (uintptr_t)e->data : 0;                                                 \
+})
+
+/* Our goal is not to have noise good enough for cryto,
+ * but instead values that are unique-ish enough that
+ * it is incredibly unlikely that a buffer overwrite
+ * will produce the exact same values.
+ */
+static uint8_t
+next_noise_value(uint8_t prev_noise)
+{
+   uint32_t v = prev_noise;
+   return (v * 103u + 227u) & 0xFF;
+}
+
+static void
+fill_noise_buffer(uint8_t *dst, uint8_t start, uint32_t length) {
+   for(uint32_t i = 0; i < length; ++i) {
+      dst[i] = start;
+      start = next_noise_value(start);
+   }
+}
+
+static bool
+padding_is_good(int fd, uint32_t handle) {
+   struct drm_i915_gem_mmap mmap_arg = {
+      .handle = handle,
+      .offset = BO_SIZE(handle),
+      .size = PADDING_SIZE,
+      .flags = 0,
+   };
+
+   if (!mmap_arg.offset) {
+      intel_logd("Unknown buffer handle %d.", handle);
+      return false;
+   }
+
+   uint8_t *mapped;
+   int ret;
+   uint8_t expected_value;
+
+   ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP, &mmap_arg);
+   if (ret != 0) {
+      intel_logd("Unable to map buffer %d for pad checking.", handle);
+      return false;
+   }
+
+   mapped = (uint8_t*) (uintptr_t) mmap_arg.addr_ptr;
+   /* bah-humbug, we need to see the latest contents and
+    * if the bo is not cache coherent we likely need to
+    * invalidate the cache lines to get it.
+    */
+   gen_invalidate_range(mapped, PADDING_SIZE);
+
+   expected_value = handle & 0xFF;
+   for (uint32_t i = 0; i < PADDING_SIZE; ++i) {
+      if (expected_value != mapped[i]) {
+         munmap(mapped, PADDING_SIZE);
+         return false;
+      }
+      expected_value = next_noise_value(expected_value);
+   }
+   munmap(mapped, PADDING_SIZE);
+
+   return true;
+}
+
+static int
+create_with_padding(int fd, struct drm_i915_gem_create *create) {
+   create->size += PADDING_SIZE;
+   int ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, create);
+   create->size -= PADDING_SIZE;
+
+   if (ret != 0)
+      return ret;
+
+   uint8_t *noise_values;
+   struct drm_i915_gem_mmap mmap_arg = {
+      .handle = create->handle,
+      .offset = create->size,
+      .size = PADDING_SIZE,
+      .flags = 0,
+   };
+
+   ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_MMAP, &mmap_arg);
+   if (ret != 0)
+      return 0;
+
+   noise_values = (uint8_t*) (uintptr_t) mmap_arg.addr_ptr;
+   fill_noise_buffer(noise_values, create->handle & 0xFF,
+                     PADDING_SIZE);
+   munmap(noise_values, PADDING_SIZE);
+
+   _mesa_hash_table_insert(bo_sizes, (void*)(uintptr_t)create->handle,
+                           (void*)(uintptr_t)create->size);
+
+   return 0;
+}
+
+static int
+exec_and_check_padding(int fd, unsigned long request,
+                       struct drm_i915_gem_execbuffer2 *exec) {
+   int ret = libc_ioctl(fd, request, exec);
+   if (ret != 0)
+      return ret;
+
+   struct drm_i915_gem_exec_object2 *objects =
+      (void*)(uintptr_t)exec->buffers_ptr;
+   uint32_t batch_bo = exec->flags & I915_EXEC_BATCH_FIRST ? objects[0].handle :
+      objects[exec->buffer_count - 1].handle;
+
+   struct drm_i915_gem_wait wait = {
+      .bo_handle = batch_bo,
+      .timeout_ns = -1,
+   };
+   ret = libc_ioctl(fd, DRM_IOCTL_I915_GEM_WAIT, &wait);
+   if (ret != 0)
+      return ret;
+
+   bool detected_out_of_bounds_write = false;
+
+   for (int i = 0; i < exec->buffer_count; i++) {
+      uint32_t handle = objects[i].handle;
+
+      if (!padding_is_good(fd, handle)) {
+         detected_out_of_bounds_write = true;
+         intel_loge("Detected buffer out-of-bounds write in bo %d", handle);
+      }
+   }
+
+   if (unlikely(detected_out_of_bounds_write)) {
+      abort();
+   }
+
+   return 0;
+}
+
+__attribute__ ((visibility ("default"))) int open(const char *path, int 
+flags, ...) {
+   va_list args;
+   mode_t mode;
+
+   va_start(args, flags);
+   mode = va_arg(args, int);
+   va_end(args);
+
+   int fd = libc_open(path, flags, mode);
+
+   if (strcmp(path, "/dev/dri/renderD128") == 0)
+      _mesa_set_add(drm_fds, (void*)(uintptr_t)fd);
+
+   return fd;
+}
+
+__attribute__ ((visibility ("default"), alias ("open"))) int 
+open64(const char *path, int flags, ...);
+
+__attribute__ ((visibility ("default"))) int close(int fd) {
+   struct set_entry *entry = _mesa_set_search(drm_fds, (void*)(uintptr_t)fd);
+   if (entry)
+      _mesa_set_remove(drm_fds, entry);
+
+   return libc_close(fd);
+}
+
+__attribute__ ((visibility ("default"))) int fcntl(int fd, int cmd, 
+...) {
+   va_list args;
+   int param;
+
+   va_start(args, cmd);
+   param = va_arg(args, int);
+   va_end(args);
+
+   int res = libc_fcntl(fd, cmd, param);
+
+   if (IS_DRM_FD(fd) && cmd == F_DUPFD_CLOEXEC)
+      _mesa_set_add(drm_fds, (void*)(uintptr_t)fd);
+
+   return res;
+}
+
+__attribute__ ((visibility ("default"))) int ioctl(int fd, unsigned 
+long request, ...) {
+   va_list args;
+   void *argp;
+   struct stat buf;
+
+   va_start(args, request);
+   argp = va_arg(args, void *);
+   va_end(args);
+
+   if (_IOC_TYPE(request) == DRM_IOCTL_BASE &&
+       !IS_DRM_FD(fd) && fstat(fd, &buf) == 0 &&
+       (buf.st_mode & S_IFMT) == S_IFCHR && major(buf.st_rdev) == DRM_MAJOR) {
+      intel_loge("missed drm fd %d", fd);
+      _mesa_set_add(drm_fds, (void*)(uintptr_t)fd);
+   }
+
+   if (IS_DRM_FD(fd)) {
+      switch (request) {
+      case DRM_IOCTL_I915_GEM_CREATE:
+         return create_with_padding(fd, (struct 
+ drm_i915_gem_create*)argp);
+
+      case DRM_IOCTL_I915_GEM_EXECBUFFER2:
+      case DRM_IOCTL_I915_GEM_EXECBUFFER2_WR:
+         return exec_and_check_padding(fd, request,
+                                       (struct 
+ drm_i915_gem_execbuffer2*)argp);
+
+      default:
+         break;
+      }
+   }
+   return libc_ioctl(fd, request, argp); }
+
+static void __attribute__ ((constructor))
+init(void)
+{
+   drm_fds = _mesa_set_create(NULL, _mesa_hash_pointer,
+                              _mesa_key_pointer_equal);
+   bo_sizes = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+                                      _mesa_key_pointer_equal);
+   libc_open = dlsym(RTLD_NEXT, "open");
+   libc_close = dlsym(RTLD_NEXT, "close");
+   libc_fcntl = dlsym(RTLD_NEXT, "fcntl");
+   libc_ioctl = dlsym(RTLD_NEXT, "ioctl"); }
diff --git a/src/intel/tools/meson.build b/src/intel/tools/meson.build index 7890eeca1dc..92efc72d54a 100644
--- a/src/intel/tools/meson.build
+++ b/src/intel/tools/meson.build
@@ -39,3 +39,13 @@ aubinator_error_decode = executable(
   c_args : [c_vis_args, no_override_init_args],
   build_by_default : false,
 )
+
+intel_sanitize_gpu_mod = shared_module(
+  'intel_sanitize_gpu',
+  files('intel_sanitize_gpu.c'),
+  dependencies : [dep_dl],
+  include_directories : [inc_common, inc_intel, inc_drm_uapi],
+  link_with : [libintel_common, libmesa_util],
+  c_args : [c_vis_args, no_override_init_args],
+  build_by_default : false,
+)
diff --git a/src/intel/tools/run_intel_sanitize_gpu b/src/intel/tools/run_intel_sanitize_gpu
new file mode 100755
index 00000000000..dc1be130b4d
--- /dev/null
+++ b/src/intel/tools/run_intel_sanitize_gpu
@@ -0,0 +1,5 @@
+#!/bin/bash
+# -*- mode: sh -*-
+
+LD_PRELOAD=${PWD}/intel_sanitize_gpu.so${LD_PPRELOAD:+:${LD_PRELOAD}} \
+	  exec ./run $@
--
2.14.3