[amd-gfx] [PATCH 1/3] drm/amdgpu: add disable_cu parameter

StDenis, Tom Tom.StDenis at amd.com
Fri Jun 17 13:31:04 UTC 2016


I wonder whether some sort of self-test, like the ring/IB tests we already do, would be a good idea, run either from the UMD or from the KMD.


In this specific case, though, are you working around a CU that causes a GPU lockup, or one that just does not respond correctly?


Tom


________________________________
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Nicolai Hähnle <nhaehnle at gmail.com>
Sent: Friday, June 17, 2016 09:17
To: amd-gfx at lists.freedesktop.org
Cc: Haehnle, Nicolai
Subject: [amd-gfx] [PATCH 1/3] drm/amdgpu: add disable_cu parameter

From: Nicolai Hähnle <nicolai.haehnle at amd.com>

This parameter will allow disabling individual CUs on module load, e.g.
amdgpu.disable_cu=2.0.3,2.0.4 to disable CUs 3 and 4 of SE2.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  4 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 44 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  2 ++
 4 files changed, 51 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 01c36b8..2d35e11 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -87,6 +87,7 @@ extern int amdgpu_sched_hw_submission;
 extern int amdgpu_powerplay;
 extern unsigned amdgpu_pcie_gen_cap;
 extern unsigned amdgpu_pcie_lane_cap;
+extern char *amdgpu_disable_cu;

 #define AMDGPU_WAIT_IDLE_TIMEOUT_IN_MS          3000
 #define AMDGPU_MAX_USEC_TIMEOUT                 100000  /* 100 ms */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f888c01..235f732 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -84,6 +84,7 @@ int amdgpu_sched_hw_submission = 2;
 int amdgpu_powerplay = -1;
 unsigned amdgpu_pcie_gen_cap = 0;
 unsigned amdgpu_pcie_lane_cap = 0;
+char *amdgpu_disable_cu = NULL;

 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -168,6 +169,9 @@ module_param_named(pcie_gen_cap, amdgpu_pcie_gen_cap, uint, 0444);
 MODULE_PARM_DESC(pcie_lane_cap, "PCIE Lane Caps (0: autodetect (default))");
 module_param_named(pcie_lane_cap, amdgpu_pcie_lane_cap, uint, 0444);

+MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
+module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
+
 static const struct pci_device_id pciidlist[] = {
 #ifdef CONFIG_DRM_AMDGPU_CIK
         /* Kaveri */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 9f95da4..a074edd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -70,3 +70,47 @@ void amdgpu_gfx_scratch_free(struct amdgpu_device *adev, uint32_t reg)
                 }
         }
 }
+
+/**
+ * amdgpu_gfx_parse_disable_cu - Parse the disable_cu module parameter
+ *
+ * @mask: array in which the per-shader array disable masks will be stored
+ * @max_se: number of SEs
+ * @max_sh: number of SHs
+ *
+ * The bitmask of CUs to be disabled in the shader array determined by se and
+ * sh is stored in mask[se * max_sh + sh].
+ */
+void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se, unsigned max_sh)
+{
+       unsigned se, sh, cu;
+       const char *p;
+
+       memset(mask, 0, sizeof(*mask) * max_se * max_sh);
+
+       if (!amdgpu_disable_cu || !*amdgpu_disable_cu)
+               return;
+
+       p = amdgpu_disable_cu;
+       for (;;) {
+               char *next;
+               int ret = sscanf(p, "%u.%u.%u", &se, &sh, &cu);
+               if (ret < 3) {
+                       DRM_ERROR("amdgpu: could not parse disable_cu\n");
+                       return;
+               }
+
+               if (se < max_se && sh < max_sh && cu < 16) {
+                       DRM_INFO("amdgpu: disabling CU %u.%u.%u\n", se, sh, cu);
+                       mask[se * max_sh + sh] |= 1u << cu;
+               } else {
+                       DRM_ERROR("amdgpu: disable_cu %u.%u.%u is out of range\n",
+                                 se, sh, cu);
+               }
+
+               next = strchr(p, ',');
+               if (!next)
+                       break;
+               p = next + 1;
+       }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index dc06cbd..51321e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -27,4 +27,6 @@
 int amdgpu_gfx_scratch_get(struct amdgpu_device *adev, uint32_t *reg);
 void amdgpu_gfx_scratch_free(struct amdgpu_device *adev, uint32_t reg);

+void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se, unsigned max_sh);
+
 #endif
--
2.7.4
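
For anyone who wants to try the se.sh.cu syntax outside the kernel, here is a minimal userspace sketch of the same parsing loop. It is only an illustration, not part of the patch: the name parse_disable_cu, the explicit string argument, and the return value (number of accepted entries, or -1 on a parse error) are assumptions made for the example. The mask layout mask[se * max_sh + sh] and the cu < 16 bound are taken from the patch above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for amdgpu_gfx_parse_disable_cu().
 * Takes the parameter string as an argument instead of reading a global.
 * Returns the number of accepted se.sh.cu entries, or -1 if an entry
 * cannot be parsed (entries accepted before the error stay in the mask).
 */
static int parse_disable_cu(const char *param, unsigned *mask,
                            unsigned max_se, unsigned max_sh)
{
        unsigned se, sh, cu;
        const char *p = param;
        int accepted = 0;

        assert(mask != NULL);
        memset(mask, 0, sizeof(*mask) * max_se * max_sh);

        /* An unset or empty parameter disables nothing. */
        if (!p || !*p)
                return 0;

        for (;;) {
                const char *next;

                if (sscanf(p, "%u.%u.%u", &se, &sh, &cu) < 3)
                        return -1;

                /* Out-of-range entries are skipped, as in the patch. */
                if (se < max_se && sh < max_sh && cu < 16) {
                        mask[se * max_sh + sh] |= 1u << cu;
                        accepted++;
                }

                /* Entries are comma-separated; stop after the last one. */
                next = strchr(p, ',');
                if (!next)
                        break;
                p = next + 1;
        }

        return accepted;
}
```

With a hypothetical layout of 4 SEs and 1 SH per SE, passing "2.0.3,2.0.4" sets bits 3 and 4 in mask[2] and leaves the other entries at zero, which matches the example in the commit message.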
