[Intel-gfx] [RFC] drm/i915/gt: reduce context clear batch size to avoid gpu hang (rev2)

rwright at hpe.com rwright at hpe.com
Sat Oct 10 20:11:10 UTC 2020


The first version of this RFC patch caused a build error when, to my
surprise, it was automatically built.  I had presumed an RFC message
would be for comment only, and so I had pasted part of the patch,
thereby breaking whitespace.  In this version, I have included the
patch directly without pasting, so it should apply.  I have also
included a drm_dbg message that was omitted from v1.

For several months, I've been experiencing GPU hangs when starting
Cinnamon on an HP Pavilion Mini 300-020 if I try to run an upstream
kernel.  I reported this recently in
https://gitlab.freedesktop.org/drm/intel/-/issues/2413 where I have
attached the requested evidence including the state collected from
/sys/class/drm/card0/error and debug output from dmesg.

I got around to running a bisect to find the problem, which indicates:

  [47f8253d2b8947d79fd3196bf96c1959c0f25f20] drm/i915/gen7: Clear all EU/L3 residual contexts

While I'm experienced in several areas of the Linux kernel, I'm really
nothing but an end user of the graphics drivers.  But the nature of that
troublesome commit suggested to me that reducing the batch size used in
the context clear operation might help this relatively low-powered
system avoid the hang, and it did!  I simply forced this system to
take the smaller batch length that is already used for non-Haswell
systems.

I'm calling this patch an RFC because this version is quick-and-dirty,
affecting only one file.  If this makes sense, I have a cleaner version
that keys off of a proper quirk, but let's discuss the idea before
looking at that.  Maybe it doesn't need a new quirk?  Maybe there is
already something distinctive on which the decision could be made?
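
For discussion, a table-driven lookup is one shape such a quirk could
take.  The following is only a standalone userspace sketch, not the
cleaner version mentioned above and not an existing i915 interface: the
names subsys_id, reduced_batch_quirks and needs_reduced_batch are
hypothetical, and the only entry is the subsystem vendor/device pair
that the patch below matches on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the subsystem IDs the patch reads from struct pci_dev. */
struct subsys_id {
	uint16_t vendor;
	uint16_t device;
};

/* Same value as PCI_VENDOR_ID_HP in the kernel's pci_ids.h. */
#define VENDOR_ID_HP 0x103c

/* Hypothetical quirk table: systems that should take the reduced batch. */
static const struct subsys_id reduced_batch_quirks[] = {
	{ VENDOR_ID_HP, 0x2b38 },	/* pair matched by the patch below */
};

/* Return true if the given subsystem IDs appear in the quirk table. */
static bool needs_reduced_batch(uint16_t vendor, uint16_t device)
{
	size_t n = sizeof(reduced_batch_quirks) / sizeof(reduced_batch_quirks[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (reduced_batch_quirks[i].vendor == vendor &&
		    reduced_batch_quirks[i].device == device)
			return true;
	}
	return false;
}
```

With something like this, the Haswell branch would be taken only when
needs_reduced_batch() returns false for the device's subsystem IDs, and
further affected systems could be added as table entries rather than as
extra conditions.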

Signed-off-by: Randy Wright <rwright at hpe.com>
---
 drivers/gpu/drm/i915/gt/gen7_renderclear.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
index d93d85cd3027..96bc09bc41f2 100644
--- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
+++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
@@ -49,7 +49,11 @@ struct batch_vals {
 static void
 batch_get_defaults(struct drm_i915_private *i915, struct batch_vals *bv)
 {
-	if (IS_HASWELL(i915)) {
+	struct pci_dev *d = i915->drm.pdev;
+	int force_reduced = (d->subsystem_vendor == PCI_VENDOR_ID_HP
+			  && d->subsystem_device == 0x2b38);
+
+	if (IS_HASWELL(i915) && !force_reduced) {
 		bv->max_primitives = 280;
 		bv->max_urb_entries = MAX_URB_ENTRIES;
 		bv->surface_height = 16 * 16;
@@ -60,6 +64,8 @@ batch_get_defaults(struct drm_i915_private *i915, struct batch_vals *bv)
 		bv->surface_height = 16 * 8;
 		bv->surface_width = 32 * 16;
 	}
+	drm_dbg(&i915->drm, "force_reduced=%d max_primitives=%d\n",
+			     force_reduced, bv->max_primitives);
 	bv->cmd_size = bv->max_primitives * 4096;
 	bv->state_size = STATE_SIZE;
 	bv->state_start = bv->cmd_size;
-- 
2.25.1



--
Randy Wright            Usmail: Hewlett Packard Enterprise
Email: rwright at hpe.com          Servers Linux Enablement
Phone: (970) 898-0998           3404 E. Harmony Rd, Mailstop 36
                                Fort Collins, CO 80528-9599 