[Intel-gfx] [RFC] drm/i915/gt: reduce context clear batch size to avoid gpu hang

rwright at hpe.com rwright at hpe.com
Sun Oct 4 20:36:31 UTC 2020


Hello,

For several months, I've been experiencing GPU hangs when starting
Cinnamon on an HP Pavilion Mini 300-020 whenever I run an upstream
kernel.  I recently reported this in
https://gitlab.freedesktop.org/drm/intel/-/issues/2413
where I attached the requested evidence, including the state collected
from /sys/class/drm/card0/error and debug output from dmesg.

I ran a bisect to find the culprit, and it points to this commit:

  [47f8253d2b8947d79fd3196bf96c1959c0f25f20] drm/i915/gen7: Clear all EU/L3 residual contexts

While I'm experienced in several areas of the Linux kernel, I'm really
nothing but an end user of the graphics drivers.  But the nature of that
troublesome commit suggested to me that reducing the batch size used in
the context clear operation might help this relatively low-powered
system to avoid the hang... and it did!  I simply forced this system to
take the smaller batch length that is already used for non-Haswell
systems.

I'm calling this patch an RFC because this version is quick-and-dirty,
affecting only one file.  If this makes sense, I have a cleaner version
that keys off a proper quirk, but let's discuss the idea first before
looking at that.  Maybe it doesn't need a new quirk?  Maybe there is
already something distinctive on which the decision could be made?
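
For discussion, the quirk idea can be sketched outside the driver as a
small table lookup on the PCI subsystem IDs.  This is only an
illustration under stated assumptions, not the cleaner version mentioned
above and not existing i915 code: the names clear_batch_quirk,
reduced_batch_quirks, needs_reduced_clear_batch and use_full_clear_batch
are hypothetical, and the only table entry is the subsystem ID pair that
the patch below tests for.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as PCI_VENDOR_ID_HP in the kernel's include/linux/pci_ids.h. */
#define VENDOR_ID_HP 0x103c

/* Hypothetical quirk entry: one PCI subsystem ID pair per affected machine. */
struct clear_batch_quirk {
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
};

/* Only the ID pair checked by the RFC patch is listed here. */
static const struct clear_batch_quirk reduced_batch_quirks[] = {
	{ VENDOR_ID_HP, 0x2b38 },
};

/* Return true if the subsystem IDs match an entry in the quirk table. */
static bool needs_reduced_clear_batch(uint16_t vendor, uint16_t device)
{
	size_t n = sizeof(reduced_batch_quirks) / sizeof(reduced_batch_quirks[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (reduced_batch_quirks[i].subsystem_vendor == vendor &&
		    reduced_batch_quirks[i].subsystem_device == device)
			return true;
	}
	return false;
}

/*
 * Mirrors the condition in the patch: the larger Haswell batch values
 * are used only on Haswell systems that are not in the quirk table.
 */
static bool use_full_clear_batch(bool is_haswell, uint16_t vendor,
				 uint16_t device)
{
	return is_haswell && !needs_reduced_clear_batch(vendor, device);
}
```

With a table like this, supporting another affected machine would mean
adding one entry rather than another condition in batch_get_defaults().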

diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
index d93d85cd3027..6d24e266cda2 100644
--- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
+++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
@@ -49,7 +49,11 @@ struct batch_vals {
 static void
 batch_get_defaults(struct drm_i915_private *i915, struct batch_vals *bv)
 {
-       if (IS_HASWELL(i915)) {
+        struct pci_dev *d = i915->drm.pdev;
+        bool force_reduced = d->subsystem_vendor == PCI_VENDOR_ID_HP &&
+                             d->subsystem_device == 0x2b38;
+
+       if (IS_HASWELL(i915) && !force_reduced) {
                bv->max_primitives = 280;
                bv->max_urb_entries = MAX_URB_ENTRIES;
                bv->surface_height = 16 * 16;

-- 
Randy Wright            Usmail: Hewlett Packard Enterprise
Email: rwright at hpe.com          Servers Linux Enablement
Phone: (970) 898-0998           3404 E. Harmony Rd, Mailstop 36
                                Fort Collins, CO 80528-9599 

