[PATCH 15/15] cpuhp bug workaround

Tvrtko Ursulin tursulin at ursulin.net
Thu Oct 19 17:48:46 UTC 2017

From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Register, and then immediately remove, a dummy hotplug notifier right
after the real one.

This works around the sticky st->node in cpuhp: if the last
registration has set it, it persists and confuses cpuhp_thread_fun on
the following hotplug event.

cpuhp_thread_fun will call cpuhp_invoke_callback with this sticky
st->node passed in as node, which makes it run the path for a single
add/remove.

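The failure mode described above can be sketched with a small userspace
model. This is only an illustration of the description in this message,
not kernel code: every model_* name is hypothetical, and the real cpuhp
structures and control flow are more involved.

```c
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's real data structures. */
struct model_node {
	int id;
};

struct model_cpuhp_state {
	struct model_node *node;	/* stands in for st->node */
};

enum model_path {
	MODEL_PATH_ALL_INSTANCES,	/* normal hotplug event handling */
	MODEL_PATH_SINGLE_INSTANCE,	/* path meant for one add/remove */
};

/* Registering an instance records the node and never clears it again. */
static void model_add_instance(struct model_cpuhp_state *st,
			       struct model_node *node)
{
	st->node = node;
}

/* A later hotplug event picks its path from whatever st->node holds. */
static enum model_path model_hotplug_event(const struct model_cpuhp_state *st)
{
	return st->node ? MODEL_PATH_SINGLE_INSTANCE
			: MODEL_PATH_ALL_INSTANCES;
}
```

In this model, any hotplug event that follows model_add_instance() takes
the single-instance path, which mirrors the confusion described above.
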
Registering and unregistering a dummy notifier as the last steps
ensures st->node is NULL, regardless of the registration order at
boot, and especially after module reload. In the latter case i915 is
guaranteed to be the last to register and so would trigger the sticky
st->node problem, crashing the machine (if the planets are correctly
aligned) or, at a minimum, risking silent data corruption.
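
The effect of the workaround can be modelled the same way. The sketch
below assumes, as this message states, that a plain (non-instance)
registration leaves st->node NULL. All wa_* names are hypothetical and
exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical userspace model of the workaround; names are illustrative. */
struct wa_node {
	int id;
};

struct wa_cpuhp_state {
	struct wa_node *node;	/* stands in for st->node */
	int registered;		/* live registrations in this model */
};

/* Multi-instance registration: leaves its node behind in the state. */
static void wa_add_instance(struct wa_cpuhp_state *st, struct wa_node *node)
{
	st->node = node;
	st->registered++;
}

/* Plain registration: assumed to run with node == NULL (see lead-in). */
static int wa_setup_dummy(struct wa_cpuhp_state *st)
{
	st->node = NULL;
	st->registered++;
	return st->registered;	/* positive handle, like a dynamic slot */
}

static void wa_remove_dummy(struct wa_cpuhp_state *st)
{
	st->registered--;
}

/* Real registration followed immediately by the dummy add/remove pair. */
static void wa_register_with_workaround(struct wa_cpuhp_state *st,
					struct wa_node *node)
{
	wa_add_instance(st, node);
	if (wa_setup_dummy(st) > 0)
		wa_remove_dummy(st);
}
```

After wa_register_with_workaround() the modelled st->node is NULL no
matter which instance registered last, which is the property the dummy
notifier is meant to provide.
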

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
---
 drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index e2ce66159041..a39ae60c36d0 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -777,6 +777,11 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
+static int i915_pmu_cpu_tmp_wa(unsigned int cpu)
+{
+	return 0;
+}
+
 static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
 static int i915_pmu_register_cpuhp_state(struct drm_i915_private *i915)
@@ -803,6 +808,13 @@ static int i915_pmu_register_cpuhp_state(struct drm_i915_private *i915)
 	cpuhp_slot = slot;
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
+				"perf/x86/intel/i915:tmpwa",
+				i915_pmu_cpu_tmp_wa, i915_pmu_cpu_tmp_wa);
+	WARN_ON(ret < 0);
+	if (ret > 0)
+		cpuhp_remove_state(ret);
 	return 0;
