[PATCH 6/7] drm/i915/pmu: Lazy unregister

Tue Jul 23 15:30:08 UTC 2024

On Tue, Jul 23, 2024 at 09:03:25AM GMT, Tvrtko Ursulin wrote:
>
>On 22/07/2024 22:06, Lucas De Marchi wrote:
>>Instead of calling perf_pmu_unregister() when unbinding, defer that to
>>the destruction of i915 object. Since perf itself holds a reference in
>>the event, this only happens when all events are gone, which guarantees
>>i915 is not unregistering the pmu with live events.
>>
>>Previously, running the following sequence would crash the system after
>>~2 tries:
>>
>>	1) bind device to i915
>>	2) wait for events to show up on sysfs
>>	3) start perf stat -I 1000 -e i915/rcs0-busy/
>>	4) unbind driver
>>	5) kill perf
>>
>>Most of the time this crashes in perf_pmu_disable() while accessing the
>>percpu pmu_disable_count. This happens because perf_pmu_unregister()
>>destroys it with free_percpu(pmu->pmu_disable_count).
>>
>>With a lazy unbind, the pmu is only unregistered after (5) as opposed to
>>after (4). The downside is that if a new bind operation is attempted for
>>the same device/driver without killing the perf process, i915 will fail
>>to register the pmu (but still load successfully). This seems better
>>than completely crashing the system.
>
>So this effectively allows unbind to succeed without fully unbinding the
>driver from the device? That sounds like a significant drawback and if
>so, I wonder if a more complicated solution wouldn't be better after
>all. Or is there precedent for allowing userspace to keep its paws
>on unbound devices in this way?

Keeping the resources alive but "unplugged" after the hardware has
disappeared is a common thing to do... it's the whole point of
drmm-managed resources, for example. If you bind the driver and then
unbind it while userspace is holding a ref, the next bind will come up
with a different card number. A similar thing could be done here by
adjusting the name of the event - currently we add the mangled
PCI slot.
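
As a rough illustration of that lifetime rule, here is a userspace model
of the lazy unregister. It is not i915 or perf code: every name below
(lazy_pmu and friends) is made up for the example, and the plain int
refcount stands in for whatever reference the real objects hold.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: none of these names exist in i915 or perf core. */
struct lazy_pmu {
	int refcount;     /* one ref for the driver binding, one per open event */
	bool closed;      /* set at unbind so callbacks can bail out early */
	bool registered;  /* still known to the (modelled) perf core */
};

static void lazy_pmu_init(struct lazy_pmu *pmu)
{
	pmu->refcount = 1;        /* held by the bound driver */
	pmu->closed = false;
	pmu->registered = true;
}

/* Stands in for the final release: unregister only when nobody can use it. */
static void lazy_pmu_release(struct lazy_pmu *pmu)
{
	pmu->registered = false;
}

static void lazy_pmu_put(struct lazy_pmu *pmu)
{
	assert(pmu->refcount > 0);
	if (--pmu->refcount == 0)
		lazy_pmu_release(pmu);
}

/* Each open event pins the pmu until it is closed. */
static void lazy_event_open(struct lazy_pmu *pmu)
{
	pmu->refcount++;
}

static void lazy_event_close(struct lazy_pmu *pmu)
{
	lazy_pmu_put(pmu);
}

/* Unbind marks the pmu closed and drops only the driver's own reference. */
static void lazy_unbind(struct lazy_pmu *pmu)
{
	pmu->closed = true;
	lazy_pmu_put(pmu);
}
```

With one event open, lazy_unbind() leaves the pmu registered but closed;
only the later lazy_event_close() drops the last reference and
unregisters it. That is the ordering between steps (4) and (5) in the
commit message.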

That said, I agree a better approach would be to allow
perf_pmu_unregister() to do its job even when there are open events. On
top of that (or as a way to help achieve that), make perf core replace
the callbacks with stubs when pmu is unregistered - that would even kill
the need for i915's checks on pmu->closed (and fix the lack thereof in
other drivers).
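
To make the stub idea concrete, a minimal userspace sketch follows. It
only models the pattern: sketch_pmu, sketch_ops and the functions around
them are hypothetical names, not perf core API, and a real version would
also have to deal with callbacks that are already running when the swap
happens.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: not the real struct pmu or perf_event. */
struct sketch_event {
	unsigned long count;
};

struct sketch_ops {
	int  (*event_init)(struct sketch_event *ev);
	void (*read)(struct sketch_event *ev);
};

struct sketch_pmu {
	struct sketch_ops ops;
};

/* "Driver" callbacks: valid only while the device is bound. */
static int drv_event_init(struct sketch_event *ev)
{
	ev->count = 0;
	return 0;
}

static void drv_read(struct sketch_event *ev)
{
	ev->count++;
}

/* Stubs owned by the "core": safe to call after the driver is gone. */
static int stub_event_init(struct sketch_event *ev)
{
	(void)ev;
	return -ENODEV;
}

static void stub_read(struct sketch_event *ev)
{
	(void)ev;  /* leave the last value untouched */
}

static void sketch_register(struct sketch_pmu *pmu)
{
	pmu->ops.event_init = drv_event_init;
	pmu->ops.read = drv_read;
}

/* On unregister the core swaps in stubs, so open events never reach
 * the driver again and the driver needs no "closed" checks of its own. */
static void sketch_unregister(struct sketch_pmu *pmu)
{
	assert(pmu != NULL);
	pmu->ops.event_init = stub_event_init;
	pmu->ops.read = stub_read;
}
```

After sketch_unregister(), an event that is still open keeps calling
through pmu->ops but only ever reaches the stubs, and new events fail
with -ENODEV.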

It can be a can of worms though and may get pushback from perf core
maintainers, so it'd be good to have their feedback.

thanks
Lucas De Marchi

>
>Regards,
>
>Tvrtko
>
>>
>>Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>>---
>>  drivers/gpu/drm/i915/i915_pmu.c | 24 +++++++++---------------
>>  1 file changed, 9 insertions(+), 15 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
>>index 8708f905f4f4..df53a8fe53ec 100644
>>--- a/drivers/gpu/drm/i915/i915_pmu.c
>>+++ b/drivers/gpu/drm/i915/i915_pmu.c
>>@@ -1158,18 +1158,21 @@ static void free_pmu(struct drm_device *dev, void *res)
>>  	struct i915_pmu *pmu = res;
>>  	struct drm_i915_private *i915 = pmu_to_i915(pmu);
>>+	perf_pmu_unregister(&pmu->base);
>>  	free_event_attributes(pmu);
>>  	kfree(pmu->base.attr_groups);
>>  	if (IS_DGFX(i915))
>>  		kfree(pmu->name);
>>+
>>+	/*
>>+	 * Make sure all currently running (but shortcut on pmu->closed) are
>>+	 * gone before proceeding with free'ing the pmu object embedded in i915.
>>+	 */
>>+	synchronize_rcu();
>>  }
>>  static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
>>  {
>>-	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>-
>>-	GEM_BUG_ON(!pmu->base.event_init);
>>-
>>  	/* Select the first online CPU as a designated reader. */
>>  	if (cpumask_empty(&i915_pmu_cpumask))
>>  		cpumask_set_cpu(cpu, &i915_pmu_cpumask);
>>@@ -1182,8 +1185,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
>>  	struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
>>  	unsigned int target = i915_pmu_target_cpu;
>>-	GEM_BUG_ON(!pmu->base.event_init);
>>-
>>  	/*
>>  	 * Unregistering an instance generates a CPU offline event which we must
>>  	 * ignore to avoid incorrectly modifying the shared i915_pmu_cpumask.
>>@@ -1337,21 +1338,14 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
>>  {
>>  	struct i915_pmu *pmu = &i915->pmu;
>>-	if (!pmu->base.event_init)
>>-		return;
>>-
>>  	/*
>>-	 * "Disconnect" the PMU callbacks - since all are atomic synchronize_rcu
>>-	 * ensures all currently executing ones will have exited before we
>>-	 * proceed with unregistration.
>>+	 * "Disconnect" the PMU callbacks - unregistering the pmu will be done
>>+	 * later when all currently open events are gone
>>  	 */
>>  	pmu->closed = true;
>>-	synchronize_rcu();
>>  	hrtimer_cancel(&pmu->timer);
>>-
>>  	i915_pmu_unregister_cpuhp_state(pmu);
>>-	perf_pmu_unregister(&pmu->base);
>>  	pmu->base.event_init = NULL;
>>  }