[PATCH 1/3] drm/i915: Fix context IDs not released on driver hot unbind

Janusz Krzysztofik janusz.krzysztofik at linux.intel.com
Thu May 2 10:56:49 UTC 2019


From: Janusz Krzysztofik <janusz.krzysztofik at intel.com>

If the driver is unbound while a device is open, a kernel panic may be
forced if the list of allocated context IDs is not empty.

When a device is open, the list may be non-empty because a context ID,
once obtained from the context ID allocator for a context associated
with that open file descriptor, is not released until the device is
closed.

On the other hand, all allocated context IDs must be released and the
context ID allocator destroyed on driver unbind, even if a device is
open, in order to free the memory they consume and prevent memory
leaks.  The purpose of the forced kernel panic was to protect the
context ID allocator from being silently destroyed while some
allocated IDs had not been released.

Since we have recently prevented users from accessing device resources
as soon as the device is unregistered, we may now free those resources
even if a user keeps the device open.

Before forcing the kernel panic on a non-empty list of allocated
context IDs, force it on a non-empty list of contexts that should have
been freed by the preceding drain of the work queue (there must be
another bug if that list is not empty).  If that list is empty, we may
assume that the remaining contexts are idle (not pinned) and that
their IDs can be safely released as long as they are not going to be
used anymore.

Once done, verify that device resources are protected from unwanted
user access by checking that the device is marked unplugged, then
release the context ID of each remaining context, unless a context
unexpectedly turns out to be pinned.  Force a kernel panic in that
case, as there must be yet another bug in the driver code.

Now the kernel panic protecting the allocator should not trigger, as
the list it checks should be empty.  In the unlikely event that it is
not empty, there must be yet another bug.

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik at intel.com>
---
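The order of checks described above can be sketched as a small
userspace model.  This is only an illustration under stated
assumptions, not the driver code: struct model_ctx, struct model_dev,
enum fini_result and model_contexts_fini() are hypothetical names, the
deferred-free list is reduced to a counter, and the model returns an
error code at each point where the patch uses GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration, not the i915 structures. */
struct model_ctx {
	int hw_id;              /* allocated ID, or -1 once released */
	int pin_count;          /* non-zero while the ID may be in use */
	struct model_ctx *next; /* link in the list of allocated IDs */
};

struct model_dev {
	bool unplugged;               /* device marked unplugged */
	size_t deferred_free;         /* contexts still awaiting deferred free */
	struct model_ctx *hw_id_list; /* contexts currently holding an ID */
};

enum fini_result {
	FINI_OK,
	FINI_BUG_DEFERRED, /* deferred-free list was not drained */
	FINI_BUG_PINNED,   /* a remaining context is still pinned */
	FINI_BUG_LEAKED,   /* IDs still allocated at allocator destroy */
};

static enum fini_result model_contexts_fini(struct model_dev *dev)
{
	/* 1. The preceding work queue drain must have freed these. */
	if (dev->deferred_free)
		return FINI_BUG_DEFERRED;

	/* 2. Only when users can no longer reach the device. */
	if (dev->unplugged) {
		while (dev->hw_id_list) {
			struct model_ctx *ctx = dev->hw_id_list;

			if (ctx->pin_count)
				return FINI_BUG_PINNED;

			/* Unlink the context and give its ID back. */
			dev->hw_id_list = ctx->next;
			ctx->next = NULL;
			ctx->hw_id = -1;
		}
	}

	/* 3. The check that protects the allocator itself. */
	if (dev->hw_id_list)
		return FINI_BUG_LEAKED;

	return FINI_OK;
}
```

In this model, a device that is not marked unplugged but still has
contexts holding IDs reaches the final check unchanged, so the
protection of the allocator is kept for the regular unload path.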
 drivers/gpu/drm/i915/i915_gem_context.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 280813a4bf82..7b3c027de1a0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -611,6 +611,8 @@ void i915_gem_contexts_lost(struct drm_i915_private *dev_priv)
 
 void i915_gem_contexts_fini(struct drm_i915_private *i915)
 {
+	struct i915_gem_context *ctx, *cn;
+
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
 	if (i915->preempt_context)
@@ -618,6 +620,17 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
 	destroy_kernel_context(&i915->kernel_context);
 
 	/* Must free all deferred contexts (via flush_workqueue) first */
+	GEM_BUG_ON(!llist_empty(&i915->contexts.free_list));
+
+	if (drm_dev_is_unplugged(&i915->drm)) {
+		/* Release all remaining contexts */
+		list_for_each_entry_safe(ctx, cn, &i915->contexts.hw_id_list,
+					 hw_id_link) {
+			GEM_BUG_ON(atomic_read(&ctx->hw_id_pin_count));
+			release_hw_id(ctx);
+		}
+	}
+
 	GEM_BUG_ON(!list_empty(&i915->contexts.hw_id_list));
 	ida_destroy(&i915->contexts.hw_ida);
 }
-- 
2.20.1


