[Intel-gfx] [PATCH] drm/i915: Avoid dereferencing a dead context

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Apr 29 13:42:44 UTC 2020


On 28/04/2020 10:02, Chris Wilson wrote:
> Once the intel_context is closed, the GEM context may be freed and so
> the link from intel_context.gem_context is invalid.
> 
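The hunk this patch deletes, annotated (the code is taken verbatim from
the diff below; the comments are editorial):

	rcu_read_lock();
	ctx = rcu_dereference(request->context->gem_context);
	if (ctx)	/* may be non-NULL yet stale: close does not clear it */
		erq->pid = pid_nr(ctx->pid);	/* use-after-free */
	rcu_read_unlock();

rcu_read_lock() only keeps the GEM context alive for readers that fetch
the pointer before its kfree_rcu() grace period expires; once the
intel_context has been closed and the grace period has passed, the stale
gem_context link points at freed memory, hence the
intel_context_is_closed() guard added below.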
> <3>[  219.782944] BUG: KASAN: use-after-free in intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <3>[  219.782996] Read of size 8 at addr ffff8881d7dff0b8 by task kworker/0:1/12
> 
> <4>[  219.783052] CPU: 0 PID: 12 Comm: kworker/0:1 Tainted: G     U            5.7.0-rc2-g1f3ffd7683d54-kasan_118+ #1
> <4>[  219.783055] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
> <4>[  219.783105] Workqueue: events heartbeat [i915]
> <4>[  219.783109] Call Trace:
> <4>[  219.783113]  <IRQ>
> <4>[  219.783119]  dump_stack+0x96/0xdb
> <4>[  219.783177]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <4>[  219.783182]  print_address_description.constprop.6+0x16/0x310
> <4>[  219.783239]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <4>[  219.783295]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <4>[  219.783300]  __kasan_report+0x137/0x190
> <4>[  219.783359]  ? intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <4>[  219.783366]  kasan_report+0x32/0x50
> <4>[  219.783426]  intel_engine_coredump_alloc+0x1bc3/0x2250 [i915]
> <4>[  219.783481]  execlists_reset+0x39c/0x13d0 [i915]
> <4>[  219.783494]  ? mark_held_locks+0x9e/0xe0
> <4>[  219.783546]  ? execlists_hold+0xfc0/0xfc0 [i915]
> <4>[  219.783551]  ? lockdep_hardirqs_on+0x348/0x5f0
> <4>[  219.783557]  ? _raw_spin_unlock_irqrestore+0x34/0x60
> <4>[  219.783606]  ? execlists_submission_tasklet+0x118/0x3a0 [i915]
> <4>[  219.783615]  tasklet_action_common.isra.14+0x13b/0x410
> <4>[  219.783623]  ? __do_softirq+0x1e4/0x9a7
> <4>[  219.783630]  __do_softirq+0x226/0x9a7
> <4>[  219.783643]  do_softirq_own_stack+0x2a/0x40
> <4>[  219.783647]  </IRQ>
> <4>[  219.783692]  ? heartbeat+0x3e2/0x10f0 [i915]
> <4>[  219.783696]  do_softirq.part.13+0x49/0x50
> <4>[  219.783700]  __local_bh_enable_ip+0x1a2/0x1e0
> <4>[  219.783748]  heartbeat+0x409/0x10f0 [i915]
> <4>[  219.783801]  ? __live_idle_pulse+0x9f0/0x9f0 [i915]
> <4>[  219.783806]  ? lock_acquire+0x1ac/0x8a0
> <4>[  219.783811]  ? process_one_work+0x811/0x1870
> <4>[  219.783827]  ? rcu_read_lock_sched_held+0x9c/0xd0
> <4>[  219.783832]  ? rcu_read_lock_bh_held+0xb0/0xb0
> <4>[  219.783836]  ? _raw_spin_unlock_irq+0x1f/0x40
> <4>[  219.783845]  process_one_work+0x8ca/0x1870
> <4>[  219.783848]  ? lock_acquire+0x1ac/0x8a0
> <4>[  219.783852]  ? worker_thread+0x1d0/0xb80
> <4>[  219.783864]  ? pwq_dec_nr_in_flight+0x2c0/0x2c0
> <4>[  219.783870]  ? do_raw_spin_lock+0x129/0x290
> <4>[  219.783886]  worker_thread+0x82/0xb80
> <4>[  219.783895]  ? __kthread_parkme+0xaf/0x1b0
> <4>[  219.783902]  ? process_one_work+0x1870/0x1870
> <4>[  219.783906]  kthread+0x34e/0x420
> <4>[  219.783911]  ? kthread_create_on_node+0xc0/0xc0
> <4>[  219.783918]  ret_from_fork+0x3a/0x50
> 
> <3>[  219.783950] Allocated by task 1264:
> <4>[  219.783975]  save_stack+0x19/0x40
> <4>[  219.783978]  __kasan_kmalloc.constprop.3+0xa0/0xd0
> <4>[  219.784029]  i915_gem_create_context+0xa2/0xab8 [i915]
> <4>[  219.784081]  i915_gem_context_create_ioctl+0x1fa/0x450 [i915]
> <4>[  219.784085]  drm_ioctl_kernel+0x1d8/0x270
> <4>[  219.784088]  drm_ioctl+0x676/0x930
> <4>[  219.784092]  ksys_ioctl+0xb7/0xe0
> <4>[  219.784096]  __x64_sys_ioctl+0x6a/0xb0
> <4>[  219.784100]  do_syscall_64+0x94/0x530
> <4>[  219.784103]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> 
> <3>[  219.784120] Freed by task 12:
> <4>[  219.784141]  save_stack+0x19/0x40
> <4>[  219.784145]  __kasan_slab_free+0x130/0x180
> <4>[  219.784148]  kmem_cache_free_bulk+0x1bd/0x500
> <4>[  219.784152]  kfree_rcu_work+0x1d8/0x890
> <4>[  219.784155]  process_one_work+0x8ca/0x1870
> <4>[  219.784158]  worker_thread+0x82/0xb80
> <4>[  219.784162]  kthread+0x34e/0x420
> <4>[  219.784165]  ret_from_fork+0x3a/0x50
> 
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 4d54dba35302..a976cd67b3b3 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1207,8 +1207,6 @@ static void engine_record_registers(struct intel_engine_coredump *ee)
>   static void record_request(const struct i915_request *request,
>   			   struct i915_request_coredump *erq)
>   {
> -	const struct i915_gem_context *ctx;
> -
>   	erq->flags = request->fence.flags;
>   	erq->context = request->fence.context;
>   	erq->seqno = request->fence.seqno;
> @@ -1218,9 +1216,13 @@ static void record_request(const struct i915_request *request,
>   
>   	erq->pid = 0;
>   	rcu_read_lock();
> -	ctx = rcu_dereference(request->context->gem_context);
> -	if (ctx)
> -		erq->pid = pid_nr(ctx->pid);
> +	if (!intel_context_is_closed(request->context)) {
> +		const struct i915_gem_context *ctx;
> +
> +		ctx = rcu_dereference(request->context->gem_context);
> +		if (ctx)
> +			erq->pid = pid_nr(ctx->pid);
> +	}
>   	rcu_read_unlock();
>   }
>   
> 

In the client busyness series I move the GEM ctx put to free_engines_rcu,
at which point the closed check here is no longer needed (sketched
below). Should we delay that put right now to simplify things? Maybe
not; I'll remember to tweak it in my series.
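A minimal sketch of that alternative (it assumes an engines->ctx
back-reference as carried in the busyness series; apart from
free_engines_rcu() and i915_gem_context_put() the details are
illustrative, not the series as posted):

	static void free_engines_rcu(struct rcu_head *rcu)
	{
		struct i915_gem_engines *engines =
			container_of(rcu, struct i915_gem_engines, rcu);

		/* ...release the per-engine intel_contexts... */

		/*
		 * Dropping the GEM context reference only after the RCU
		 * grace period means an rcu_read_lock() reader of
		 * ce->gem_context can never observe a freed context, so
		 * record_request() would no longer need the
		 * intel_context_is_closed() check.
		 */
		i915_gem_context_put(engines->ctx);
	}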

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>

Regards,

Tvrtko

P.S. Fixes: 2e46a2a0b0149f951b63be1b5df6514676fed213 ?

