[Intel-gfx] [PATCH] drm/i915: Mark up racy read of rq->engine
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Apr 23 14:53:44 UTC 2020
On 23/04/2020 12:58, Chris Wilson wrote:
> As the i915_request.engine may be updated by a virtual engine to either
> point to the virtual engine or the real physical engine on submission,
> we have to be wary that the engine pointer may change.
>
> [ 213.317076] BUG: KCSAN: data-race in execlists_dequeue [i915] / i915_request_wait [i915]
> [ 213.317097]
> [ 213.317110] write (marked) to 0xffff8881e8647650 of 8 bytes by interrupt on cpu 2:
> [ 213.317386] execlists_dequeue+0x43b/0x1670 [i915]
> [ 213.317645] __execlists_submission_tasklet+0x48/0x60 [i915]
> [ 213.317905] execlists_submission_tasklet+0xd3/0x170 [i915]
> [ 213.317926] tasklet_action_common.isra.0+0x42/0x90
> [ 213.317943] __do_softirq+0xc8/0x206
> [ 213.317958] irq_exit+0xcd/0xe0
> [ 213.317980] irq_work_interrupt+0xf/0x20
> [ 213.317999] __tsan_read8+0x30/0x100
> [ 213.318272] retire_requests+0xdd/0xf0 [i915]
> [ 213.318502] engine_retire+0xa6/0xe0 [i915]
> [ 213.318519] process_one_work+0x3af/0x640
> [ 213.318534] worker_thread+0x80/0x670
> [ 213.318548] kthread+0x19a/0x1e0
> [ 213.318566] ret_from_fork+0x1f/0x30
> [ 213.318584]
> [ 213.318595] read to 0xffff8881e8647650 of 8 bytes by task 458 on cpu 1:
> [ 213.318847] i915_request_wait+0x3e3/0x510 [i915]
> [ 213.319088] i915_gem_object_wait_fence+0x81/0xa0 [i915]
> [ 213.319328] i915_gem_object_wait+0x26b/0x560 [i915]
> [ 213.319578] i915_gem_wait_ioctl+0x141/0x290 [i915]
> [ 213.319597] drm_ioctl_kernel+0xe9/0x130
> [ 213.319613] drm_ioctl+0x27d/0x45e
> [ 213.319628] ksys_ioctl+0x89/0xb0
> [ 213.319648] __x64_sys_ioctl+0x42/0x60
> [ 213.319666] do_syscall_64+0x6e/0x2c0
> [ 213.319680] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> In this case, we are merely trying to flush the most recent engine
> associated with the request, and do not care which as the process of
> chaing the engine is done by a tasklet, with which we are yielding to.
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> ---
> drivers/gpu/drm/i915/i915_request.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 22635bbabf06..e9fd20242438 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1660,7 +1660,7 @@ long i915_request_wait(struct i915_request *rq,
> break;
> }
>
> - intel_engine_flush_submission(rq->engine);
> + intel_engine_flush_submission(READ_ONCE(rq->engine));
>
> if (signal_pending_state(state, current)) {
> timeout = -ERESTARTSYS;
>
What with the mutex_acquire/release in this case? No practical effect
but they are also dereferencing rq->engine... Take a copy of engine for
lockdep at start and another read for engine flushing in the loop?
Regards,
Tvrtko
More information about the Intel-gfx
mailing list