[Intel-gfx] [PATCH 2/2] drm/i915: Use rcu instead of stop_machine in set_wedged
Chris Wilson
chris at chris-wilson.co.uk
Tue Oct 10 09:21:45 UTC 2017
Quoting Daniel Vetter (2017-10-09 17:44:01)
> stop_machine is not really a locking primitive we should use, except
> when the hw folks tell us the hw is broken and that's the only way to
> work around it.
>
> This patch tries to address the locking abuse of stop_machine() from
>
> commit 20e4933c478a1ca694b38fa4ac44d99e659941f5
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date: Tue Nov 22 14:41:21 2016 +0000
>
> drm/i915: Stop the machine as we install the wedged submit_request handler
>
> Chris said part of the reason for going with stop_machine() was that
> it adds no overhead to the fast-path. But these callbacks use irqsave
> spinlocks and do a bunch of MMIO, and rcu_read_lock is _real_ fast.
>
> To stay as close as possible to the stop_machine semantics we first
> update all the submit function pointers to the nop handler, then call
> synchronize_rcu() to make sure no new requests can be submitted. This
> should give us exactly the huge barrier we want.
>
> I pondered whether we should annotate engine->submit_request as __rcu
> and use rcu_assign_pointer and rcu_dereference on it. But the reason
> behind those is to make sure the compiler/cpu barriers are there for
> when you have an actual data structure you point at, to make sure all
> the writes are seen correctly on the read side. But we just have a
> function pointer, and .text isn't changed, so no need for these
> barriers and hence no need for annotations.
>
> Unfortunately there's a complication with the call to
> intel_engine_init_global_seqno:
This is still broken in the same way, as nop_submit_request may execute
while you sleep, breaking cancel_requests.
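
For anyone following along, here is a rough userspace sketch of the
sequence under discussion. It is not the i915 code: the struct layout,
hw_submit_request(), install_nop() and wait_for_readers() are made-up
names, and a plain reader counter stands in for rcu_read_lock() and
synchronize_rcu(). It only shows the ordering: swap the function
pointer first, then wait for callers that may still be in the old
handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct request { int id; };
struct engine;
typedef void (*submit_fn)(struct engine *, struct request *);

struct engine {
	_Atomic(submit_fn) submit_request; /* handler used by submit() */
	atomic_int readers;   /* assumption: stand-in for RCU read side */
	int hw_submitted;     /* requests that reached the "hardware" */
	int nop_completed;    /* requests swallowed by the nop handler */
};

/* Hypothetical normal handler: would take locks and poke MMIO. */
static void hw_submit_request(struct engine *e, struct request *rq)
{
	(void)rq;
	e->hw_submitted++;
}

/* Handler installed on wedging: never touches the hardware. */
static void nop_submit_request(struct engine *e, struct request *rq)
{
	(void)rq;
	e->nop_completed++;
}

static void engine_init(struct engine *e)
{
	atomic_init(&e->submit_request, hw_submit_request);
	atomic_init(&e->readers, 0);
	e->hw_submitted = 0;
	e->nop_completed = 0;
}

/* Fast path: the counter brackets the call like a read-side section. */
static void submit(struct engine *e, struct request *rq)
{
	atomic_fetch_add(&e->readers, 1);      /* ~ rcu_read_lock() */
	submit_fn fn = atomic_load(&e->submit_request);
	assert(fn != NULL);
	fn(e, rq);
	atomic_fetch_sub(&e->readers, 1);      /* ~ rcu_read_unlock() */
}

/* Step 1: redirect all new submissions to the nop handler. */
static void install_nop(struct engine *e)
{
	atomic_store(&e->submit_request, nop_submit_request);
}

/* Step 2: crude grace period; spin until no caller is in submit(). */
static void wait_for_readers(struct engine *e)
{
	while (atomic_load(&e->readers) > 0)
		;
}
```

Calling submit() between install_nop() and wait_for_readers() already
goes through nop_submit_request(). That is the window referred to
above: the nop handler can run before the waiter has returned and
moved on to cancelling requests.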
-Chris