[Intel-gfx] [PATCH 0/1] GuC submission vs request signaling race
John.C.Harrison at Intel.com
Tue Nov 28 09:10:58 UTC 2017
From: John Harrison <John.C.Harrison at Intel.com>
Back in the days of 4.11, there was a nested_enable_signaling()
function as part of the GuC submission path. It contained a BUG_ON to
ensure that the request being processed had not already been signaled.
However, a race condition could cause that BUG_ON to be hit.
We have a customer that is currently locked in to the 4.11 tree and
therefore requires a fix for this issue that does not involve updating
to the latest upstream kernel (which has re-written this code).
The race is that the nested_enable_signaling() call is made after the
submission to the GuC within i915_guc_dequeue(). That allows a very
fast request to finish before the call to nested_enable_signaling() is
made. The request has then already completed and been signaled, so the
BUG_ON fires.
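The ordering problem described above can be sketched with a small toy model. This is only an illustration: the toy_* names and the struct are hypothetical stand-ins, not the real i915 code, and the "fast request" is modelled as one that completes the instant it is submitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the real i915 structure. */
struct toy_request {
    bool signaled;          /* set once the request has completed */
    bool signaling_enabled; /* set once signaling has been enabled */
};

/* Models submitting a very fast request: it completes immediately. */
static void toy_submit_fast(struct toy_request *rq)
{
    assert(rq != NULL);
    rq->signaled = true;
}

/*
 * Models the check described above. Returns false in the situation
 * where the real code would hit its BUG_ON (request already signaled).
 */
static bool toy_enable_signaling(struct toy_request *rq)
{
    assert(rq != NULL);
    if (rq->signaled)
        return false;
    rq->signaling_enabled = true;
    return true;
}

/* The racy ordering: submit first, enable signaling afterwards. */
static bool toy_submit_then_enable(void)
{
    struct toy_request rq = { false, false };

    toy_submit_fast(&rq);
    return toy_enable_signaling(&rq);
}
```

In this model toy_submit_then_enable() reports failure, which corresponds to the BUG_ON firing. In the real driver the outcome depends on timing, so the failure is intermittent rather than guaranteed.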
The following patch works around the issue by reversing the order of
the events such that nested_enable_signaling() is called immediately
before i915_guc_submit() instead of some time after. This guarantees that the
request cannot complete before signaling has been enabled. However, it
also means that signaling is now enabled for all requests. Previously,
it was only done for the last request in a coalesced group. That is,
multiple requests for the same context could be queued up back to back
and only the last would be enabled for signaling.
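As a rough sketch of the trade-off, the toy model below compares the two orderings for a coalesced group of requests. The toy_* names are hypothetical and heavily simplified, and every request is assumed to complete the moment it is submitted (the worst case for the race). It counts how many requests get signaling enabled and how often the already-signaled check would trip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real i915 structures. */
struct toy_request {
    bool signaled;
    bool signaling_enabled;
};

struct toy_result {
    int enabled;  /* requests that had signaling enabled */
    int bug_hits; /* times the "already signaled" check would trip */
};

static void toy_enable(struct toy_request *rq, struct toy_result *res)
{
    if (rq->signaled) {
        res->bug_hits++; /* the real code would hit BUG_ON here */
        return;
    }
    rq->signaling_enabled = true;
    res->enabled++;
}

/* Worst case: the request completes as soon as it is submitted. */
static void toy_submit(struct toy_request *rq)
{
    rq->signaled = true;
}

/* Old ordering: submit the group, then enable only the last request. */
static struct toy_result toy_old_order(struct toy_request *rqs, size_t n)
{
    struct toy_result res = { 0, 0 };

    assert(rqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        toy_submit(&rqs[i]);
    if (n > 0)
        toy_enable(&rqs[n - 1], &res);
    return res;
}

/* New ordering: enable signaling on each request just before submit. */
static struct toy_result toy_new_order(struct toy_request *rqs, size_t n)
{
    struct toy_result res = { 0, 0 };

    assert(rqs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        toy_enable(&rqs[i], &res);
        toy_submit(&rqs[i]);
    }
    return res;
}
```

For a group of three requests, the old ordering trips the check once and enables nothing, while the new ordering never trips it but enables signaling three times instead of once. That extra signal processing is the cost of the workaround.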
My understanding is that the omission of enabling signaling for
coalesced requests is simply a performance optimisation, and that
losing it will have no detrimental side effects other than the extra
signal processing. That is, I believe the change is safe, whereas
anything more involved that both fixes the race and keeps the
optimisation would be a much bigger and therefore riskier
change. However, others have expressed a view that this is a very
complex area of code and there may be unforeseen consequences of this
change. Hence the post here to ask for feedback.
Thanks.
John Harrison (1):
drm/i915: Fix for nested_enable_signaling BUG_ON
drivers/gpu/drm/i915/i915_guc_submission.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--
2.13.0