[Intel-gfx] [RFC 7/9] drm/i915: Interrupt driven fences
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Jul 27 06:20:43 PDT 2015
On 07/17/2015 03:31 PM, John.C.Harrison at Intel.com wrote:
> From: John Harrison <John.C.Harrison at Intel.com>
>
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer requiring the
> mutex lock.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
> ---
[snip]
> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> list_del_init(&request->list);
> i915_gem_request_remove_from_client(request);
>
> + /* In case the request is still in the signal pending list */
> + if (!list_empty(&request->signal_list))
> + request->cancelled = true;
> +
Another thing I did not see implemented is the sync_fence error state.
This is more about the Android part, but related to this canceled flag
so I am commenting here.
I thought when TDR kicks in and we set request->cancelled to true, there
should be a code path which somehow makes sync_fence->status negative.
As it is, because fence_signal will not be called on canceled, I thought
waiters will wait until timeout, rather than being woken up and return
error status.
For this to work you would somehow need to make sync_fence->status go
negative. With normal fence completion it goes from 1 -> 0, via the
completion callback. I did not immediately see how to make it go
negative using the existing API.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list