[Intel-gfx] [RFC 7/9] drm/i915: Interrupt driven fences

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Mon Jul 20 02:09:04 PDT 2015


Op 17-07-15 om 16:31 schreef John.C.Harrison at Intel.com:
> From: John Harrison <John.C.Harrison at Intel.com>
>
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
This race will happen on any enable_signaling implementation, just something to be aware of.
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.

I notice in this commit you only clean up requests when the refcount drops to 0.
What resources need to be kept after the fence is signaled? Can't you queue the
delayed work from the function that signals the fence and make sure it holds a
refcount?

I'm curious because if userspace can grab a reference for a fence it might lead
to not freeing resources for a long time..

~Maarten



More information about the Intel-gfx mailing list