[Intel-gfx] [RFC, 1/4] drm/i915: Convert requests to use struct fence

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Tue Apr 7 04:18:33 PDT 2015


Hey,

On 07-04-15 at 12:59, John Harrison wrote:
> On 07/04/2015 10:18, Maarten Lankhorst wrote:
>> Hey,
>>
>> On 20-03-15 at 18:48, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> There is a construct in the linux kernel called 'struct fence' that is intended
>>> to keep track of work that is executed on hardware. I.e. it solves the basic
>>> problem that the driver's 'struct drm_i915_gem_request' is trying to address. The
>>> request structure does quite a lot more than simply track the execution progress
>>> so is very definitely still required. However, the basic completion status side
>>> could be updated to use the ready made fence implementation and gain all the
>>> advantages that provides.
>>>
>>> This patch makes the first step of integrating a struct fence into the request.
>>> It replaces the explicit reference count with that of the fence. It also
>>> replaces the 'is completed' test with the fence's equivalent. Currently, that
>>> simply chains on to the original request implementation. A future patch will
>>> improve this.
>>>
>>> For: VIZ-5190
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h         |   37 +++++++++------------
>>>   drivers/gpu/drm/i915/i915_gem.c         |   55 ++++++++++++++++++++++++++++---
>>>   drivers/gpu/drm/i915/intel_lrc.c        |    1 +
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c |    1 +
>>>   drivers/gpu/drm/i915/intel_ringbuffer.h |    3 ++
>>>   5 files changed, 70 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index ce3a536..7dcaf8c 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -50,6 +50,7 @@
>>>   #include <linux/intel-iommu.h>
>>>   #include <linux/kref.h>
>>>   #include <linux/pm_qos.h>
>>> +#include <linux/fence.h>
>>>  
>>>   /* General customization:
>>>    */
>>> @@ -2048,7 +2049,11 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>>>    * initial reference taken using kref_init
>>>    */
>>>   struct drm_i915_gem_request {
>>> -    struct kref ref;
>>> +    /** Underlying object for implementing the signal/wait stuff.
>>> +      * NB: Never call fence_later()! Due to lazy allocation, scheduler
>>> +      * re-ordering, pre-emption, etc., there is no guarantee at all
>>> +      * about the validity or sequentiality of the fence's seqno! */
>>> +    struct fence fence;
>> Set fence.context differently for each per context timeline. :-)
>
> Yeah, I didn't like the way the description for fence_later() says 'returns NULL if both fences are signaled' and then also returns NULL on a context mismatch. I was also not entirely sure what the fence context is meant to be for. AFAICT, the expectation is that there are only supposed to be a finite and small number of contexts, as there is no management of them. They are simply an incrementing number with no way to 'release' a previously allocated context. Whereas the i915 context is per application in an execlist enabled system, potentially even multiple contexts per application. So there is a large and unbounded number of them around. That sounds like a bad idea for the fence context implementation!
No memory is allocated for them; they're just numbers. The worst thing that can happen is an integer overflow, and if that ever happens we can bump it to int64_t. :-)

If you allocated 1000 contexts per second it would take about 50 days to hit the overflow; realistically that will never happen, so I wouldn't worry about it.
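
To illustrate (just a sketch; ctx->fence_context is a made-up field name, not something from this patch): each per-context timeline grabs its own number once at creation and passes that to fence_init() instead of the per-ring one:

    /* Hypothetical: one fence context per i915 timeline.
     * fence_context_alloc() only hands out monotonically increasing
     * numbers, so there is nothing to allocate or release later. */
    ctx->fence_context = fence_context_alloc(1);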

>
>>> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>>> +{
>>> +    WARN(true, "Is this required?");
>>> +    return true;
>>> +}
>> Yes, try calling fence_wait() on the fence. :-) This function should call irq_get and add itself to ring->irq_queue.
>> See radeon_fence_enable_signaling for an example.
>
> See patch three in the series :). The above warning should really say 'This should not be required yet.' but I didn't get around to updating it.
Okay.
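
For reference, the shape I had in mind, roughly following radeon_fence_enable_signaling (an untested sketch; i915_fence_wake and req->fence_wait are invented names here, patch 3 of the series is authoritative):

    static bool i915_gem_request_enable_signaling(struct fence *req_fence)
    {
        struct drm_i915_gem_request *req =
            container_of(req_fence, typeof(*req), fence);
        struct intel_engine_cs *ring = req->ring;

        /* Grab an irq reference so completion interrupts keep firing. */
        if (!ring->irq_get(ring))
            return false;

        /* Queue a wait entry on the ring's wait queue; its callback
         * (i915_fence_wake, hypothetical) would check the seqno and
         * call fence_signal_locked() once the request has completed.
         * enable_signaling runs with fence->lock held, so if that lock
         * is irq_queue.lock no extra locking is needed here. */
        req->fence_wait.flags = 0;
        req->fence_wait.private = NULL;
        req->fence_wait.func = i915_fence_wake;
        __add_wait_queue(&ring->irq_queue, &req->fence_wait);

        return true;
    }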

>
>>> @@ -2557,6 +2596,8 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>>>         return ret;
>>>     }
>>>
>>> +    fence_init(&request->fence, &i915_gem_request_fops, &ring->fence_lock, ring->fence_context, request->seqno);
>>> +
>>>     /*
>>>      * Reserve space in the ring buffer for all the commands required to
>>>      * eventually emit this request. This is to guarantee that the
>> Use ring->irq_queue.lock instead of making a new lock? This will make implementing enable_signaling easier too.
>
> Is that definitely safe? It won't cause conflicts or unnecessary complications? Indeed, is one supposed to play around with the implicit lock inside a wait queue?
It's your own wait queue, and it's the only way to add a waiter reliably. The spinlock's not taken unless absolutely needed.
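
Concretely, something like this instead of a dedicated ring->fence_lock (untested):

    /* Reuse the wait queue's spinlock as the fence lock; then
     * enable_signaling is entered with irq_queue.lock already held
     * and __add_wait_queue() needs no extra locking. */
    fence_init(&request->fence, &i915_gem_request_fops,
               &ring->irq_queue.lock, ring->fence_context,
               request->seqno);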

~Maarten
