[Intel-gfx] [PATCH v2 01/14] drm/i915: Keep a global seqno per-engine

Thu Feb 16 08:10:07 UTC 2017

On 15/02/2017 21:49, Chris Wilson wrote:
> On Wed, Feb 15, 2017 at 05:05:40PM +0000, Tvrtko Ursulin wrote:
>>
>> On 14/02/2017 09:54, Chris Wilson wrote:
>>> Replace the global device seqno with one for each engine, and account
>>> for in-flight seqno on each separately. This is consistent with
>>> dma-fence as each timeline has separate fence-contexts for each engine
>>> and a seqno is only ordered within a fence-context (i.e. seqno do not
>>> need to be ordered wrt other engines, just ordered within a single
>>> engine). This is required to enable request rewinding for preemption on
>>> individual engines (we have to rewind the global seqno to avoid
>>> overflow, and we do not have to rewind all engines just to preempt one.)
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>> drivers/gpu/drm/i915/i915_debugfs.c      |  5 +--
>>> drivers/gpu/drm/i915/i915_gem_request.c  | 68 +++++++++++++++-----------------
>>> drivers/gpu/drm/i915/i915_gem_request.h  |  8 +---
>>> drivers/gpu/drm/i915/i915_gem_timeline.h |  4 +-
>>> drivers/gpu/drm/i915/intel_breadcrumbs.c | 33 +++++++---------
>>> drivers/gpu/drm/i915/intel_engine_cs.c   |  2 -
>>> drivers/gpu/drm/i915/intel_ringbuffer.h  |  4 +-
>>> 7 files changed, 52 insertions(+), 72 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index cda957c674ee..9b636962cab6 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -1080,10 +1080,7 @@ static const struct file_operations i915_error_state_fops = {
>>> static int
>>> i915_next_seqno_get(void *data, u64 *val)
>>> {
>>> -	struct drm_i915_private *dev_priv = data;
>>> -
>>> -	*val = 1 + atomic_read(&dev_priv->gt.global_timeline.seqno);
>>> -	return 0;
>>> +	return -ENODEV;
>>
>> I assume the reason for leaving this function in this state appears in a
>> later patch? gt.global_timeline stays around for something else?
>
> There's no longer a single global seqno, so we tell userspace (igt) it can't
> have it.

I missed that this is debugfs and that we even have this facility. Does the
exact errno matter here? I am thinking of just dropping the vfunc entirely
and letting the core return an error. After looking it up, it seems it
would be -EACCES.
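
To make the two options concrete, here is a rough userspace model of the
difference between keeping a getter that always fails and dropping it. The
names attr_model, attr_model_read and next_seqno_get_stub are made up for
illustration and are not the kernel's; the only behaviour assumed from the
core is the one mentioned above, that a missing getter is rejected with
-EACCES.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the attribute ops; not the kernel's struct. */
struct attr_model {
	int (*get)(void *data, uint64_t *val);
	int (*set)(void *data, uint64_t val);
};

/* Model of the read path: with no getter, the core rejects the read. */
static int attr_model_read(const struct attr_model *attr, void *data,
			   uint64_t *val)
{
	if (!attr->get)
		return -EACCES;	/* assumed core behaviour, see lead-in */
	return attr->get(data, val);
}

/* The stub from the patch: the getter exists but always fails. */
static int next_seqno_get_stub(void *data, uint64_t *val)
{
	(void)data;
	(void)val;
	return -ENODEV;
}
```

Either way userspace gets an error on read; the only visible difference in
this model is which errno it sees.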

>>> @@ -325,15 +328,19 @@ static int i915_gem_init_global_seqno(struct drm_i915_private *i915, u32 seqno)
>>> 	GEM_BUG_ON(i915->gt.active_requests > 1);
>>>
>>> 	/* If the seqno wraps around, we need to clear the breadcrumb rbtree */
>>> -	if (!i915_seqno_passed(seqno, atomic_read(&timeline->seqno))) {
>>> -		while (intel_breadcrumbs_busy(i915))
>>> -			cond_resched(); /* spin until threads are complete */
>>> -	}
>>> -	atomic_set(&timeline->seqno, seqno);
>>> +	for_each_engine(engine, i915, id) {
>>> +		struct intel_timeline *tl = &timeline->engine[id];
>>>
>>> -	/* Finally reset hw state */
>>> -	for_each_engine(engine, i915, id)
>>> +		if (!i915_seqno_passed(seqno, tl->seqno)) {
>>> +			/* spin until threads are complete */
>>> +			while (intel_breadcrumbs_busy(engine))
>>> +				cond_resched();
>>> +		}
>>> +
>>> +		/* Finally reset hw state */
>>> +		tl->seqno = seqno;
>>> 		intel_engine_init_global_seqno(engine, seqno);
>>> +	}
>>
>> Came back here a bit later. Shouldn't you just handle one engine in
>> this function if seqnos are per-engine now?
>
> No. We still have multiple engines listening to the seqno of others
> (legacy semaphores). So if we wraparound on RCS we have to idle xCS to
> be sure they complete any semaphores (semaphores check for >= value, so
> if we set future requests to be smaller, they have to wait for a long
> time before a new RCS request overtakes the semaphore value).

Ah right, forgot about semaphores.
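
For anyone else who forgets: a small sketch of why the wrap matters. It
assumes the software check is a signed 32-bit difference (which is how I
understand i915_seqno_passed to work) and models the semaphore wait as a
plain unsigned >= on the raw value, as described above. Both helper names
here are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "has seq1 reached seq2?": signed 32-bit difference. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical model of a legacy semaphore wait: raw unsigned compare. */
static bool semaphore_signalled(uint32_t current, uint32_t wait_value)
{
	return current >= wait_value;
}
```

With a waiter on 0xfffffff0 and the signalling engine wrapped to 0x10,
seqno_passed() says the request completed, but the raw compare does not,
so the waiter would stall until the seqno climbs all the way back up.
Hence idling the other engines before resetting the seqno.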

>>> 	/* We may be recursing from the signal callback of another i915 fence */
>>> 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
>>> 	request->global_seqno = seqno;
>>
>> This field could also be renamed to engine_seqno to be more
>> self-documenting.
>
> That's going to be a wide-sweeping change, let's see what it looks like,
> e.g. i915_gem_request_get_engine_seqno()
>
> On the other hand, I thought I called the timeline "[global]"

Okay, if it is too much churn, never mind then. I guess we can think of
req->global_seqno as global to the engine ourselves.

>>> It would be better for the active seqno count to be managed on the
>>> same level for readability. By that I mean having the decrement in
>>> add_request where it was incremented.
>>
>> It's incremented in this function, so the unwind on error is here as
>> well.
>
> Ah, I guess you were referring to the decrement in request_alloc. Pulled
> that out to unreserve_seqno() to match the call to reserve_seqno().

Yes, I said the wrong thing. Got confused by jumping back and forth.

Regards,

Tvrtko