[Intel-gfx] [PATCH v9 1/6] drm/i915: Add per context timelines for fence objects
John Harrison
John.C.Harrison at Intel.com
Thu Jun 9 17:22:00 UTC 2016
On 07/06/2016 12:17, Maarten Lankhorst wrote:
> Op 01-06-16 om 19:07 schreef John.C.Harrison at Intel.com:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> The purpose of this patch series is to convert the request structure to
>> use fence objects for the underlying completion tracking. The fence
>> object requires a sequence number. The ultimate aim is to use the same
>> sequence number as for the request itself (or rather, to remove the
>> request's seqno field and just use the fence's value throughout the
>> driver). However, this is not currently possible and so this patch
>> introduces a separate numbering scheme as an intermediate step.
>>
>> A major advantage of using the fence object is that it can be passed
>> outside of the i915 driver and used externally. The fence API allows
>> for various operations such as combining multiple fences. This
>> requires that fence seqnos within a single fence context be guaranteed
>> in-order. The GPU scheduler that is coming can re-order request
>> execution but not within a single GPU context. Thus the fence context
>> must be tied to the i915 context (and the engine within the context as
>> each engine runs asynchronously).
>>
>> On the other hand, the driver as a whole currently only works with
>> request seqnos that are allocated from a global in-order timeline. It
>> will require a fair chunk of re-work to allow multiple independent
>> seqno timelines to be used. Hence the introduction of a temporary,
>> fence specific timeline. Once the work to update the rest of the
>> driver has been completed then the request can use the fence seqno
>> instead.
>>
>> v2: New patch in series.
>>
>> v3: Renamed/retyped timeline structure fields after review comments by
>> Tvrtko Ursulin.
>>
>> Added context information to the timeline's name string for better
>> identification in debugfs output.
>>
>> v5: Line wrapping and other white space fixes to keep style checker
>> happy.
>>
>> v7: Updated to newer nightly (lots of ring -> engine renaming).
>>
>> v8: Moved to earlier in patch series so no longer needs to remove the
>> quick hack timeline that was being added before.
>>
>> v9: Updated to another newer nightly (changes to context structure
>> naming). Also updated commit message to match previous changes.
>>
>> For: VIZ-5190
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 14 ++++++++++++
>> drivers/gpu/drm/i915/i915_gem.c | 40 +++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/i915_gem_context.c | 16 +++++++++++++
>> drivers/gpu/drm/i915/intel_lrc.c | 8 +++++++
>> 4 files changed, 78 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 2a88a46..a5f8ad8 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -831,6 +831,19 @@ struct i915_ctx_hang_stats {
>> bool banned;
>> };
>>
>> +struct i915_fence_timeline {
>> + char name[32];
>> + unsigned fence_context;
> Should be a u64 now, since commit 76bf0db5543976ef50362db7071da367cb118532
Yeah, that's newer than the tree these patches are based on. Will update
and rebase...
>> + unsigned next;
>> +
>> + struct i915_gem_context *ctx;
>> + struct intel_engine_cs *engine;
>> +};
>> +
>> +int i915_create_fence_timeline(struct drm_device *dev,
>> + struct i915_gem_context *ctx,
>> + struct intel_engine_cs *ring);
>> +
>> /* This must match up with the value previously used for execbuf2.rsvd1. */
>> #define DEFAULT_CONTEXT_HANDLE 0
>>
>> @@ -875,6 +888,7 @@ struct i915_gem_context {
>> u64 lrc_desc;
>> int pin_count;
>> bool initialised;
>> + struct i915_fence_timeline fence_timeline;
>> } engine[I915_NUM_ENGINES];
>>
>> struct list_head link;
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 5ffc6fa..57d3593 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2743,6 +2743,46 @@ void i915_gem_request_free(struct kref *req_ref)
>> kmem_cache_free(req->i915->requests, req);
>> }
>>
>> +int i915_create_fence_timeline(struct drm_device *dev,
>> + struct i915_gem_context *ctx,
>> + struct intel_engine_cs *engine)
>> +{
>> + struct i915_fence_timeline *timeline;
>> +
>> + timeline = &ctx->engine[engine->id].fence_timeline;
>> +
>> + if (timeline->engine)
>> + return 0;
> Do you ever expect a reinit?
No. Will change to a WARN_ON as per Tvrtko's comment.
>> + timeline->fence_context = fence_context_alloc(1);
>> +
>> + /*
>> + * Start the timeline from seqno 0 as this is a special value
>> + * that is reserved for invalid sync points.
>> + */
>> + timeline->next = 1;
>> + timeline->ctx = ctx;
>> + timeline->engine = engine;
>> +
>> + snprintf(timeline->name, sizeof(timeline->name), "%d>%s:%d",
>> + timeline->fence_context, engine->name, ctx->user_handle);
>> +
>> + return 0;
>> +}
>> +
> On top of the other comments, you might want to add a TODO comment that there should be only one timeline for each context,
> with each engine having only a unique fence->context.
That is not the plan. This patch implements the safest possible option
of a timeline per engine context. Chris's idea was that the timeline
should be per VM (struct i915_address_space) instead. However, that 
sounds like it would cause problems: the scheduler could reorder 
requests from multiple contexts within a single VM, causing 
out-of-order completion on a single timeline. So for the moment, I am 
leaving it as is.
>
> ~Maarten