[Intel-gfx] [PATCH 17/50] drm/i915: Switch back to an array of logical per-engine HW contexts
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Apr 12 14:37:22 UTC 2019
On 12/04/2019 14:43, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-12 14:31:31)
>>
>> On 12/04/2019 09:53, Chris Wilson wrote:
>>> @@ -875,22 +927,27 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>>> i915_active_init(i915, &cb->base, cb_retire);
>>> i915_active_acquire(&cb->base);
>>>
>>> - rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
>>> - struct intel_engine_cs *engine = ce->engine;
>>> + for_each_gem_engine(it, ctx) {
>>> + struct intel_context *ce = it.context;
>>> struct i915_request *rq;
>>>
>>> - if (!(engine->mask & engines))
>>> + if (!(ce->engine->mask & engines))
>>> + continue;
>>> +
>>> + if (!intel_context_is_pinned(ce))
>>> continue;
>>>
>>> if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
>>> - engine->mask)) {
>>> + ce->engine->mask)) {
>>> err = -ENXIO;
>>> + i915_gem_engines_iter_fini(&it);
>>
>> This would be our only iterator which needs manual cleanup, and only on
>> abort, not on normal termination.
>>
>> Would it be cleaner if for_each_gem_engine took the engines list, which
>> the caller would have to obtain beforehand via
>> i915_gem_context_engine_list_lock? The caller would then obviously also
>> be responsible for unlocking.
>
> But then it's just a for (i = 0; i < engines->num_engines; i++) again.
> I'm not really convinced that if the caller has to do everything anyway,
> replacing the for() is much of a gain :)
>
>>> break;
>>> }
>>>
>>> - rq = i915_request_alloc(engine, ctx);
>>> + rq = intel_context_create_request(ce);
>>> if (IS_ERR(rq)) {
>>> err = PTR_ERR(rq);
>>> + i915_gem_engines_iter_fini(&it);
>>> break;
>>> }
>>>
>>> @@ -901,8 +958,10 @@ static int context_barrier_task(struct i915_gem_context *ctx,
>>> err = i915_active_ref(&cb->base, rq->fence.context, rq);
>>>
>>> i915_request_add(rq);
>>> - if (err)
>>> + if (err) {
>>> + i915_gem_engines_iter_fini(&it);
>>> break;
>>> + }
>>> }
>>>
>>> cb->task = err ? NULL : task; /* caller needs to unwind instead */
>>> @@ -1739,6 +1798,43 @@ int __i915_gem_context_pin_hw_id(struct i915_gem_context *ctx)
>>> return err;
>>> }
>>>
>>> +/* GEM context-engines iterator: for_each_gem_engine() */
>>> +bool i915_gem_engines_iter_next(struct i915_gem_engines_iter *it)
>>> +{
>>> + if (!it->engines)
>>> + goto end;
>>
>> When can engines be NULL? Should it be a GEM_BUG_ON, if it can only
>> happen when used outside of for_each_gem_engine?
>
> Hmm, it can't. I seem to have been unusually defensive!
>
>>> + do {
>>> + if (it->idx >= it->engines->num_engines)
>>> + goto end;
>>> +
>>> + it->context = it->engines->engines[it->idx++];
>>> + } while (!it->context);
>>> +
>>> + return true;
>>> +
>>> +end:
>>> + i915_gem_engines_iter_fini(it);
>>> + return false;
>>> +}
>>> +
>>> +void i915_gem_engines_iter_init(struct i915_gem_engines_iter *it,
>>> + struct i915_gem_context *ctx)
>>> +{
>>> + it->gem_context = ctx;
>>> + it->engines = i915_gem_context_engine_list_lock(ctx);
>>> + it->idx = 0;
>>
>> Clear it->context just in case?
>>
>>> +}
>>> +
>>> +void i915_gem_engines_iter_fini(struct i915_gem_engines_iter *it)
>>> +{
>>> + struct i915_gem_context *ctx;
>>> +
>>> + ctx = fetch_and_zero(&it->gem_context);
>>
>> So iter_fini can be called multiple times? When does that come into
>> play? Oh... when i915_gem_engines_iter_next has gone past the array and
>> the loop decides to bail out.
>
> Yeah, I was just worrying about the mix of the user calling
> iter_fini();
> continue;
>
>> If iterator took engines as parameter this could also go.
>
> Equally if there was no iterator :)
>
> So is the iterator worth it? If I have to do the locking and list management
> in the caller, I would say no. Yay or nay?
I think yes. My arguments are:
1. Removes the need for every loop to add a skip on !ce (see the sketch
after this list).
2. Removes mixing up two different concepts - device and context engines.
3. With the virtual engine added, for_each_ctx_engine also walks over
that one. I don't think we have a use for this yet though.
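To illustrate point 1, this is roughly the open-coded pattern every
caller has to repeat without any iterator (just a sketch; the unlock
helper is my assumption, mirroring the lock side of your patch):

	struct i915_gem_engines *engines;
	struct intel_context *ce;
	unsigned int i;

	engines = i915_gem_context_engine_list_lock(ctx);
	for (i = 0; i < engines->num_engines; i++) {
		ce = engines->engines[i];
		if (!ce) /* unused slot in the engine map */
			continue;

		/* ... do something with ce ... */
	}
	i915_gem_context_engine_list_unlock(ctx); /* assumed counterpart */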
I am just not sure if I like the one in your patch or a simpler one
which leaves obtaining/releasing the engine list to the caller. I am
leaning towards the latter: even though it is extra work for the caller,
so is managing the iter_fini requirement, so the two end up almost equal
in that respect, but the simpler version is more explicit and therefore
harder to mess up. Roughly what I have in mind is sketched below.
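Untested sketch with hypothetical signatures - the iterator only walks a
list the caller has already locked, so both normal and early termination
need no iter_fini:

	/* Sketch: caller-locked variant of the GEM engines iterator. */
	struct i915_gem_engines_iter {
		unsigned int idx;
		const struct i915_gem_engines *engines;
	};

	static inline void
	i915_gem_engines_iter_init(struct i915_gem_engines_iter *it,
				   const struct i915_gem_engines *engines)
	{
		it->engines = engines;
		it->idx = 0;
	}

	static inline struct intel_context *
	i915_gem_engines_iter_next(struct i915_gem_engines_iter *it)
	{
		while (it->idx < it->engines->num_engines) {
			struct intel_context *ce =
				it->engines->engines[it->idx++];

			if (ce) /* callers no longer need the !ce skip */
				return ce;
		}

		return NULL; /* ends the loop, nothing to clean up */
	}

	#define for_each_gem_engine(ce, engines, it) \
		for (i915_gem_engines_iter_init(&(it), (engines)); \
		     ((ce) = i915_gem_engines_iter_next(&(it)));)

Locking then lives entirely at the call site:

	engines = i915_gem_context_engine_list_lock(ctx);
	for_each_gem_engine(ce, engines, it) {
		/* an early break needs no special cleanup */
	}
	i915_gem_context_engine_list_unlock(ctx); /* assumed counterpart */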
Regards,
Tvrtko