[Intel-gfx] [PATCH] drm/i915/gem: Consolidate ctx->engines[] release
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Mar 2 14:20:33 UTC 2020
On 28/02/2020 12:19, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-02-28 12:08:18)
>>
>> On 27/02/2020 11:01, Chris Wilson wrote:
>>> +static void engines_idle_release(struct i915_gem_context *ctx,
>>> + struct i915_gem_engines *engines)
>>> +{
>>> + struct i915_gem_engines_iter it;
>>> + struct intel_context *ce;
>>> +
>>> + i915_sw_fence_init(&engines->fence, engines_notify);
>>> + INIT_LIST_HEAD(&engines->link);
>>> +
>>> + engines->ctx = i915_gem_context_get(ctx);
>>> +
>>> + for_each_gem_engine(ce, engines, it) {
>>> + struct dma_fence *fence;
>>> + int err = 0;
>>> +
>>> + /* serialises with execbuf */
>>> + RCU_INIT_POINTER(ce->gem_context, NULL);
>>
>> What is the purpose of this? Looks dodgy - like it can't really
>> guarantee much.
>
> It should be fine if we do:
>
> execbuf: context_close:
> ce->gem_context = NULL;
> add_to_timeline(); get(&ce->timeline->last_request);
> if (!ce->gem_context)
> return -ENOENT;
>
> If add is before the get, we will wait on it.
> If add is after the get, we will wait on the earlier request, and skip
> this one -- it will not execute.
What guarantees that we see the latest last_request here, or that execbuf
sees the NULL before we try the get? The rest of the code isn't assuming
an unstable ce->gem_context, I think.. engines we can change, but I don't
remember accounting for this.
>>> + if (!intel_context_pin_if_active(ce))
>>> + continue;
>>> +
>>> + fence = i915_active_fence_get(&ce->timeline->last_request);
>>> + if (fence) {
>>> + err = i915_sw_fence_await_dma_fence(&engines->fence,
>>> + fence, 0,
>>> + GFP_KERNEL);
>>> + dma_fence_put(fence);
>>> + }
>>> + intel_context_unpin(ce);
>>> + if (err < 0)
>>> + goto kill;
>>> + }
>>> +
>>> + spin_lock_irq(&ctx->stale.lock);
>>> + if (!i915_gem_context_is_closed(ctx))
>>> + list_add_tail(&engines->link, &ctx->stale.engines);
>>> + spin_unlock_irq(&ctx->stale.lock);
>>> +
>>> +kill:
>>> + if (list_empty(&engines->link)) /* raced, already closed */
>>> + kill_engines(engines);
>>
>> Raced with.. ? context_close? Can't be the fence yet, before it has been
>> committed.
>
> Yes, there's still the set_engines vs context_close to worry about.
I'd just say "raced with context_close" then.
Regards,
Tvrtko