[Intel-gfx] [PATCH v12] drm/i915: Extend LRC pinning to cover GPU context writeback
Nick Hoath
nicholas.hoath at intel.com
Tue Jan 26 01:43:42 PST 2016
On 25/01/2016 18:19, Daniel Vetter wrote:
> On Fri, Jan 22, 2016 at 02:25:27PM +0000, Nick Hoath wrote:
>> Use the first retired request on a new context to unpin
>> the old context. This ensures that the hw context remains
>> bound until it has been written back to by the GPU.
>> Now that the context is pinned until later in the request/context
>> lifecycle, it no longer needs to be pinned from context_queue to
>> retire_requests.
>> This fixes an issue with GuC submission where the GPU might not
>> have finished writing back the context before it is unpinned. This
>> results in a GPU hang.
>>
>> v2: Moved the new pin to cover GuC submission (Alex Dai)
>> Moved the new unpin to request_retire to fix coverage leak
>> v3: Added switch to default context if freeing a still pinned
>> context just in case the hw was actually still using it
>> v4: Unwrapped context unpin to allow calling without a request
>> v5: Only create a switch to idle context if the ring doesn't
>> already have a request pending on it (Alex Dai)
>> Rename unsaved to dirty to avoid double negatives (Dave Gordon)
>> Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
>> Split out per engine cleanup from context_free as it
>> was getting unwieldy
>> Corrected locking (Dave Gordon)
>> v6: Removed some bikeshedding (Mika Kuoppala)
>> Added explanation of the GuC hang that this fixes (Daniel Vetter)
>> v7: Removed extra per request pinning from ring reset code (Alex Dai)
>> Added forced ring unpin/clean in error case in context free (Alex Dai)
>> v8: Renamed lrc specific last_context to lrc_last_context as there
>> were some reset cases where the codepaths leaked (Mika Kuoppala)
>> NULL'd last_context in reset case - there was a pointer leak
>> if someone did reset->close context.
>> v9: Rebase over "Fix context/engine cleanup order"
>> v10: Rebase over nightly, remove WARN_ON which caused the
>> dependency on dev.
>> v11: Kick BAT rerun
>> v12: Rebase
>>
>> Signed-off-by: Nick Hoath <nicholas.hoath at intel.com>
>> Issue: VIZ-4277
>
> When resending patches, please include everyone who ever commented on this
> in Cc: lines here. It's for the record and helps in assigning blame when
> things inevitably blow up again ;-)
Even when it's just a resend to cause a BAT run for coverage?
> -Daniel
>
>> ---
>> drivers/gpu/drm/i915/intel_lrc.c | 37 +++++++++++++++----------------------
>> 1 file changed, 15 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index dbf3729..b469817 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -779,10 +779,10 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>> if (intel_ring_stopped(request->ring))
>> return 0;
>>
>> - if (request->ctx != ring->default_context) {
>> - if (!request->ctx->engine[ring->id].dirty) {
>> + if (request->ctx != request->ctx->i915->kernel_context) {
>> + if (!request->ctx->engine[request->ring->id].dirty) {
>> intel_lr_context_pin(request);
>> - request->ctx->engine[ring->id].dirty = true;
>> + request->ctx->engine[request->ring->id].dirty = true;
>> }
>> }
>>
>> @@ -2447,9 +2447,7 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
>> struct drm_i915_gem_object *ctx_obj,
>> struct intel_ringbuffer *ringbuf)
>> {
>> - int ret;
>> -
>> - if (ctx == ring->default_context) {
>> + if (ctx == ctx->i915->kernel_context) {
>> intel_unpin_ringbuffer_obj(ringbuf);
>> i915_gem_object_ggtt_unpin(ctx_obj);
>> }
>> @@ -2463,13 +2461,10 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
>> * otherwise create a switch to idle request
>> */
>> if (list_empty(&ring->request_list)) {
>> - int ret;
>> -
>> - ret = i915_gem_request_alloc(
>> + req = i915_gem_request_alloc(
>> ring,
>> - ring->default_context,
>> - &req);
>> - if (!ret)
>> + NULL);
>> + if (!IS_ERR(req))
>> i915_add_request(req);
>> else
>> DRM_DEBUG("Failed to ensure context saved");
>> @@ -2479,6 +2474,8 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
>> typeof(*req), list);
>> }
>> if (req) {
>> + int ret;
>> +
>> ret = i915_wait_request(req);
>> if (ret != 0) {
>> /**
>> @@ -2515,17 +2512,13 @@ void intel_lr_context_free(struct intel_context *ctx)
>> struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
>> struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;
>>
>> - if (!ctx_obj)
>> - continue;
>> -
>> - if (ctx == ctx->i915->kernel_context) {
>> - intel_unpin_ringbuffer_obj(ringbuf);
>> - i915_gem_object_ggtt_unpin(ctx_obj);
>> - }
>> + if (ctx_obj)
>> + intel_lr_context_clean_ring(
>> + ctx,
>> + ringbuf->ring,
>> + ctx_obj,
>> + ringbuf);
>>
>> - WARN_ON(ctx->engine[i].pin_count);
>> - intel_ringbuffer_free(ringbuf);
>> - drm_gem_object_unreference(&ctx_obj->base);
>> }
>> }
>>
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
More information about the Intel-gfx
mailing list