[Intel-gfx] [PATCH v3] drm/i915: Do not call API requiring struct_mutex where it is not available
Chris Wilson
chris at chris-wilson.co.uk
Wed Jan 13 11:34:54 PST 2016
On Wed, Jan 13, 2016 at 04:16:21PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>
> LRC code was calling GEM API like i915_gem_obj_ggtt_offset from
> places where the struct_mutex cannot be grabbed (irq handlers).
>
> To avoid that this patch caches some interesting bits and values
> in the engine and context structures.
>
> Some usages are also removed where they are not needed like a
> few asserts which are either impossible or have been checked
> already during engine initialization.
>
> Side benefit is also that interrupt handlers and command
> submission stop evaluating invariant conditionals, like what
> Gen we are running on, on every interrupt and every command
> submitted.
>
> This patch deals with logical ring context id and descriptors
> while subsequent patches will deal with the remaining issues.
>
> v2:
> * Cache the VMA instead of the address. (Chris Wilson)
> * Incorporate Dave Gordon's good comments and function name.
>
> v3:
> * Extract ctx descriptor template to a function and group
> functions dealing with ctx descriptor & co together near
> top of the file. (Dave Gordon)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> Cc: Dave Gordon <david.s.gordon at intel.com>
Considering the current state of affairs, this is not bad. I have a
different shade of slightly-off-white that I like but that applies to a
world where execlists is more request focused, e.g.:
static u32 execlists_request_write_tail(struct drm_i915_gem_request *req)
{
struct intel_ring *ring = req->ring;
struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt;
if (ppgtt && !USES_FULL_48BIT_PPGTT(req->i915)) {
/* True 32b PPGTT with dynamic page allocation: update PDP
* registers and point the unallocated PDPs to scratch page.
* PML4 is allocated during ppgtt init, so this is not needed
* in 48-bit mode.
*/
if (ppgtt->pd_dirty_rings & intel_engine_flag(req->engine)) {
ASSIGN_CTX_PDP(ppgtt, ring->registers, 3);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 2);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 1);
ASSIGN_CTX_PDP(ppgtt, ring->registers, 0);
ppgtt->pd_dirty_rings &= ~intel_engine_flag(req->engine);
}
}
ring->registers[CTX_RING_TAIL+1] = req->tail;
return ring->context_descriptor;
}
where the req->ring is much more readily available than
struct intel_context_engine *ce = &req->ctx->engine[req->engine->id];
though a req->context_engine would shut me up entirely.
> +/**
> + * intel_execlists_ctx_id() - get the Execlists Context ID
> + * @ctx: Context to get the ID for
> + * @ring: Engine to get the ID for
> + *
> + * Do not confuse with ctx->id! Unfortunately we have a name overload
> + * here: the old context ID we pass to userspace as a handler so that
> + * they can refer to a context, and the new context ID we pass to the
> + * ELSP so that the GPU can inform us of the context status via
> + * interrupts.
> + *
> + * The context ID is a portion of the context descriptor, so we can
> + * just extract the required part from the cached descriptor.
> + *
> + * Return: 20-bits globally unique context ID.
> + */
> +u32 intel_execlists_ctx_id(struct intel_context *ctx,
> + struct intel_engine_cs *ring)
I would suggest intel_execlists_status_tag (or at least offer it as a
starting point for a non-conflicting name).
> +{
> + return intel_lr_context_descriptor(ctx, ring) >> GEN8_CTX_ID_SHIFT;
> }
Next up on the hit list is struct intel_context_engine!
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 7349d9258191..85ce2272f92c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -269,6 +269,8 @@ struct intel_engine_cs {
> struct list_head execlist_queue;
> struct list_head execlist_retired_req_list;
> u8 next_context_status_buffer;
> + bool disable_lite_restore_wa;
The holes, the holes!
> + u32 ctx_desc_template;
I would suggest this is better named something like
execlist_context_base_descriptor since we already have been using the
execlist_ prefix to distinguish lrc specific state, and being verbose
doesn't really hurt when we only use it twice.
Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list