[Intel-gfx] [PATCH 3/3] drm/i915/execlists: Read the context-status HEAD from the HWSP
Daniele Ceraolo Spurio
daniele.ceraolospurio at intel.com
Wed Jul 12 23:41:38 UTC 2017
On 12/07/17 15:58, Chris Wilson wrote:
> The engine also provides a mirror of the CSB write pointer in the
> HWSP, but not of our read pointer. To take advantage of this we need to
> remember where we read up to on the last interrupt and continue off from
> there. This poses a problem following a reset, as we don't know where
> the hw will start writing from, and due to the use of power contexts we
> cannot perform that query during the reset itself. So we continue the
> current modus operandi of delaying the first read of the context-status
> read/write pointers until after the first interrupt. With this we should
> now have eliminated all uncached mmio reads in handling the
> context-status interrupt, though we still have the uncached mmio writes
> for submitting new work, and many uncached mmio reads in the global
> interrupt handler itself. Still a step in the right direction towards
> reducing our resubmit latency, although it appears lost in the noise!
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> ---
> drivers/gpu/drm/i915/intel_lrc.c | 20 +++++++++++++++-----
> drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
> 2 files changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index e413465a552b..db750abb905e 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -562,9 +562,15 @@ static void intel_lrc_irq_handler(unsigned long data)
> * is set and we do a new loop.
> */
> __clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> - head = readl(csb_mmio);
> - tail = GEN8_CSB_WRITE_PTR(head);
> - head = GEN8_CSB_READ_PTR(head);
> + if (unlikely(engine->csb_head == -1)) { /* following a reset */
> + head = readl(csb_mmio);
> + tail = GEN8_CSB_WRITE_PTR(head);
> + head = GEN8_CSB_READ_PTR(head);
> + engine->csb_head = head;
> + } else {
> + head = engine->csb_head;
> + tail = buf[0xf];
On CNL the tail moves to offset 0x2f of the HWSP (i.e. buf[0x1f]). It
might be worth handling that now, since CNL support is being merged.
-Daniele
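
As a rough illustration of the suggestion above, the tail index could be chosen per platform rather than hard-coded. This is only a sketch: `csb_write_index()`, `csb_read_tail()` and the `is_cnl` flag are hypothetical names, not existing driver functions, and the only values taken from this thread are 0xf (used in the patch) and 0x1f (the CNL location).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: index, relative to buf, of the HWSP mirror of the
 * CSB write pointer. 0xf is the value used in the patch; 0x1f is the CNL
 * location mentioned in the review comment.
 */
static inline unsigned int csb_write_index(bool is_cnl)
{
	return is_cnl ? 0x1f : 0xf;
}

/* Read the mirrored tail from the status page instead of from mmio. */
static inline unsigned int csb_read_tail(const uint32_t *buf, bool is_cnl)
{
	return buf[csb_write_index(is_cnl)];
}
```

In the handler this would replace the literal `buf[0xf]`. How the platform check is actually expressed in the driver is left open here.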
> + }
> while (head != tail) {
> struct drm_i915_gem_request *rq;
> unsigned int status;
> @@ -618,8 +624,11 @@ static void intel_lrc_irq_handler(unsigned long data)
> !(status & GEN8_CTX_STATUS_ACTIVE_IDLE));
> }
>
> - writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
> - csb_mmio);
> + if (head != engine->csb_head) {
> + engine->csb_head = head;
> + writel(_MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, head << 8),
> + csb_mmio);
> + }
> }
>
> if (execlists_elsp_ready(engine))
> @@ -1246,6 +1255,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine)
>
> /* After a GPU reset, we may have requests to replay */
> clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
> + engine->csb_head = -1;
>
> submit = false;
> for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index d33c93444c0d..56751413e40c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -391,6 +391,7 @@ struct intel_engine_cs {
> struct rb_root execlist_queue;
> struct rb_node *execlist_first;
> unsigned int fw_domains;
> + unsigned int csb_head;
>
> /* Contexts are pinned whilst they are active on the GPU. The last
> * context executed remains active whilst the GPU is idle - the
>
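
For readers following the logic of the patch, here is a small standalone model of the cached-head scheme. It is a sketch under stated assumptions, not the driver code: `struct fake_engine`, `fake_readl()` and `process_csb()` are made-up names, the six-entry ring size and the write pointer living in bits 2:0 are assumptions, and the read pointer in bits 10:8 is inferred from the `head << 8` in the patch. The mmio write-back of the read pointer is left out.

```c
#include <stdint.h>

#define CSB_ENTRIES      6     /* assumption: six-entry status ring */
#define CSB_HEAD_UNKNOWN (~0u) /* the patch's -1 sentinel, as unsigned */

struct fake_engine {
	unsigned int csb_head;   /* read pointer remembered between interrupts */
	unsigned int mmio_reads; /* counts simulated uncached register reads */
};

/* Stand-in for readl(csb_mmio): returns the value and counts the access. */
static uint32_t fake_readl(struct fake_engine *e, uint32_t reg)
{
	e->mmio_reads++;
	return reg;
}

/*
 * Consume all pending events and return how many were seen.
 * reg models the pointer register, hwsp_tail models the HWSP mirror
 * of the write pointer.
 */
static unsigned int process_csb(struct fake_engine *e, uint32_t reg,
				unsigned int hwsp_tail)
{
	unsigned int head, tail, n = 0;

	if (e->csb_head == CSB_HEAD_UNKNOWN) {
		/* Following a reset: one uncached read to resynchronise. */
		uint32_t v = fake_readl(e, reg);

		tail = v & 0x7;        /* assumed write pointer field */
		head = (v >> 8) & 0x7; /* read pointer field, per head << 8 */
		e->csb_head = head;
	} else {
		/* Steady state: cached head, tail from the status page. */
		head = e->csb_head;
		tail = hwsp_tail;
	}

	while (head != tail) {
		if (++head == CSB_ENTRIES)
			head = 0;
		n++; /* a real handler would decode the event here */
	}

	e->csb_head = head;
	return n;
}
```

Only the first call after a reset touches the simulated register. Every later call works from the cached head and the status-page tail, which is the saving the commit message describes.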