[Intel-gfx] [PATCH 2/3] drm/i915/execlists: Read the context-status buffer from the HWSP

Michel Thierry michel.thierry at intel.com
Thu Jul 13 00:40:32 UTC 2017


On 7/12/2017 3:58 PM, Chris Wilson wrote:
> The engine provides a mirror of the CSB in the HWSP. If we use the
> cacheable reads from the HWSP, we can shave off a few mmio reads per
> context-switch interrupt (which are quite frequent!). Just removing a
> couple of mmio reads is not enough to actually reduce any latency, but it
> does give a small reduction in overall cpu usage.
> 
> Much appreciation for Ben dropping the bombshell that the CSB was in the
> HWSP and for Michel in digging out the details.
> 
> Suggested-by: Ben Widawsky <benjamin.widawsky at intel.com>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 9 ++++-----
>   1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9d231d0e427d..e413465a552b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -547,8 +547,8 @@ static void intel_lrc_irq_handler(unsigned long data)
>   	while (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
>   		u32 __iomem *csb_mmio =
>   			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_PTR(engine));
> -		u32 __iomem *buf =
> -			dev_priv->regs + i915_mmio_reg_offset(RING_CONTEXT_STATUS_BUF_LO(engine, 0));
> +		/* The HWSP contains a (cacheable) mirror of the CSB */
> +		u32 *buf = &engine->status_page.page_addr[0x10];

I would add the dword offset as a define (and next to the other ones we 
already have for the HWSP). It'll also help the next patch, which reads 
the head, i.e.:

@@ -500,4 +500,5 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
   *
   * The area from dword 0x30 to 0x3ff is available for driver usage.
   */
+#define I915_GEM_HWS_CSB_START         0x10
  #define I915_GEM_HWS_INDEX             0x30
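
As a rough, self-contained sketch of how the define would be used (this is
not the driver code: the struct and function names below are hypothetical,
the page is a plain array rather than the real HWSP, and memory-ordering
concerns are ignored), each CSB event is two dwords, status first and
context id second, so with dwords 0x10-0x1b reserved there is room for six
entries:

```c
#include <assert.h>
#include <stdint.h>

/* Offsets are in dwords, matching the patch and the proposed define. */
#define I915_GEM_HWS_CSB_START 0x10
#define CSB_ENTRIES 6     /* dwords 0x10-0x1b: six two-dword entries */
#define HWSP_DWORDS 0x400 /* dwords 0x000-0x3ff, i.e. one 4 KiB page */

/* Hypothetical stand-in for one CSB event. */
struct csb_entry {
	uint32_t status;
	uint32_t context_id;
};

/* Read entry `head` from a plain-memory copy of the status page. */
static struct csb_entry csb_read(const uint32_t *page_addr, unsigned int head)
{
	const uint32_t *buf = &page_addr[I915_GEM_HWS_CSB_START];
	struct csb_entry e;

	assert(head < CSB_ENTRIES);
	e.status = buf[2 * head];         /* was: readl(buf + 2 * head) */
	e.context_id = buf[2 * head + 1]; /* was: readl(buf + 2 * head + 1) */
	return e;
}
```

The point of the named offset is only that the 0x10 no longer appears as a
magic number at each place that indexes into the status page.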

And surprisingly, there's already an old comment about these dwords' 
reserved meaning (so they have just been reused):

  * The following dwords have a reserved meaning:
  * 0x00: ISR copy, updated when an ISR bit not set in the HWSTAM
...
<< * 0x10-0x1b: Context status DWords (GM45)  >>
<< * 0x1f: Last written status offset. (GM45) >>
    * 0x20-0x2f: Reserved (Gen6+)

But yes, I can confirm it works on SKL too.

Acked-by: Michel Thierry <michel.thierry at intel.com>

>   		unsigned int head, tail;
>   
>   		/* The write will be ordered by the uncached read (itself
> @@ -590,13 +590,12 @@ static void intel_lrc_irq_handler(unsigned long data)
>   			 * status notifier.
>   			 */
>   
> -			status = readl(buf + 2 * head);
> +			status = buf[2 * head];
>   			if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
>   				continue;
>   
>   			/* Check the context/desc id for this event matches */
> -			GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
> -					 port->context_id);
> +			GEM_DEBUG_BUG_ON(buf[2 * head + 1] != port->context_id);
>   
>   			rq = port_unpack(port, &count);
>   			GEM_BUG_ON(count == 0);
> 

