[Intel-gfx] [PATCH 1/2] drm/i915/icl: Implement gen11 flush including tile cache

Mika Kuoppala mika.kuoppala at linux.intel.com
Tue Aug 13 07:57:51 UTC 2019


Chris Wilson <chris at chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2019-08-12 17:01:07)
>> Add tile cache flushing for gen11. To relive us from the
>> burden of previous obsolete workarounds, make a dedicated
>> flush/invalidate callback for gen11.
>> 
>> To fortify an independent single flush, do post
>> sync op as there are indications that without it
>> we don't flush everything. This should also make this
>> callback more readily usable in tgl (see l3 fabric flush).
>> 
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>>  drivers/gpu/drm/i915/gt/intel_lrc.c          | 63 +++++++++++++++++++-
>>  2 files changed, 63 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> index 6a0879c27d14..929a17e54f2c 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> @@ -208,6 +208,7 @@
>>  #define   DISPLAY_PLANE_A           (0<<20)
>>  #define   DISPLAY_PLANE_B           (1<<20)
>>  #define GFX_OP_PIPE_CONTROL(len)       ((0x3<<29)|(0x3<<27)|(0x2<<24)|((len)-2))
>> +#define   PIPE_CONTROL_TILE_CACHE_FLUSH                        (1<<28) /* gen11+ */
> Ack.
>
>>  #define   PIPE_CONTROL_FLUSH_L3                                (1<<27)
>>  #define   PIPE_CONTROL_GLOBAL_GTT_IVB                  (1<<24) /* gen7+ */
>>  #define   PIPE_CONTROL_MMIO_WRITE                      (1<<23)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index b97047d58d3d..891db3c933c9 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -2655,6 +2655,64 @@ static int gen8_emit_flush_render(struct i915_request *request,
>>         return 0;
>>  }
>>  
>> +static int gen11_emit_flush_render(struct i915_request *request,
>> +                                  u32 mode)
>> +{
>> +       struct intel_engine_cs *engine = request->engine;
>> +       const u32 scratch_addr =
>> +               intel_gt_scratch_offset(engine->gt,
>> +                                       INTEL_GT_SCRATCH_FIELD_RENDER_FLUSH);
>> +
>> +       if (mode & EMIT_FLUSH) {
>> +               u32 *cs;
>> +               u32 flags = 0;
>> +
>> +               flags |= PIPE_CONTROL_CS_STALL;
>> +
>> +               flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
>> +               flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
>> +               flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
>> +               flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
>> +               flags |= PIPE_CONTROL_FLUSH_ENABLE;
>> +               flags |= PIPE_CONTROL_QW_WRITE;
>> +               flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
>
> Ok.
>
>> +
>> +               cs = intel_ring_begin(request, 6);
>> +               if (IS_ERR(cs))
>> +                       return PTR_ERR(cs);
>> +
>> +               cs = gen8_emit_pipe_control(cs, flags, scratch_addr);
>> +
>
> Personally, I wouldn't leave a newline here.

Will remove. Pipe control length could be also defined but meh.

>> +               intel_ring_advance(request, cs);
>> +       }
>> +
>> +       if (mode & EMIT_INVALIDATE) {
>> +               u32 *cs;
>> +               u32 flags = 0;
>> +
>> +               flags |= PIPE_CONTROL_CS_STALL;
>> +
>> +               flags |= PIPE_CONTROL_TLB_INVALIDATE;
>> +               flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE;
>> +               flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE;
>> +               flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE;
>> +               flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE;
>> +               flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
>> +               flags |= PIPE_CONTROL_QW_WRITE;
>> +               flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
>
> Ok.
>
>> +
>> +               cs = intel_ring_begin(request, 6);
>> +               if (IS_ERR(cs))
>> +                       return PTR_ERR(cs);
>> +
>> +               cs = gen8_emit_pipe_control(cs, flags, scratch_addr);
>> +
>> +               intel_ring_advance(request, cs);
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>>  /*
>>   * Reserve space for 2 NOOPs at the end of each request to be
>>   * used as a workaround for not being allowed to do lite
>> @@ -2829,7 +2887,10 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
>>         logical_ring_default_irqs(engine);
>>  
>>         if (engine->class == RENDER_CLASS) {
>> -               engine->emit_flush = gen8_emit_flush_render;
>> +               if (INTEL_GEN(engine->i915) >= 11)
>> +                       engine->emit_flush = gen11_emit_flush_render;
>> +               else
>> +                       engine->emit_flush = gen8_emit_flush_render;
>>                 engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
>
> No fini breadcrumb flush?

Well there are room for cleaning there for sure. I admit to being
a bit coward. I don't have an icelake and changing that would
warrant quite amount of gem_concurrent_blt cycles.

If you can convince me that ci/bat will catch my mistakes,
I could reconsider.

-Mika



More information about the Intel-gfx mailing list