[Intel-gfx] [PATCH] drm/i915: Engine relative MMIO

Fri May 17 01:25:42 UTC 2019

On 5/15/2019 01:52, Tvrtko Ursulin wrote:
>
> On 13/05/2019 20:45, John.C.Harrison at Intel.com wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> With virtual engines, it is no longer possible to know which specific
>> physical engine a given request will be executed on at the time that
>> request is generated. This means that the request itself must be engine
>> agnostic - any direct register writes must be relative to the engine
>> and not absolute addresses.
>>
>> The LRI command has support for engine relative addressing. However,
>> the mechanism is not transparent to the driver. The scheme for Gen11
>> (MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
>> absolute engine base component. The hardware then adds on the correct
>> engine offset at execution time.
>>
>> Due to the non-trivial and differing schemes on different hardware, it
>> is not possible to simply update the code that creates the LRI
>> commands to set a remap flag and let the hardware get on with it.
>> Instead, this patch adds function wrappers for generating the LRI
>> command itself and then for constructing the correct address to use
>> with the LRI.
>>
>> v2: Fix build break in GVT. Remove flags parameter [review feedback
>> from Chris W].
>>
>> v3: Fix build break in selftest. Rebase to newer base tree and fix
>> merge conflict.
>>
>> v4: More rebasing. Rmoved relative addressing support from Gen7-9 only
>> code paths [review feedback from Chris W].
>>
>> v5: More rebasing (new 'gt' directory). Fix white space issue. Use
>> COPY class rather than BCS0 id for checking against BCS engine.
>>
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>> CC: Rodrigo Vivi <rodrigo.vivi at intel.com>
>> CC: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>> CC: Wilson, Chris P <chris.p.wilson at intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_engine.h        |  4 +
>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 +++
>>   drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  6 +-
>>   drivers/gpu/drm/i915/gt/intel_lrc.c           | 79 ++++++++++---------
>>   drivers/gpu/drm/i915/gt/intel_lrc_reg.h       |  4 +-
>>   drivers/gpu/drm/i915/gt/intel_mocs.c          | 17 ++--
>>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 45 +++++++++--
>>   drivers/gpu/drm/i915/gt/intel_workarounds.c   |  4 +-
>>   .../gpu/drm/i915/gt/selftest_workarounds.c    |  9 ++-
>>   drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++-
>>   drivers/gpu/drm/i915/i915_cmd_parser.c        |  4 +-
>>   drivers/gpu/drm/i915/i915_gem_context.c       | 12 +--
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  3 +-
>>   drivers/gpu/drm/i915/i915_perf.c              | 19 +++--
>>   14 files changed, 154 insertions(+), 79 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
>> b/drivers/gpu/drm/i915/gt/intel_engine.h
>> index 9359b3a7ad9c..3506c992182c 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
>> @@ -546,4 +546,8 @@ static inline bool inject_preempt_hang(struct 
>> intel_engine_execlists *execlists)
>>     #endif
>>   +bool i915_engine_has_relative_lri(const struct intel_engine_cs 
>> *engine);
>> +u32 i915_get_lri_cmd(const struct intel_engine_cs *engine, u32 
>> word_count);
>> +u32 i915_get_lri_reg(const struct intel_engine_cs *engine, 
>> i915_reg_t reg);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> index 4c3753c1b573..233295d689d2 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>> @@ -253,6 +253,17 @@ static u32 __engine_mmio_base(struct 
>> drm_i915_private *i915,
>>       return bases[i].base;
>>   }
>>   +bool i915_engine_has_relative_lri(const struct intel_engine_cs 
>> *engine)
>> +{
>> +    if (INTEL_GEN(engine->i915) < 11)
>> +        return false;
>> +
>> +    if (engine->class == COPY_ENGINE_CLASS)
>> +        return false;
>> +
>> +    return true;
>
> I think engine->flags would be better. At least it is one conditional 
> instead of two at runtime, even one too many for something that is 
> invariant.
>
>> +}
>> +
>>   static void __sprint_engine_name(char *name, const struct 
>> engine_info *info)
>>   {
>>       WARN_ON(snprintf(name, INTEL_ENGINE_CS_MAX_NAME, "%s%u",
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
>> b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> index a34ece53a771..e7784b3fb759 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> @@ -123,9 +123,13 @@
>>    *   simply ignores the register load under certain conditions.
>>    * - One can actually load arbitrary many arbitrary registers: 
>> Simply issue x
>>    *   address/value pairs. Don't overdue it, though, x <= 2^4 must 
>> hold!
>> + * - Newer hardware supports engine relative addresses but older 
>> hardware does
>> + *   not. So never call MI_LRI directly, always use the 
>> i915_get_lri_cmd()
>> + *   and i915_get_lri_reg() helper functions.
>>    */
>> -#define MI_LOAD_REGISTER_IMM(x)    MI_INSTR(0x22, 2*(x)-1)
>> +#define __MI_LOAD_REGISTER_IMM(x)    MI_INSTR(0x22, 2*(x)-1)
>
> So we end up with code using __MI_LOAD_REGISTER_IMM for absolute, and 
> i915_get_lri_cmd for relative addressing. One is a macro, another is a 
> function call, and naming/case is different.

No. The __M_L_R_IMM version is for old code that can only run on 
pre-Gen11 devices. The helper function is for all other code. The caller 
does not know (or care) whether it should be using absolute or relative 
addressing. That is the point of the helper.

See earlier discussion about wanting to make it totally obvious that 
using the simple macro version is the wrong thing to do in new code.

> ...
>
> Then all call sites can use just this single helper and the naming 
> remains familiar.
>

I originally had a single helper to be used in all call sites. Chris 
objected violently to the idea of calling a helper in code that is 
purely pre-Gen11 and thus has no need of the helper.

On the other hand, Rodrigo agreed with you that using the helper 
everywhere was cleaner. So it is now a 2 vs 1 vote...

John.