[Intel-gfx] [PATCH] drm/i915: Engine relative MMIO

Mon May 20 06:19:58 UTC 2019

On 17/05/2019 02:25, John Harrison wrote:
> On 5/15/2019 01:52, Tvrtko Ursulin wrote:
>>
>> On 13/05/2019 20:45, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> With virtual engines, it is no longer possible to know which specific
>>> physical engine a given request will be executed on at the time that
>>> request is generated. This means that the request itself must be engine
>>> agnostic - any direct register writes must be relative to the engine
>>> and not absolute addresses.
>>>
>>> The LRI command has support for engine relative addressing. However,
>>> the mechanism is not transparent to the driver. The scheme for Gen11
>>> (MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
>>> absolute engine base component. The hardware then adds on the correct
>>> engine offset at execution time.
>>>
>>> Due to the non-trivial and differing schemes on different hardware, it
>>> is not possible to simply update the code that creates the LRI
>>> commands to set a remap flag and let the hardware get on with it.
>>> Instead, this patch adds function wrappers for generating the LRI
>>> command itself and then for constructing the correct address to use
>>> with the LRI.
>>>
>>> v2: Fix build break in GVT. Remove flags parameter [review feedback
>>> from Chris W].
>>>
>>> v3: Fix build break in selftest. Rebase to newer base tree and fix
>>> merge conflict.
>>>
>>> v4: More rebasing. Rmoved relative addressing support from Gen7-9 only
>>> code paths [review feedback from Chris W].
>>>
>>> v5: More rebasing (new 'gt' directory). Fix white space issue. Use
>>> COPY class rather than BCS0 id for checking against BCS engine.
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> CC: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>> CC: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>>> CC: Wilson, Chris P <chris.p.wilson at intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_engine.h        |  4 +
>>>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     | 11 +++
>>>   drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |  6 +-
>>>   drivers/gpu/drm/i915/gt/intel_lrc.c           | 79 ++++++++++---------
>>>   drivers/gpu/drm/i915/gt/intel_lrc_reg.h       |  4 +-
>>>   drivers/gpu/drm/i915/gt/intel_mocs.c          | 17 ++--
>>>   drivers/gpu/drm/i915/gt/intel_ringbuffer.c    | 45 +++++++++--
>>>   drivers/gpu/drm/i915/gt/intel_workarounds.c   |  4 +-
>>>   .../gpu/drm/i915/gt/selftest_workarounds.c    |  9 ++-
>>>   drivers/gpu/drm/i915/gvt/mmio_context.c       | 16 +++-
>>>   drivers/gpu/drm/i915/i915_cmd_parser.c        |  4 +-
>>>   drivers/gpu/drm/i915/i915_gem_context.c       | 12 +--
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  3 +-
>>>   drivers/gpu/drm/i915/i915_perf.c              | 19 +++--
>>>   14 files changed, 154 insertions(+), 79 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h 
>>> b/drivers/gpu/drm/i915/gt/intel_engine.h
>>> index 9359b3a7ad9c..3506c992182c 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine.h
>>> @@ -546,4 +546,8 @@ static inline bool inject_preempt_hang(struct 
>>> intel_engine_execlists *execlists)
>>>     #endif
>>>   +bool i915_engine_has_relative_lri(const struct intel_engine_cs 
>>> *engine);
>>> +u32 i915_get_lri_cmd(const struct intel_engine_cs *engine, u32 
>>> word_count);
>>> +u32 i915_get_lri_reg(const struct intel_engine_cs *engine, 
>>> i915_reg_t reg);
>>> +
>>>   #endif /* _INTEL_RINGBUFFER_H_ */
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
>>> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> index 4c3753c1b573..233295d689d2 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
>>> @@ -253,6 +253,17 @@ static u32 __engine_mmio_base(struct 
>>> drm_i915_private *i915,
>>>       return bases[i].base;
>>>   }
>>>   +bool i915_engine_has_relative_lri(const struct intel_engine_cs 
>>> *engine)
>>> +{
>>> +    if (INTEL_GEN(engine->i915) < 11)
>>> +        return false;
>>> +
>>> +    if (engine->class == COPY_ENGINE_CLASS)
>>> +        return false;
>>> +
>>> +    return true;
>>
>> I think engine->flags would be better. At least it is one conditional 
>> instead of two at runtime, even one too many for something that is 
>> invariant.
>>
>>> +}
>>> +
>>>   static void __sprint_engine_name(char *name, const struct 
>>> engine_info *info)
>>>   {
>>>       WARN_ON(snprintf(name, INTEL_ENGINE_CS_MAX_NAME, "%s%u",
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
>>> b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>>> index a34ece53a771..e7784b3fb759 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>>> @@ -123,9 +123,13 @@
>>>    *   simply ignores the register load under certain conditions.
>>>    * - One can actually load arbitrary many arbitrary registers: 
>>> Simply issue x
>>>    *   address/value pairs. Don't overdue it, though, x <= 2^4 must 
>>> hold!
>>> + * - Newer hardware supports engine relative addresses but older 
>>> hardware does
>>> + *   not. So never call MI_LRI directly, always use the 
>>> i915_get_lri_cmd()
>>> + *   and i915_get_lri_reg() helper functions.
>>>    */
>>> -#define MI_LOAD_REGISTER_IMM(x)    MI_INSTR(0x22, 2*(x)-1)
>>> +#define __MI_LOAD_REGISTER_IMM(x)    MI_INSTR(0x22, 2*(x)-1)
>>
>> So we end up with code using __MI_LOAD_REGISTER_IMM for absolute, and 
>> i915_get_lri_cmd for relative addressing. One is a macro, another is a 
>> function call, and naming/case is different.
> 
> No. The __M_L_R_IMM version is for old code that can only run on 
> pre-Gen11 devices. The helper function is for all other code. The caller 
> does not know (or care) whether it should be using absolute or relative 
> addressing. That is the point of the helper.
> 
> See earlier discussion about wanting to make it totally obvious that 
> using the simple macro version is the wrong thing to do in new code.
> 
> 
>> ...
>>
>> Then all call sites can use just this single helper and the naming 
>> remains familiar.
>>
> 
> I originally had a single helper to be used in all call sites. Chris 
> objected violently to the idea of calling a helper in code that is 
> purely pre-Gen11 and thus has no need of the helper.
> 
> On the other hand, Rodrigo agreed with you that using the helper 
> everywhere was cleaner. So it is now a 2 vs 1 vote...

Maybe leaving the legacy code use the existing MI_LOAD_REGISTER_IMM 
then, and just calling the new one MI_LOAD_REGISTER_IMM_REL would be 
passable? It would satisfy my complaint that two helpers look so 
radically different. Would make Chris happy and would make the patch 
smaller I think.

Regards,

Tvrtko