[Intel-gfx] [PATCH v5 12/18] drm/i915/guc: Provide register list to be saved/restored during engine reset

Michel Thierry michel.thierry at intel.com
Fri Apr 14 00:30:43 UTC 2017


On 13/04/17 17:16, Daniele Ceraolo Spurio wrote:
>
>
> On 24/03/17 18:30, Michel Thierry wrote:
>> From: Arun Siluvery <arun.siluvery at linux.intel.com>
>>
>> GuC expects a list of registers from the driver which are saved/restored
>> during engine reset. The type of value to be saved is controlled by
>> flags. We provide a minimal set of registers that we want GuC to save and
>> restore. This is not an issue in case of engine reset as driver
>> initializes
>> most of them following an engine reset, but in case of media reset (aka
>> watchdog reset) which is completely internal to GuC (including
>> resubmission
>> of hung workload), it is necessary to provide this list, otherwise GuC
>> won't
>> be able to schedule further workloads after a reset. This is the minimal
>> set of registers identified for things to work as expected but if we see
>> any new issues, this register list can be expanded.
>>
>> Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
>> Signed-off-by: Jeff McGee <jeff.mcgee at intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry at intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 50
>> +++++++++++++++++++++++++++++-
>>  1 file changed, 49 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 991e76e10f82..2445af96aa71 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -1015,6 +1015,22 @@ static void guc_policies_init(struct
>> guc_policies *policies)
>>      policies->is_valid = 1;
>>  }
>>
>> +/*
>> + * In this macro it is highly unlikely to exceed max value but even
>> if we did
>> + * it is not an error so just throw a warning and continue. Only side
>> effect
>> + * in continuing further means some registers won't be added to
>> save/restore
>> + * list.
>> + */
>> +#define GUC_ADD_MMIO_REG_ADS(node, reg_addr, _flags)            \
>> +    do {                                \
>> +        u32 __count = node->number_of_registers;        \
>> +        if (WARN_ON(__count >= GUC_REGSET_MAX_REGISTERS))    \
>> +            continue;                    \
>> +        node->registers[__count].offset = reg_addr.reg;        \
>> +        node->registers[__count].flags = (_flags);        \
>> +        node->number_of_registers++;                \
>> +    } while (0)
>> +
>>  static int guc_ads_create(struct intel_guc *guc)
>>  {
>>      struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> @@ -1047,10 +1063,42 @@ static int guc_ads_create(struct intel_guc *guc)
>>
>>      /* MMIO reg state */
>>      for_each_engine(engine, dev_priv, id) {
>> +        u32 flags;
>> +        struct guc_mmio_regset *eng_reg =
>> +            &blob->reg_state.engine_reg[engine->guc_id];
>> +
>> +        /*
>> +         * Provide a list of registers to be saved/restored during gpu
>> +         * reset. This is mainly required for Media reset (aka watchdog
>> +         * timeout) which is completely under the control of GuC
>> +         * (resubmission of hung workload is handled inside GuC).
>> +         */
>> +        flags = GUC_REGSET_POWERCYCLE |
>
> This flag doesn't really do anything inside GuC, I guess it is probably
> a remnant of some functionality that has been removed.
>
>> +            GUC_REGSET_ENGINERESET |
>> +            GUC_REGSET_SAVE_CURRENT_VALUE;
>> +
>> +        GUC_ADD_MMIO_REG_ADS(eng_reg, RING_HEAD(engine->mmio_base),
>> +                     flags);
>> +
>> +        GUC_ADD_MMIO_REG_ADS(eng_reg, RING_TAIL(engine->mmio_base),
>> +                     flags);
>
> Aren't head & tail context save/restored? there should be no need to
> have GuC restore them. Also, won't restoring them manually generate
> activity on the engine?
>
>> +
>> +        GUC_ADD_MMIO_REG_ADS(eng_reg, RING_HWS_PGA(engine->mmio_base),
>> +                     flags);
>> +
>> +        GUC_ADD_MMIO_REG_ADS(eng_reg, RING_MODE_GEN7(engine),
>> +                     (flags | GUC_REGSET_MASKED));
>
> GUC_REGSET_MASKED doesn't do anything either (surprise!). Since this is

s/doesn't do anything/guc is full of bugs/  :smirk:

> actually something that is required, I've chatted with the GuC guys and
> they said they're going to look into re-adding proper functionality. In
> the meantime the suggestion is to use GUC_REGSET_SAVE_DEFAULT_VALUE, in
> which case the guc after the reset will write whatever we set in
> node->registers[__count].value and we can put an already masked value in
> there.
>
> Thanks,
> Daniele
>

Thanks, I'll add it to the changes in the next version.

>> +
>> +        GUC_ADD_MMIO_REG_ADS(eng_reg, RING_IMR(engine->mmio_base),
>> +                     flags);
>> +
>> +        DRM_DEBUG_DRIVER("%s register save/restore count: %u\n",
>> +                 engine->name, eng_reg->number_of_registers);
>> +
>> +         /* Nothing to be white listed for now. */
>>          blob->reg_state.white_list[engine->guc_id].mmio_start =
>>              engine->mmio_base + GUC_MMIO_WHITE_LIST_START;
>>
>> -        /* Nothing to be saved or restored for now. */
>>          blob->reg_state.white_list[engine->guc_id].count = 0;
>>      }
>>
>>


More information about the Intel-gfx mailing list