[Intel-gfx] [PATCH 1/1] drm/i915: Reset request handling for gen9+
Tomas Elf
tomas.elf at intel.com
Tue Jun 16 13:15:56 PDT 2015
On 16/06/2015 18:10, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 04:39:23PM +0300, Mika Kuoppala wrote:
>> In order for skl+ hardware to guarantee that no context switch
>> takes place during reset and that current context is properly
>> saved, the driver needs to notify and query hw before commencing
>> with reset.
>>
>> We will only proceed with reset if all engines report that they
>> are ready for reset.
>>
>> As we skip the reset if any single engine reports not ready, this
>> commit prevents a system hang on skl in some situations where the
>> gpu/blitter is hanged and in such state that any write to generic
>
> s/is hanged/is wedged/ reads better
>
>> reset register (GEN6_GDRST) causes immediate system hang.
>>
>> References: https://bugs.freedesktop.org/show_bug.cgi?id=89959
>> References: https://bugs.freedesktop.org/show_bug.cgi?id=90854
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_reg.h | 3 +++
>> drivers/gpu/drm/i915/intel_uncore.c | 32 +++++++++++++++++++++++++++++++-
>> 2 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 0b979ad..3684f92 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1461,6 +1461,9 @@ enum skl_disp_power_wells {
>> #define RING_MAX_IDLE(base) ((base)+0x54)
>> #define RING_HWS_PGA(base) ((base)+0x80)
>> #define RING_HWS_PGA_GEN6(base) ((base)+0x2080)
>> +#define RING_RESET_CTL(base) ((base)+0xd0)
>> +#define RESET_CTL_REQUEST_RESET (1 << 0)
>> +#define RESET_CTL_READY_TO_RESET (1 << 1)
>>
>> #define HSW_GTT_CACHE_EN 0x4024
>> #define GTT_CACHE_EN_ALL 0xF0007FFF
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index 4a86cf0..404bce2 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1455,9 +1455,39 @@ static int gen6_do_reset(struct drm_device *dev)
>> return ret;
>> }
>>
>> +static int wait_for_bits_set(struct drm_i915_private *dev_priv,
>> + const u32 reg, const u32 mask, const int timeout)
>> +{
>> + return wait_for((I915_READ(reg) & mask) == mask, timeout);
>> +}
>> +
>> +static int gen9_do_reset(struct drm_device *dev)
>> +{
>> + struct drm_i915_private *dev_priv = dev->dev_private;
>> + struct intel_engine_cs *engine;
>> + int ret, i;
>> +
>> + for_each_ring(engine, dev_priv, i) {
>> + I915_WRITE(RING_RESET_CTL(engine->mmio_base),
>> + _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
>> +
>> + ret = wait_for_bits_set(dev_priv,
>> + RING_RESET_CTL(engine->mmio_base),
>> + RESET_CTL_READY_TO_RESET, 700);
>> + if (ret) {
>> + DRM_ERROR("%s: reset request timeout\n", engine->name);
>> + return -ENODEV;
>
> return -EIO; since the reset didn't happen due to hardware issues
> (ENODEV is that we don't have the implementation for the GPU rather than
> it failed).
>
> Do we need any recovery? Do you guarantee that the GPU reset resets the
> CTL register?
> -Chris
According to the bspec (if I remember correctly from the last time I had
to deal with it - Mika, correct me if I'm way off here):
If the reset request succeeds, the reset request bit is cleared and
ready_to_reset is set. Following the engine reset, both the ready_to_reset
and reset request bits are cleared to 0. If the reset request fails, the
reset request bit is obviously still set.
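
To make that sequence concrete, here is a minimal C sketch of the handshake
as described above. It is a software model only, not driver code: struct
engine_model, can_quiesce, request_reset(), engine_reset() and
reset_engine() are hypothetical names made up for illustration, the two bit
definitions are copied from the patch, and the bit transitions follow the
recollection of the bspec given above, which has not been verified here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as added to i915_reg.h by the patch under review. */
#define RESET_CTL_REQUEST_RESET  (1u << 0)
#define RESET_CTL_READY_TO_RESET (1u << 1)

/* Hypothetical software model of one engine's reset control register. */
struct engine_model {
	uint32_t reset_ctl;  /* modelled RING_RESET_CTL contents */
	bool can_quiesce;    /* assumption: engine is able to ack the request */
};

/* Driver side: ask the engine to prepare for reset. */
static void request_reset(struct engine_model *e)
{
	e->reset_ctl |= RESET_CTL_REQUEST_RESET;

	/*
	 * Modelled hardware response: on success the request bit is
	 * cleared and ready_to_reset is set. On failure the request
	 * bit simply stays set.
	 */
	if (e->can_quiesce) {
		e->reset_ctl &= ~RESET_CTL_REQUEST_RESET;
		e->reset_ctl |= RESET_CTL_READY_TO_RESET;
	}
}

/* Modelled engine reset: both bits end up as 0 afterwards. */
static void engine_reset(struct engine_model *e)
{
	e->reset_ctl &= ~(RESET_CTL_REQUEST_RESET | RESET_CTL_READY_TO_RESET);
}

/* Returns 0 if the engine was reset, -1 if it never reported ready. */
static int reset_engine(struct engine_model *e)
{
	request_reset(e);
	if (!(e->reset_ctl & RESET_CTL_READY_TO_RESET))
		return -1;

	engine_reset(e);
	return 0;
}
```

In this model an engine with can_quiesce set ends up with reset_ctl == 0
after reset_engine(), while one that cannot acknowledge the request is left
with only the reset request bit set, which is the state the question about
recovery is concerned with.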
Then again, all of this is assuming engine resets rather than a full GPU
reset. The bspec does not say anything about what the effect of a full
gpu reset is on the reset control registers. It's always seemed to me
like the reset control register is only relevant when doing a per-engine
reset rather than a full GPU reset but I might very well be wrong about
that, especially since you guys have seen problems when not involving
this reset handshake before doing full GPU resets.
Thanks,
Tomas