[Intel-gfx] [PATCH v2] drm/i915/uc: Start preparing GuC/HuC for reset
Sagar Arun Kamble
sagar.a.kamble at intel.com
Tue Feb 27 07:07:48 UTC 2018
On 2/26/2018 10:27 PM, Daniele Ceraolo Spurio wrote:
>
>
> On 25/02/18 22:17, Sagar Arun Kamble wrote:
>>
>>
>> On 2/23/2018 10:31 PM, Daniele Ceraolo Spurio wrote:
>>>
>>>
>>> On 23/02/18 06:04, Michal Wajdeczko wrote:
>>>> Right after GPU reset there will be a small window of time during
>>>> which
>>>> some of GuC/HuC fields will still show state before reset. Let's start
>>>> to fix that by sanitizing firmware status as we will use it shortly.
>>>>
>>>> v2: s/reset_prepare/prepare_to_reset (Michel)
>>>> don't forget about gem_sanitize path (Daniele)
>>>>
>>>> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>>> Cc: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>>> Cc: Michel Thierry <michel.thierry at intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_gem.c | 5 ++++-
>>>> drivers/gpu/drm/i915/intel_guc.h | 5 +++++
>>>> drivers/gpu/drm/i915/intel_huc.h | 5 +++++
>>>> drivers/gpu/drm/i915/intel_uc.c | 14 ++++++++++++++
>>>> drivers/gpu/drm/i915/intel_uc.h | 1 +
>>>> drivers/gpu/drm/i915/intel_uc_fw.h | 6 ++++++
>>>> 6 files changed, 35 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>>>> b/drivers/gpu/drm/i915/i915_gem.c
>>>> index 14c855b..ae2c4ba 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -2981,6 +2981,7 @@ int i915_gem_reset_prepare(struct
>>>> drm_i915_private *dev_priv)
>>>> }
>>>> i915_gem_revoke_fences(dev_priv);
>>>> + intel_uc_prepare_to_reset(dev_priv);
>>>> return err;
>>>> }
>>>> @@ -4881,8 +4882,10 @@ void i915_gem_sanitize(struct
>>>> drm_i915_private *i915)
>>>> * it may impact the display and we are uncertain about the
>>>> stability
>>>> * of the reset, so this could be applied to even earlier gen.
>>>> */
>>>> - if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
>>>> + if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915)) {
>>>> + intel_uc_prepare_to_reset(i915);
>>>
>>> This leaves the status with an incorrect value if we boot with
>>> i915.reset=0,
>> It depends on whether WOPCM is locked (In case of resume from S3 I
>> have seen it to be the case often).
>> Then we need not reload GuC also unless we are not doing full GPU reset.
>>> but I think this is still the right place to add this in.
>> Yes
>>> There are several things with GuC that are going to break if we use
>>> reset=0 (e.g. doorbell cleanup)
>> Can you elaborate how it might break.
>> i915 isn't currently communicating to GuC (destroy_doorbell) during
>> doorbell cleanup and if we start communicating then it should
>> not fail as GuC will be available with reset=0. Also
>> __intel_uc_reset_hw isn't gated by reset modparam.
>
> As you said we do always reset GuC no matter the value of the
> modparam, but that does not reset the doorbell HW. If we're coming out
> of S3 and the state as been preserved this will cause the doorbell HW
> to be left in an unclean state, which could cause spurious doorbell
> interrupts to be sent to GuC, not sure how the firmware handles those.
> The code as moved since last time I looked at this in detail and I
> think we're now most likely going to overwrite those unclean
> doorbells, but there are unlikely corner cases (preempt context
> failing to be created) where we might not do so.
> More generally, my concern was that in the code flow we assume GuC and
> related HW to be reset and in need of a re-init when we come out of
> suspend when actually as you reported that might not be the case if we
> have reset=0. Even if we have no major concerns now, issues might
> arise in the future after code reworks or new feature additions if we
> start from a wrong assumption. Instead of changing the flow to
> consider the reset=0 (which isn't really a supported scenario) I think
> it'd be more useful to just enforce the fact that we don't support
> that use-case with GuC, hence my suggestion. And yes, I'm probably
> just being uber-paranoid :P
>
Makes sense ..... Agree on sanitizing with GuC to now allow reset=0
We could also fix this if we could reset doorbell unit alone at resume
and acquire needed doorbells but AFAIK earlier guc_init_doorbell_hw is
the way to reset all doorbells (that needed GuC). As you said we can
skip these changes though since reset=0 isn't supported scenario.
> Daniele
>
>>> so I wouldn't consider this a regression, but we might want to start
>>> sanitizing the modparams to not allow reset=0 with GuC.
>>>
>>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>>
>>> Daniele
>>>
>>>> WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
>>>> + }
>>>> }
>>>> int i915_gem_suspend(struct drm_i915_private *dev_priv)
>>>> diff --git a/drivers/gpu/drm/i915/intel_guc.h
>>>> b/drivers/gpu/drm/i915/intel_guc.h
>>>> index 52856a9..0f6adb1 100644
>>>> --- a/drivers/gpu/drm/i915/intel_guc.h
>>>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>>>> @@ -132,4 +132,9 @@ static inline u32 guc_ggtt_offset(struct
>>>> i915_vma *vma)
>>>> struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc,
>>>> u32 size);
>>>> u32 intel_guc_wopcm_size(struct drm_i915_private *dev_priv);
>>>> +static inline void intel_guc_prepare_to_reset(struct intel_guc
>>>> *guc)
>>>> +{
>>>> + intel_uc_fw_prepare_to_reset(&guc->fw);
>>>> +}
>>>> +
>>>> #endif
>>>> diff --git a/drivers/gpu/drm/i915/intel_huc.h
>>>> b/drivers/gpu/drm/i915/intel_huc.h
>>>> index 40039db..96e24f9 100644
>>>> --- a/drivers/gpu/drm/i915/intel_huc.h
>>>> +++ b/drivers/gpu/drm/i915/intel_huc.h
>>>> @@ -38,4 +38,9 @@ struct intel_huc {
>>>> int intel_huc_init_hw(struct intel_huc *huc);
>>>> int intel_huc_auth(struct intel_huc *huc);
>>>> +static inline void intel_huc_prepare_to_reset(struct intel_huc
>>>> *huc)
>>>> +{
>>>> + intel_uc_fw_prepare_to_reset(&huc->fw);
>>>> +}
>>>> +
>>>> #endif
>>>> diff --git a/drivers/gpu/drm/i915/intel_uc.c
>>>> b/drivers/gpu/drm/i915/intel_uc.c
>>>> index 9f1bac6..8042d4b 100644
>>>> --- a/drivers/gpu/drm/i915/intel_uc.c
>>>> +++ b/drivers/gpu/drm/i915/intel_uc.c
>>>> @@ -445,3 +445,17 @@ void intel_uc_fini_hw(struct drm_i915_private
>>>> *dev_priv)
>>>> if (USES_GUC_SUBMISSION(dev_priv))
>>>> gen9_disable_guc_interrupts(dev_priv);
>>>> }
>>>> +
>>>> +void intel_uc_prepare_to_reset(struct drm_i915_private *i915)
>>>> +{
>>>> + struct intel_huc *huc = &i915->huc;
>>>> + struct intel_guc *guc = &i915->guc;
>>>> +
>>>> + if (!USES_GUC(i915))
>>>> + return;
>>>> +
>>>> + GEM_BUG_ON(!HAS_GUC(i915));
>>>> +
>>>> + intel_huc_prepare_to_reset(huc);
>>>> + intel_guc_prepare_to_reset(guc);
>>>> +}
>>>> diff --git a/drivers/gpu/drm/i915/intel_uc.h
>>>> b/drivers/gpu/drm/i915/intel_uc.h
>>>> index f2984e0..7a8ae58 100644
>>>> --- a/drivers/gpu/drm/i915/intel_uc.h
>>>> +++ b/drivers/gpu/drm/i915/intel_uc.h
>>>> @@ -39,6 +39,7 @@
>>>> void intel_uc_fini_hw(struct drm_i915_private *dev_priv);
>>>> int intel_uc_init(struct drm_i915_private *dev_priv);
>>>> void intel_uc_fini(struct drm_i915_private *dev_priv);
>>>> +void intel_uc_prepare_to_reset(struct drm_i915_private *dev_priv);
>>>> static inline bool intel_uc_is_using_guc(void)
>>>> {
>>>> diff --git a/drivers/gpu/drm/i915/intel_uc_fw.h
>>>> b/drivers/gpu/drm/i915/intel_uc_fw.h
>>>> index d5fd460..f1ee653 100644
>>>> --- a/drivers/gpu/drm/i915/intel_uc_fw.h
>>>> +++ b/drivers/gpu/drm/i915/intel_uc_fw.h
>>>> @@ -115,6 +115,12 @@ static inline bool
>>>> intel_uc_fw_is_selected(struct intel_uc_fw *uc_fw)
>>>> return uc_fw->path != NULL;
>>>> }
>>>> +static inline void intel_uc_fw_prepare_to_reset(struct
>>>> intel_uc_fw *uc_fw)
>>>> +{
>>>> + if (uc_fw->load_status == INTEL_UC_FIRMWARE_SUCCESS)
>>>> + uc_fw->load_status = INTEL_UC_FIRMWARE_PENDING;
>>>> +}
>>>> +
>>>> void intel_uc_fw_fetch(struct drm_i915_private *dev_priv,
>>>> struct intel_uc_fw *uc_fw);
>>>> int intel_uc_fw_upload(struct intel_uc_fw *uc_fw,
>>>>
>>
--
Thanks,
Sagar
More information about the Intel-gfx
mailing list