[Intel-gfx] [PATCH v5 1/4] drm/i915/guc: Add fetch of hwconfig table
Michal Wajdeczko
michal.wajdeczko at intel.com
Fri Feb 25 18:05:31 UTC 2022
On 25.02.2022 18:18, Tvrtko Ursulin wrote:
>
> On 25/02/2022 16:46, John Harrison wrote:
>
>>>> driver we don't care that much that we failed to load HWconfig and
>>>> 'notice' is enough.
>>>>
>>>> but I'm fine with all messages being drm_err (as we will not have to
>>>> change that once again after HWconfig will be mandatory for the driver
>>>> as well)
>>>
>>> I would be against drm_err.
>>>
>>> #define KERN_EMERG KERN_SOH "0" /* system is unusable */
>>> #define KERN_ALERT KERN_SOH "1" /* action must be taken
>>> immediately */
>>> #define KERN_CRIT KERN_SOH "2" /* critical conditions */
>>> #define KERN_ERR KERN_SOH "3" /* error conditions */
>>> #define KERN_WARNING KERN_SOH "4" /* warning conditions */
>>> #define KERN_NOTICE KERN_SOH "5" /* normal but significant
>>> condition */
>>> #define KERN_INFO KERN_SOH "6" /* informational */
>>> #define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
>>>
>>> From the point of view of the kernel driver, this is not an error to
>>> its operation. It can at most be a warning, but notice is also fine
>>> by me. One could argue when reading "normal but significant
>>> condition" that it is not normal, when it is in fact unexpected, so
>>> if people prefer warning that is also okay by me. I still lean
>>> towards notice becuase of the hands-off nature i915 has with the
>>> pass-through of this blob.
>> From the point of view of the KMD, i915 will load and be 'functional'
>> if it can't talk to the hardware at all. The UMDs won't work at all but
>
> Well this reductio ad absurdum fails I think... :)
>
>> the driver load will be 'fine'. That's a requirement to be able to get
>> the user to a software fallback desktop in order to work out why the
>> hardware isn't working (e.g. no GuC firmware file). I would view this
>> as similar. The KMD might have loaded but the UMDs are not functional.
>> That is definitely an error condition to me.
>
> ... If GuC fails to load there is no command submission and driver will
> obviously log that with drm_err.
>
> If blob fails to verify it depends on the userspace stack what will
> happen. As stated before on some platforms, and/or after a certain time,
> Mesa will not look at the blob at all. So i915 is fine (it is after all
> just a conduit for opaque data!), system overall is fine, so it
> definitely isn't a KERN_ERR level event.
>
>>>>>>> + ERR_PTR(ret));
>>>>>>> +
>>>>>>> ret = guc_enable_communication(guc);
>>>>>>> if (ret)
>>>>>>> goto err_log_capture;
>>>>>>> @@ -562,6 +567,8 @@ static void __uc_fini_hw(struct intel_uc *uc)
>>>>>>> if (intel_uc_uses_guc_submission(uc))
>>>>>>> intel_guc_submission_disable(guc);
>>>>>>> + intel_guc_hwconfig_fini(&guc->hwconfig);
>>>>>>> +
>>>>>>> __uc_sanitize(uc);
>>>>>>> }
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> b/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> index 76e590fcb903..1d31e35a5154 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> @@ -990,6 +990,7 @@ static const struct intel_device_info
>>>>>>> adl_p_info = {
>>>>>>> BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) |
>>>>>>> BIT(VCS2),
>>>>>>> .ppgtt_size = 48,
>>>>>>> .dma_mask_size = 39,
>>>>>>> + .has_guc_hwconfig = 1,
>>>>>> Who requested this change? It was previously done this way but the
>>>>>> instruction was that i915_pci.c is for hardware features only but
>>>>>> that
>>>>>> this, as you seem extremely keen about pointing out at every
>>>>>> opportunity, is a software feature.
>>>>>
>>>>> This was requested by Michal as well. I definitely agree it is a
>>>>> software feature, but I was not aware that "i915_pci.c is for hardware
>>>>> features only".
>>>>>
>>>>> Michal, do you agree with this and returning to the previous method
>>>>> for
>>>>> enabling the feature?
>>>>
>>>> now I'm little confused as some arch direction was to treat FW as
>>>> extension of the HW so for me it was natural to have 'has_guc_hwconfig'
>>>> flag in device_info
>>>>
>>>> if still for some reason it is undesired to mix HW and FW/SW flags
>>>> inside single group of flags then maybe we should just add separate
>>>> group of immutable flags where has_guc_hwconfig could be defined.
>>>>
>>>> let our maintainers decide
>>>
>>> Bah.. :)
>>>
>>> And what was the previous method?
>>>
>>> [comes back later]
>>>
>>> Okay it was:
>>>
>>> +static bool has_table(struct drm_i915_private *i915)
>>> +{
>>> + if (IS_ALDERLAKE_P(i915))
>>> + return true;
>>>
>>> Which sucks a bit if we want to argue it does not belong in device info.
>>>
>>> Why can't we ask the GuC if the blob exists? In fact what would
>>> happen if one would call __guc_action_get_hwconfig on any GuC platform?
>> That was how I originally wrote the code. However, other parties
>> refuse to allow a H2G call to fail. The underlying CTB layers complain
>> loudly on any CTB error. And the GuC architects insist that a call to
>> query the table on an unsupported platform is an error and should
>> return an 'unsupported' error code.
just for the background: IIRC the reason why arch didn't want 'query'
mode that wont fail was that HWconfig is not optional on these selected
platforms, so driver shall know up-front on which platforms it shall be
working (and driver shall even fail if blob is not available)
>
> Oh well, shrug, sounds silly but I will not pretend I am familiar with H2G
>
> In this case has_table does sound better since it indeed isn't a
> hardware feature. It is a GuC fw thing and if we don't have a way to
> probe things there at runtime, then at least limit the knowledge to GuC
> files.
note that arch expectation is that driver itself shall be using this
hwconfig (ie. will decode required data). while current approach is that
driver acts only as a proxy to UMD, this has to change and refactoring
will start (likely) soon. therefore flag has_guc_hwconfig fits better
IMHO as this wont be just 'GuC' fw thing.
Michal
>
> Regards,
>
> Tvrtko
More information about the Intel-gfx
mailing list