[Intel-gfx] [PATCH v5 1/4] drm/i915/guc: Add fetch of hwconfig table

Michal Wajdeczko michal.wajdeczko at intel.com
Fri Feb 25 18:05:31 UTC 2022



On 25.02.2022 18:18, Tvrtko Ursulin wrote:
> 
> On 25/02/2022 16:46, John Harrison wrote:
> 
>>>> driver we don't care that much that we failed to load HWconfig and
>>>> 'notice' is enough.
>>>>
>>>> but I'm fine with all messages being drm_err (as we will not have to
>>>> change that once again after HWconfig will be mandatory for the driver
>>>> as well)
>>>
>>> I would be against drm_err.
>>>
>>> #define KERN_EMERG      KERN_SOH "0"    /* system is unusable */
>>> #define KERN_ALERT      KERN_SOH "1"    /* action must be taken
>>> immediately */
>>> #define KERN_CRIT       KERN_SOH "2"    /* critical conditions */
>>> #define KERN_ERR        KERN_SOH "3"    /* error conditions */
>>> #define KERN_WARNING    KERN_SOH "4"    /* warning conditions */
>>> #define KERN_NOTICE     KERN_SOH "5"    /* normal but significant
>>> condition */
>>> #define KERN_INFO       KERN_SOH "6"    /* informational */
>>> #define KERN_DEBUG      KERN_SOH "7"    /* debug-level messages */
>>>
>>> From the point of view of the kernel driver, this is not an error to
>>> its operation. It can at most be a warning, but notice is also fine
>>> by me. One could argue when reading "normal but significant
>>> condition" that it is not normal, when it is in fact unexpected, so
>>> if people prefer warning that is also okay by me. I still lean
>>> towards notice becuase of the hands-off nature i915 has with the
>>> pass-through of this blob.
>>  From the point of view of the KMD, i915 will load and be 'functional'
>> if it can't talk to the hardware at all. The UMDs won't work at all but 
> 
> Well this reductio ad absurdum fails I think... :)
> 
>> the driver load will be 'fine'. That's a requirement to be able to get
>> the user to a software fallback desktop in order to work out why the
>> hardware isn't working (e.g. no GuC firmware file). I would view this
>> as similar. The KMD might have loaded but the UMDs are not functional.
>> That is definitely an error condition to me.
> 
> ... If GuC fails to load there is no command submission and driver will
> obviously log that with drm_err.
> 
> If blob fails to verify it depends on the userspace stack what will
> happen. As stated before on some platforms, and/or after a certain time,
> Mesa will not look at the blob at all. So i915 is fine (it is after all
> just a conduit for opaque data!), system overall is fine, so it
> definitely isn't a KERN_ERR level event.
> 
>>>>>>> +               ERR_PTR(ret));
>>>>>>> +
>>>>>>>        ret = guc_enable_communication(guc);
>>>>>>>        if (ret)
>>>>>>>            goto err_log_capture;
>>>>>>> @@ -562,6 +567,8 @@ static void __uc_fini_hw(struct intel_uc *uc)
>>>>>>>        if (intel_uc_uses_guc_submission(uc))
>>>>>>>            intel_guc_submission_disable(guc);
>>>>>>>    +    intel_guc_hwconfig_fini(&guc->hwconfig);
>>>>>>> +
>>>>>>>        __uc_sanitize(uc);
>>>>>>>    }
>>>>>>>    diff --git a/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> b/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> index 76e590fcb903..1d31e35a5154 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>>>>> @@ -990,6 +990,7 @@ static const struct intel_device_info
>>>>>>> adl_p_info = {
>>>>>>>            BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) |
>>>>>>> BIT(VCS2),
>>>>>>>        .ppgtt_size = 48,
>>>>>>>        .dma_mask_size = 39,
>>>>>>> +    .has_guc_hwconfig = 1,
>>>>>> Who requested this change? It was previously done this way but the
>>>>>> instruction was that i915_pci.c is for hardware features only but
>>>>>> that
>>>>>> this, as you seem extremely keen about pointing out at every
>>>>>> opportunity, is a software feature.
>>>>>
>>>>> This was requested by Michal as well. I definitely agree it is a
>>>>> software feature, but I was not aware that "i915_pci.c is for hardware
>>>>> features only".
>>>>>
>>>>> Michal, do you agree with this and returning to the previous method
>>>>> for
>>>>> enabling the feature?
>>>>
>>>> now I'm little confused as some arch direction was to treat FW as
>>>> extension of the HW so for me it was natural to have 'has_guc_hwconfig'
>>>> flag in device_info
>>>>
>>>> if still for some reason it is undesired to mix HW and FW/SW flags
>>>> inside single group of flags then maybe we should just add separate
>>>> group of immutable flags where has_guc_hwconfig could be defined.
>>>>
>>>> let our maintainers decide
>>>
>>> Bah.. :)
>>>
>>> And what was the previous method?
>>>
>>> [comes back later]
>>>
>>> Okay it was:
>>>
>>> +static bool has_table(struct drm_i915_private *i915)
>>> +{
>>> +    if (IS_ALDERLAKE_P(i915))
>>> +        return true;
>>>
>>> Which sucks a bit if we want to argue it does not belong in device info.
>>>
>>> Why can't we ask the GuC if the blob exists? In fact what would
>>> happen if one would call __guc_action_get_hwconfig on any GuC platform?
>> That was how I originally wrote the code. However, other parties
>> refuse to allow a H2G call to fail. The underlying CTB layers complain
>> loudly on any CTB error. And the GuC architects insist that a call to
>> query the table on an unsupported platform is an error and should
>> return an 'unsupported' error code.

just for the background: IIRC the reason why arch didn't want 'query'
mode that wont fail was that HWconfig is not optional on these selected
platforms, so driver shall know up-front on which platforms it shall be
working (and driver shall even fail if blob is not available)

> 
> Oh well, shrug, sounds silly but I will not pretend I am familiar with H2G
> 
> In this case has_table does sound better since it indeed isn't a
> hardware feature. It is a GuC fw thing and if we don't have a way to
> probe things there at runtime, then at least limit the knowledge to GuC
> files.

note that arch expectation is that driver itself shall be using this
hwconfig (ie. will decode required data). while current approach is that
driver acts only as a proxy to UMD, this has to change and refactoring
will start (likely) soon. therefore flag has_guc_hwconfig fits better
IMHO as this wont be just 'GuC' fw thing.

Michal

> 
> Regards,
> 
> Tvrtko


More information about the Intel-gfx mailing list