[Intel-gfx] [PATCH v5 1/4] drm/i915/guc: Add fetch of hwconfig table
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Feb 25 17:18:51 UTC 2022
On 25/02/2022 16:46, John Harrison wrote:
>>> driver we don't care that much that we failed to load HWconfig and
>>> 'notice' is enough.
>>>
>>> but I'm fine with all messages being drm_err (as we will not have to
>>> change that once again after HWconfig will be mandatory for the driver
>>> as well)
>>
>> I would be against drm_err.
>>
>> #define KERN_EMERG KERN_SOH "0" /* system is unusable */
>> #define KERN_ALERT KERN_SOH "1" /* action must be taken
>> immediately */
>> #define KERN_CRIT KERN_SOH "2" /* critical conditions */
>> #define KERN_ERR KERN_SOH "3" /* error conditions */
>> #define KERN_WARNING KERN_SOH "4" /* warning conditions */
>> #define KERN_NOTICE KERN_SOH "5" /* normal but significant
>> condition */
>> #define KERN_INFO KERN_SOH "6" /* informational */
>> #define KERN_DEBUG KERN_SOH "7" /* debug-level messages */
>>
>> From the point of view of the kernel driver, this is not an error to
>> its operation. It can at most be a warning, but notice is also fine by
>> me. One could argue when reading "normal but significant condition"
>> that it is not normal, when it is in fact unexpected, so if people
>> prefer warning that is also okay by me. I still lean towards notice
>> becuase of the hands-off nature i915 has with the pass-through of this
>> blob.
> From the point of view of the KMD, i915 will load and be 'functional'
> if it can't talk to the hardware at all. The UMDs won't work at all but
Well this reductio ad absurdum fails I think... :)
> the driver load will be 'fine'. That's a requirement to be able to get
> the user to a software fallback desktop in order to work out why the
> hardware isn't working (e.g. no GuC firmware file). I would view this as
> similar. The KMD might have loaded but the UMDs are not functional. That
> is definitely an error condition to me.
... If GuC fails to load there is no command submission and driver will
obviously log that with drm_err.
If blob fails to verify it depends on the userspace stack what will
happen. As stated before on some platforms, and/or after a certain time,
Mesa will not look at the blob at all. So i915 is fine (it is after all
just a conduit for opaque data!), system overall is fine, so it
definitely isn't a KERN_ERR level event.
>>>>>> + ERR_PTR(ret));
>>>>>> +
>>>>>> ret = guc_enable_communication(guc);
>>>>>> if (ret)
>>>>>> goto err_log_capture;
>>>>>> @@ -562,6 +567,8 @@ static void __uc_fini_hw(struct intel_uc *uc)
>>>>>> if (intel_uc_uses_guc_submission(uc))
>>>>>> intel_guc_submission_disable(guc);
>>>>>> + intel_guc_hwconfig_fini(&guc->hwconfig);
>>>>>> +
>>>>>> __uc_sanitize(uc);
>>>>>> }
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c
>>>>>> b/drivers/gpu/drm/i915/i915_pci.c
>>>>>> index 76e590fcb903..1d31e35a5154 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>>>>>> @@ -990,6 +990,7 @@ static const struct intel_device_info
>>>>>> adl_p_info = {
>>>>>> BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) |
>>>>>> BIT(VCS2),
>>>>>> .ppgtt_size = 48,
>>>>>> .dma_mask_size = 39,
>>>>>> + .has_guc_hwconfig = 1,
>>>>> Who requested this change? It was previously done this way but the
>>>>> instruction was that i915_pci.c is for hardware features only but that
>>>>> this, as you seem extremely keen about pointing out at every
>>>>> opportunity, is a software feature.
>>>>
>>>> This was requested by Michal as well. I definitely agree it is a
>>>> software feature, but I was not aware that "i915_pci.c is for hardware
>>>> features only".
>>>>
>>>> Michal, do you agree with this and returning to the previous method for
>>>> enabling the feature?
>>>
>>> now I'm little confused as some arch direction was to treat FW as
>>> extension of the HW so for me it was natural to have 'has_guc_hwconfig'
>>> flag in device_info
>>>
>>> if still for some reason it is undesired to mix HW and FW/SW flags
>>> inside single group of flags then maybe we should just add separate
>>> group of immutable flags where has_guc_hwconfig could be defined.
>>>
>>> let our maintainers decide
>>
>> Bah.. :)
>>
>> And what was the previous method?
>>
>> [comes back later]
>>
>> Okay it was:
>>
>> +static bool has_table(struct drm_i915_private *i915)
>> +{
>> + if (IS_ALDERLAKE_P(i915))
>> + return true;
>>
>> Which sucks a bit if we want to argue it does not belong in device info.
>>
>> Why can't we ask the GuC if the blob exists? In fact what would happen
>> if one would call __guc_action_get_hwconfig on any GuC platform?
> That was how I originally wrote the code. However, other parties refuse
> to allow a H2G call to fail. The underlying CTB layers complain loudly
> on any CTB error. And the GuC architects insist that a call to query the
> table on an unsupported platform is an error and should return an
> 'unsupported' error code.
Oh well, shrug, sounds silly but I will not pretend I am familiar with H2G
In this case has_table does sound better since it indeed isn't a
hardware feature. It is a GuC fw thing and if we don't have a way to
probe things there at runtime, then at least limit the knowledge to GuC
files.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list