[Intel-gfx] [PATCH] drm/i915/uc: Extra info notice about FW version mis-match vs overrides

Michal Wajdeczko michal.wajdeczko at intel.com
Sat Dec 7 20:04:01 UTC 2019


On Fri, 06 Dec 2019 22:21:37 +0100, John Harrison  
<John.C.Harrison at intel.com> wrote:

> On 11/21/2019 02:43, Michal Wajdeczko wrote:
>> On Thu, 21 Nov 2019 01:27:23 +0100, <John.C.Harrison at intel.com> wrote:
>>
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> If a FW override is present then a version mis-match is actually
>>> ignored. The warning message was still being printed, though. Which
>>
>> It wasn't a "warning", just "notice"
>>
>>> could confuse people by implying that the load had failed due to the
>>> mis-match when actually something else had failed.
>>
>> The mis-match still might be a reason why something else failed.
>> If there is possible confusion, it's likely due to a missing or incomplete
>> message from this other failure point. So we should make sure that all
>> failure points correctly indicate the failure reason to avoid confusion.
>> Do you recall what this other confusing failure was?
>
> Sorry, bad commit message comment. The point was that you could  
> successfully load the GuC FW but then something entirely unrelated fails  
> (with or without appropriate error message). However, the first apparent  
> failure in dmesg is the GuC version mis-match. Therefore a user (or even

Again, it wasn't an error message, just a notice to clearly indicate that
the fw version being used does not match the tested and guaranteed
configuration.

> developer) might assume that all subsequent issues are caused by the FW  
> mismatch causing the GuC to not load at all and hence not investigate

We fail to fetch mismatched fw only if the user is trying to cheat us
(by renaming a mismatched fw blob to the expected filename). In case of
an override, the fetch will continue, but we may fail later due to
hard-to-predict ABI incompatibility/breakage.
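
As a rough illustration of that fetch decision, here is a simplified
sketch. It is not the actual i915 code: `struct fw_info` and
`fw_fetch_check()` are hypothetical names, and the two flags stand in
for the real version comparison and override check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the driver's firmware descriptor. */
struct fw_info {
	bool overridden;       /* user supplied an explicit firmware path */
	bool version_mismatch; /* found version differs from the wanted one */
};

/*
 * Returns 0 if the fetch may continue, or -ENOEXEC if a mismatched blob
 * was found under the default (non-overridden) filename.
 */
static int fw_fetch_check(const struct fw_info *fw)
{
	assert(fw != NULL);

	/* A mismatch is fatal only when no override was requested. */
	if (fw->version_mismatch && !fw->overridden)
		return -ENOEXEC;

	/* With an override the fetch continues, at the user's own risk. */
	return 0;
}
```

Note that a return value of 0 here only means the fetch is not rejected;
it says nothing about whether the mismatched fw will work afterwards.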

> any later messages until this first one is fixed/understood.

If there are any later errors, we have to take into account that they
might be caused by the mismatched fw; otherwise, we would be using that
other fw as the default one.

>
> So the point is just to avoid people wasting time investigating  
> something that is not actually an error.

An ordinary user is never expected to see this notice.

Users/developers that override the fw path must expect this notice.
There is still a risk of potential issues caused by loading
mismatched firmware that will not have a proper error message.

>
>
>>
>>>
>>> This patch adds an extra message to say that the mis-match is being
>>> ignored if an override is present.
>>>
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>> index 66a30ab7044a..c1ae807b07ae 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
>>> @@ -361,6 +361,9 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw, struct drm_i915_private *i915)
>>>              err = -ENOEXEC;
>>>              goto fail;
>>>          }
>>> +
>>> +        dev_notice(dev, "%s firmware %s: Firmware override so ignoring version mis-match\n",
>>> +               intel_uc_fw_type_repr(uc_fw->type), uc_fw->path);
>>
>> If you still want to include a clear statement about the mis-match being
>> ignored, then maybe instead of adding a new message it could be combined
>> with the old one:
>>
> Or maybe just move the mis-match notice into the 'goto fail' section and  
> not print anything at all in the case of an override. On the grounds  
> that if someone is specifying an override then it is almost certainly  
> because the default version is not what they want. So yes, it obviously
> is going to be a mis-match.

You can safely (and silently) override fw according to the fw version
compatibility scheme: same major and same or newer minor are OK to use.

This notice message was capturing cases where you try to bend the rules:
use of a mismatched major (unlikely to work, since it indicates no
backward compatibility at the ABI level, otherwise the major version
should stay the same) or use of an older minor (possibly will also not
work due to a missing feature that is expected/used by the driver).
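
A minimal sketch of that compatibility rule, mirroring the condition in
the quoted diff below. `struct fw_version` and `fw_version_compatible()`
are hypothetical names used only for illustration; the driver keeps
these values in fields such as `major_ver_found` and `minor_ver_wanted`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for a firmware major.minor version pair. */
struct fw_version {
	unsigned int major_ver;
	unsigned int minor_ver;
};

/*
 * True when 'found' may be used silently in place of 'wanted':
 * the major must match exactly (ABI level), and the minor must not be
 * older than the one the driver expects (feature level).
 */
static bool fw_version_compatible(const struct fw_version *found,
				  const struct fw_version *wanted)
{
	assert(found != NULL && wanted != NULL);

	return found->major_ver == wanted->major_ver &&
	       found->minor_ver >= wanted->minor_ver;
}
```

Anything for which this returns false is a "bend the rules" case, which
is exactly when the notice is printed.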

Michal

>
>
>> if (uc_fw->major_ver_found != uc_fw->major_ver_wanted ||
>>             uc_fw->minor_ver_found < uc_fw->minor_ver_wanted) {
>> -               dev_notice(dev, "%s firmware %s: unexpected version: %u.%u != %u.%u\n",
>> +               dev_notice(dev, "%s firmware %s: %s: %u.%u != %u.%u\n",
>>                            intel_uc_fw_type_repr(uc_fw->type), uc_fw->path,
>> +                          intel_uc_fw_is_overridden(uc_fw) ?
>> +                          "ignoring unexpected version" : "wrong version",
>>                            uc_fw->major_ver_found, uc_fw->minor_ver_found,
>>                            uc_fw->major_ver_wanted, uc_fw->minor_ver_wanted);
>>                 if (!intel_uc_fw_is_overridden(uc_fw)) {
>>
>> Michal

