[Intel-gfx] [PATCH] drm/i915/huc: Fix error reported by I915_PARAM_HUC_STATUS

Ye, Tony tony.ye at intel.com
Fri Feb 21 03:56:16 UTC 2020



On 2/12/2020 5:54 AM, Fosha, Robert M wrote:
> 
> 
> On 2/11/20 11:57 AM, Michal Wajdeczko wrote:
>> On Tue, 11 Feb 2020 18:53:05 +0100, Fosha, Robert M 
>> <robert.m.fosha at intel.com> wrote:
>>
>>>
>>>
>>> On 2/4/20 4:43 PM, Ye, Tony wrote:
>>>>
>>>>
>>>> On 1/27/2020 1:41 AM, Michal Wajdeczko wrote:
>>>>> On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson 
>>>>> <chris at chris-wilson.co.uk> wrote:
>>>>>
>>>>>> Quoting Michal Wajdeczko (2020-01-23 15:38:52)
>>>>>>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson
>>>>>>> <chris at chris-wilson.co.uk> wrote:
>>>>>>>
>>>>>>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33)
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote:
>>>>>>> >> >  From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt 
>>>>>>> over i915
>>>>>>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV 
>>>>>>> only
>>>>>>> >> > to indicate lack of HuC hardware and we started to use this 
>>>>>>> error
>>>>>>> >> > also for all other cases when HuC was not in use or supported.
>>>>>>> >> >
>>>>>>> >> > Fix that by relying again on HAS_GT_UC macro, since currently
>>>>>>> >> > used function intel_huc_is_supported() is based on HuC firmware
>>>>>>> >> > support which could be unsupported also due to force disabled
>>>>>>> >> > GuC firmware.
>>>>>>> >> >
>>>>>>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>>>>>>> >> > Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>> >> > Cc: Tony Ye <tony.ye at intel.com>
>>>>>>> >>
>>>>>>> >> Reviewed-by: Daniele Ceraolo Spurio 
>>>>>>> <daniele.ceraolospurio at intel.com>
>>>>>>> >
>>>>>>> > Once upon a time did you (Michal) not argue we should indicate 
>>>>>>> the lack
>>>>>>> > of firmware in the error code? Something like
>>>>>>> >
>>>>>>> > if (!HAS_GT_UC(gt->i915))
>>>>>>> >       return -ENODEV;
>>>>>>> >
>>>>>>> > if (!intel_huc_is_supported(huc))
>>>>>>> >       return -ENOEXEC;
>>>>>>>
>>>>>>> Yes, we discussed this here [1] together with [2] but we didn't
>>>>>>> conclude our discussion due to different opinions on how represent
>>>>>>> some states, in particular "manually disabled" state.
>>>>>>>
>>>>>>> In this patch I just wanted to restore old notation.
>>>>>>>
>>>>>>> But we can start new discussion, here is summary:
>>>>>>>
>>>>>>> ------------------+----------+----------+----------
>>>>>>>   HuC state        | today*   | option A | option B
>>>>>>> ------------------+----------+----------+----------
>>>>>>> no HuC hardware   | -ENODEV  | -ENODEV  | -ENODEV
>>>>>>> GuC fw disabled   |   0      |     0    | -EOPNOTSUPP
>>>>>>> HuC fw disabled   |   0      |     0    | -EOPNOTSUPP
>>>>>>> HuC fw missing    |   0      | -ENOPKG  | -ENOEXEC
>>>>>>> HuC fw error      |   0      | -ENOEXEC | -ENOEXEC
>>>>>>> HuC fw fail       |   0      | -EACCES  |    0
>>>>>>> HuC authenticated |   1      |     1    |    1
>>>>>>> ------------------+----------+----------+----------
>>>>>>
>>>>>> By fw fail, you mean we loaded the firmware (to our knowledge)
>>>>>> correctly, but HUC_STATUS is not reported as valid?
>>>>>>
>>>>>> If so, I support option B. I like the idea of saying
>>>>>> "no HuC" (machine too old)
>>>>>> "no firmware" (user action, or lack thereof)
>>>>>> 0 (fw unhappy)
>>>>>> 1 (fw reports success)
>>>>>>
>>>>>> In between states for failures in fw loading? Not so sure. But I 
>>>>>> can see
>>>>>> the nicety in distinguishing between lack of firmware and some random
>>>>>> failure in loading the firmware (the former being user action 
>>>>>> required
>>>>>> to rectify, command line parameter whatever and the latter being the
>>>>>> firmware file is either invalid or a stray neutrino prevented 
>>>>>> loading).
>>>>>>
>>>>>> Imo the error messages should be about why we cannot probe/trust the
>>>>>> HUC_STATUS register. If everything is setup correctly then the 
>>>>>> returned
>>>>>> value should be from reading the register. I dislike only 
>>>>>> returning 1 if
>>>>>> supported, and converting a valid read of 0 into another error.
>>>>>>
>>>>>> So Option B :)
>>>>>
>>>>> But I'm not sure that option B is consistent in error reporting, as
>>>>> "fw unhappy" is definitely an serious error but is represented as 
>>>>> plain
>>>>> non-error "0" status, while "fw disabled" (user action) is treated 
>>>>> as error
>>>>>
>>>>> ------------------+----------
>>>>>    HuC state       | option B
>>>>> ------------------+----------
>>>>> no HuC hardware   | -ENODEV
>>>>> GuC fw disabled   | -EOPNOTSUPP -> user decision, why error?
>>>>> HuC fw disabled   | -EOPNOTSUPP -> user decision, why error?
>>>>> HuC fw missing    | -ENOEXEC
>>>>> HuC fw error      | -ENOEXEC
>>>>> HuC fw fail       |    0        -> unlikely, but still fw/hw error
>>>>> HuC authenticated |    1
>>>>> ------------------+----------
>>>>>
>>>>> On other hand, option A treats all error conditions as errors, leaving
>>>>> status codes only for normal operations: disabled(0)/authenticated(1):
>>>>>
>>>>> ------------------+----------
>>>>>    HuC state       | option A
>>>>> ------------------+----------
>>>>> no HuC hardware   | -ENODEV  -> you shouldn't ask
>>>>> GuC fw disabled   |     0    -> user decision, not an error
>>>>> HuC fw disabled   |     0    -> user decision, not an error
>>>>> HuC fw missing    | -ENOPKG  -> fw not installed correctly
>>>>> HuC fw error      | -ENOEXEC -> bad/wrong fw
>>>>> HuC fw fail       | -EACCES  -> fw/hw error
>>>>> HuC authenticated |     1
>>>>> ------------------+----------
>>>>
>>>> Vote for Option A.
>>>>
>>>> Regards,
>>>> Tony
>>>>
>>>
>>> Are we ok to move forward on this? Michal, are you working on 
>>> updating the patch?
>>
>> I can update the patch, but I don't know which option to implement:
>> Tony votes for Option A, Chris for Option B, what's your choice?
>>
>> Michal
>>
> 
> I don't have a preference and would trust both Tony and Chris over 
> myself. However, if Tony is using HuC more I would vote for his prefered 
> option.
> 
> -Rob

Option B takes enable_guc=0|1 (user doesn't want to load HuC FW) as 
error, and takes FW fail (failed to xfer or init/auth the fw) as 0. It 
would confuse the user.

Regards,
Tony

> 
>>>
>>> -Rob
>>>
>>>>>
>>>>> But since I'm not an active HuC user, will leave final decision to 
>>>>> others.
>>>>>
>>>>> /Michal
>>>>>
>>>>>
>>>>>>
>>>>>>> Note that all above should be compatible with media driver,
>>>>>>> which explicitly looks for no error and value 1
>>>>>>
>>>>>> Cool.
>>>>>> -Chris
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 


More information about the Intel-gfx mailing list