[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/6] drm/i915/huc: Move firmware selection to init_early

Michal Wajdeczko michal.wajdeczko at intel.com
Thu Nov 30 22:37:41 UTC 2017


On Thu, 30 Nov 2017 22:12:48 +0100, Sagar Arun Kamble  
<sagar.a.kamble at intel.com> wrote:

>
>
> On 12/1/2017 12:40 AM, Chris Wilson wrote:
>> Quoting Patchwork (2017-11-30 18:37:38)
>>> == Series Details ==
>>>
>>> Series: series starting with [1/6] drm/i915/huc: Move firmware  
>>> selection to init_early
>>> URL   : https://patchwork.freedesktop.org/series/34694/
>>> State : failure
>>>
>>> WARNING: Long output truncated
>> What's CI is trying to tell us there at the end is that we have a NULL
>> deref in guc_doorbell_destroy if we fail to load the guc firmware. So we
>> need to double check all the guc_init_submission failure paths.

It looks that I've introduced new failure path when we fail to fetch
HuC firmware. Then intel_uc_fw_upload() will return -EIO (btw I guess
-ENOEXEC would be better there).

> guc_submission_disable in intel_uc_fini_hw is gated by module param  
> still.

This alone is not that bad.
Here problem is that intel_uc_fini_hw() was called regardless of our
earlier unrecoverable failure in uc_init_hw() which I wrongly report
as -EIO (leading to reset/wedged path)

> either we need to migrate to internal submission init state or update  
> enable_guc=0 on guc init
> failure when we have set auto option.

I wanted to avoid changing modparam after sanitize() as modparam should
not be treated as runtime/internal state variable

> Taking sanitization past guc_fetch and making intel_uc_fw_is_selected  
> rely on fetch_status looks

But even if we fetch earlier, in sanitize() we should not attempt to
modify modparam beyond auto mode. And successful fetch status will not
help us when we later fail during fw xfer...

> best option though because if we want to return -EIO then i915 load  
> should mark as device init
> failure which isn't happening currently (leading to reset path failures  
> as well)

We should return proper 'ret' and make sure it is not -EIO to be in
sync with the trailing comment that says 'no fallback'




More information about the Intel-gfx mailing list