[Intel-gfx] [CI 1/2] drm/i915/guc: Fix null pointer dereference when GuC FW is not available

Yaodong Li yaodong.li at intel.com
Fri Mar 23 18:40:10 UTC 2018

On 03/23/2018 11:26 AM, Michal Wajdeczko wrote:
> On Fri, 23 Mar 2018 19:03:47 +0100, Yaodong Li <yaodong.li at intel.com> 
> wrote:
>> On 03/23/2018 05:27 AM, Michal Wajdeczko wrote:
>>> On Fri, 23 Mar 2018 13:07:15 +0100, Sagar Arun Kamble 
>>> <sagar.a.kamble at intel.com> wrote:
>>>> On 3/23/2018 4:53 PM, Piotr Piórkowski wrote:
>>>>> If GuC firmware is not available on the system and we load i915 
>>>>> with enable
>>>>> GuC, then we hit this null pointer dereference issue:
>>>> Patch looks good but I have query on usage of enable_guc modparam.
>>>> enable_guc modparam will enable GuC/HuC only if firmware is available.
>>> During modparam sanitization phase, code will only check for firmware
>>> name, code will attempt to check if firmware file exists.
>> Is it better if we won't even call into upload FW once we found the 
>> FW fetching was failed?
>>>> If user sets to forcefully enable GuC/HuC on systems that don't 
>>>> have firmware then it is user's fault.
>>> Sure its user's fault, but event in such case we should just gracefully
>>> abort driver load, without any NULL pointer BUG ;)
>>> And note, that we will hit this bug not only when user force GuC with:
>>>     enable_guc=1 guc_firmware_path=/does/not/exist
>>> but also when user just specify auto mode:
>>>     enable_guc=-1
>>> when predefined firmwares are not present (not installed or removed)
>> it feels like it's still user's fault even with the auto mode. why 
>> the user even set
>> the enable_guc to -1 after known the fact that the FWs are not present?
>> I mean should users be expected to know the fact that the auto mode
>> enable_guc = -1 will enable the GuC and HuC, if the guc_firmware_path 
>> was left
>> to be none?
> GuC fw path will not be none, as we have some hardcoded paths in the 
> code.
> And we don't tell the user that after turning on 'auto' mode, he has to
> do something special. Only later in case of fw fetch errors, we are just
> printing error message with extra help:
> "%s: Failed to fetch firmware %s (error %d)\n"
> "%s: Firmware can be downloaded from %s\n"
In this case, back to my question: maybe it's a better idea to make 
things stops here
instead of continuing to call following functions such as uc_init, 
uc_init_hw -> fw_upload?
>> I agree we should make sure correct pointer access. it's a 
>> non-excusable fault:)
>> The thing actually frustrated me was that we failed to screen such an 
>> error
>> with the CI, but suddenly it broke things and tests with incorrect 
>> configurations
>> (enable_guc set to 1 or -1 + No FW available). so maybe we should add 
>> a case to
>> cover such a case in CI as well if we did have such test configurations.
> I guess you can write such test on your own and submit patch to 
> igt-dev ;)


More information about the Intel-gfx mailing list