[Intel-gfx] Correct DMC version for Skylake (1.23 vs 1.26)

Daniel Klaffenbach danielklaffenbach at gmail.com
Wed Aug 10 18:56:05 UTC 2016


Just FYI: 1.23 with 4.8-rc1 always causes unrecoverable i915 crashes
on my machine as soon as X loads (divide error: 0000). I don't know if
it's just my machine, but there already is a bug report about this on
Bugzilla.

The interesting thing is: patching the kernel to load 1.26 instead of
1.23 resolves the issue for me. So I hope that Rodrigo will be able to
get the patch into mainline before 4.8 is released.
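
For anyone following along, this is roughly how I understand the version
pinning to behave. It is only a sketch: the macro and function names are
made up by me and not copied from the i915 source, and I am assuming the
driver accepts exactly one version per platform. Only the message text
is taken from the dmesg output quoted below.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers for illustration; not copied from the i915 source. */
#define DMC_VERSION(major, minor) (((uint32_t)(major) << 16) | (uint32_t)(minor))
#define DMC_VERSION_MAJOR(v) ((unsigned)((v) >> 16))
#define DMC_VERSION_MINOR(v) ((unsigned)((v) & 0xffffu))

/* Assumption: exactly one version is pinned; 1.26 is what works for me. */
#define SKL_DMC_REQUIRED DMC_VERSION(1, 26)

/* Returns 1 if the blob's version equals the pinned one, 0 otherwise. */
static int dmc_version_ok(uint32_t found, uint32_t required)
{
	if (found == required)
		return 1;

	/* Wording follows the "Refusing to load" line in the quoted dmesg. */
	fprintf(stderr,
		"Refusing to load DMC firmware v%u.%u, please use v%u.%u\n",
		DMC_VERSION_MAJOR(found), DMC_VERSION_MINOR(found),
		DMC_VERSION_MAJOR(required), DMC_VERSION_MINOR(required));
	return 0;
}
```

With the pin at 1.26, dmc_version_ok(DMC_VERSION(1, 23), SKL_DMC_REQUIRED)
returns 0 and the blob is rejected. With the pin at 1.23, as in 4.8-rc1,
it is the 1.26 blob from linux-firmware.git that gets rejected instead.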


- Daniel

2016-08-10 17:56 GMT+02:00 Dave Gordon <david.s.gordon at intel.com>:
> On 10/08/16 11:26, Daniel Vetter wrote:
>>
>> On Tue, Aug 09, 2016 at 10:57:13PM -0700, Rodrigo Vivi wrote:
>>>
>>> On Tue, Aug 9, 2016 at 1:48 AM, Jani Nikula <jani.nikula at linux.intel.com>
>>> wrote:
>>>>
>>>> On Tue, 09 Aug 2016, Daniel Klaffenbach <danielklaffenbach at gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> which one is the correct DMC version to load for Linux 4.8-rc1? The
>>>>> binary blob in linux-firmware.git is v1.26, which is also the latest
>>>>> version available for download at the linuxgraphics website.
>>>>>
>>>>> Version 1.26 used to load just fine on Kernels 4.6 and 4.7. Commit
>>>>> 4aa7fb9c introduced version pinning for v1.26 (both in
>>>>> drm-intel-nightly and the current for-linux-next branch). Later an
>>>>> older commit was pushed (a4a027a8), which lowered the
>>>>> required DMC firmware to v1.23 again, without removing the
>>>>> pinning.
>>>>>
>>>>> Now the situation is that v1.23 is pinned ATM, but v1.26 has been
>>>>> released through linux-firmware.git and is being rolled out to end
>>>>> users right now.
>>>>>
>>>>> What to do now? Is this a bug or a feature?
>>>
>>>
>>> It is a bug, I'm sending a patch right now.
>>>
>>>>
>>>> You should use whichever version the kernel asks for. The bug is that
>>>> v1.23 was dropped from linux-firmware.
>>>
>>>
>>> 1.23 was intentionally dropped from linux-firmware since 1.26 was
>>> already the one required by our driver.
>>>
>>> Some merge probably failed and overwrote what Patrik had properly done
>>> in commit 4aa7fb9c ("drm/i915/dmc: Step away from symbolic
>>> links")
>>>
>>>> We may later upgrade the firmware the
>>>> kernel asks for, and even backport said upgrade to stable kernels after
>>>> ensuring it works.
>>>>
>>>> Rodrigo, please fix linux-firmware.
>>>
>>>
>>> No, I'm going to fix our driver.
>>>
>>> Well, I can restore 1.23 there if you tell me there is no way we
>>> can make sure the patch that I'm about to send will land and be
>>> backported in time, but this is not ideal since we know 1.23 will
>>> cause bugs.
>>
>>
>> Backporting right now takes more than 1 month until it's in users' hands.
>> We need _both_ because we've screwed up :(
>>
>> Also really, if there's a regression and it's more than 1 week just push
>> the revert, to whichever repo it needs to be pushed to. Here this means
>> reverting on linux-firmware. Talking for weeks about simple bisected
>> regressions is one of the reasons why it's sooooooooo expensive for us to
>> fix bugs, and in turn why we're totally not in control of the situation.
>>
>> So in case of doubt: Revert first, ask questions later pls.
>> -Daniel
>
>
> This might have been noticed a bit more quickly if DMC loading wasn't
> allowed to quietly fail in CI. There's at least one CI machine that's
> missing the correct (latest) version of the DMC firmware, but it doesn't
> trigger a DMESG_WARN status, so nobody noticed. We only register the failure
> when GuC loading is forced so we get an ERROR from the GuC loader, and can
> incidentally see that the DMC load also failed.
>
> [  170.678098] Setting dangerous option enable_psr - tainting kernel
> [  170.714641] i915 0000:00:02.0: Failed to load DMC firmware
> [https://01.org/linuxgraphics/intel-linux-graphics-firmwares], disabling
> runtime power management.
> [  171.701314] i915 0000:00:02.0: Direct firmware load for
> i915/skl_guc_ver6_1.bin failed with error -2
> [  171.701392] [drm:intel_guc_init [i915]] *ERROR* Failed to fetch GuC
> firmware from i915/skl_guc_ver6_1.bin (error -2)
>
> And then hidden away we find out why:
>
> [  170.714639] [drm] Refusing to load DMC firmware v1.26, please use v1.23
> [https://01.org/linuxgraphics/intel-linux-graphics-firmwares].
>
> .Dave.
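
On Dave's point about CI: a check along these lines could flag the two
DMC messages quoted above even when nothing else raises an ERROR. This is
just a sketch. The function name is hypothetical, the match strings are
taken from the dmesg lines in this thread, and I don't know how the real
CI classifies dmesg lines.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a dmesg line reports a DMC firmware
 * load problem, 0 otherwise. The patterns come from the log lines quoted
 * in this thread and may not cover other kernel versions.
 */
static int is_dmc_load_failure(const char *line)
{
	static const char *const patterns[] = {
		"Failed to load DMC firmware",
		"Refusing to load DMC firmware",
	};
	size_t i;

	for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
		if (strstr(line, patterns[i]) != NULL)
			return 1;
	}
	return 0;
}
```

Feeding each dmesg line through this and failing the run on any match
would have surfaced the 1.23/1.26 mismatch without forcing GuC loading.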

