[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/guc: fix GuC suspend/resume

Michal Wajdeczko michal.wajdeczko at intel.com
Tue Oct 16 18:58:24 UTC 2018


On Tue, 16 Oct 2018 19:15:19 +0200, Daniele Ceraolo Spurio  
<daniele.ceraolospurio at intel.com> wrote:

>
>
> On 10/16/2018 2:21 AM, Daniel Vetter wrote:
>> On Tue, Oct 16, 2018 at 1:44 AM Daniele Ceraolo Spurio
>> <daniele.ceraolospurio at intel.com> wrote:
>>>
>>>
>>> On 15/10/18 15:47, Patchwork wrote:
>>>> == Series Details ==
>>>>
>>>> Series: series starting with [1/2] drm/i915/guc: fix GuC suspend/resume
>>>> URL   : https://patchwork.freedesktop.org/series/51033/
>>>> State : failure
>>>>
>>>> == Summary ==
>>>>
>>>> = CI Bug Log - changes from CI_DRM_4984 -> Patchwork_10464 =
>>>>
>>>> == Summary - FAILURE ==
>>>>
>>>>     Serious unknown changes coming with Patchwork_10464 absolutely need to be
>>>>     verified manually.
>>>>
>>>>     If you think the reported changes have nothing to do with the changes
>>>>     introduced in Patchwork_10464, please notify your bug team to allow them
>>>>     to document this new failure mode, which will reduce false positives in CI.
>>>>
>>>>     External URL: https://patchwork.freedesktop.org/api/1.0/series/51033/revisions/1/mbox/
>>>>
>>>> == Possible new issues ==
>>>>
>>>>     Here are the unknown changes that may have been introduced in Patchwork_10464:
>>>>
>>>>     === IGT changes ===
>>>>
>>>>       ==== Possible regressions ====
>>>>
>>>>       igt@drv_selftest@live_execlists:
>>>>         fi-skl-6700hq:      PASS -> INCOMPLETE
>>>>
>>> The log seems to be cut for this one. Since it stops inside
>>> live_preempt_smoke, it is probably a known issue that Chris mentioned.
>>> I can't reproduce it on my Skylake, even with the test in a loop.
>>>
>>>>       igt@drv_selftest@live_guc:
>>>>         fi-kbl-7567u:       PASS -> DMESG-WARN
>>>>         fi-skl-6600u:       PASS -> DMESG-WARN
>>>>         fi-skl-gvtdvm:      PASS -> DMESG-WARN
>>>>         fi-skl-iommu:       PASS -> DMESG-WARN
>>>>         fi-skl-6260u:       PASS -> DMESG-WARN
>>>>         fi-bxt-dsi:         PASS -> DMESG-WARN
>>>>         fi-skl-6700k2:      PASS -> DMESG-WARN
>>>>         fi-whl-u:           PASS -> DMESG-WARN
>>>>         fi-skl-6770hq:      PASS -> DMESG-WARN
>>>>         fi-kbl-7560u:       PASS -> DMESG-WARN
>>>>         fi-kbl-8809g:       PASS -> DMESG-WARN
>>>>         fi-kbl-r:           PASS -> DMESG-WARN
>>>>         fi-kbl-x1275:       PASS -> DMESG-WARN
>>>>         fi-bxt-j4205:       PASS -> DMESG-WARN
>>>>         fi-cfl-s3:          PASS -> DMESG-WARN
>>>>         fi-cfl-8109u:       PASS -> DMESG-WARN
>>>>         fi-kbl-7500u:       PASS -> DMESG-WARN
>>>>         fi-cfl-8700k:       PASS -> DMESG-WARN
>>> These are all:
>>>
>>> [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action 0x10 failed with error -5 0xf000f000
>>>
>>> This is not a real failure, since the test triggers it on purpose.
>> You still need to make them shut up. dmesg errors should only be used
>> for stuff we really don't expect. E.g. gpu hangs provoked by igt also
>> don't result in dmesg errors/warnings and failed tests.
>> -Daniel
>
> I wasn't trying to imply that we don't care that we have a failure or
> that we shouldn't make it shut up, just that it is not a regression
> introduced by this patch, because it doesn't even get near that code. I
> recall that there was a small discussion in the past about how to
> silence this; I'll try to dig it up and see if there was an agreed
> solution.

The preferred solution was to remove the negative GuC tests from the i915
selftests.

Note that this "low level" error message is our guard that we are always
communicating correctly with the GuC: no unexpected GuC error is silently
dropped.
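
To illustrate the guard idea, here is a minimal sketch in plain userspace C,
not the actual i915 code. All demo_* names, the fake transport, and the
status values are assumptions made up for this example; only the shape of
the log line follows the message quoted in this thread. The point is that
the send path logs every non-success status unconditionally, so a caller
cannot suppress it.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical values for illustration; the real GuC encoding differs. */
#define DEMO_STATUS_SUCCESS 0u
#define DEMO_EIO 5

/* Number of error lines emitted, so callers can check nothing was dropped. */
static unsigned int demo_errors_logged;

/* Hypothetical transport: pretends the firmware rejects action 0x10. */
static uint32_t demo_transport(uint32_t action)
{
	return action == 0x10 ? 0xf000f000u : DEMO_STATUS_SUCCESS;
}

/* Send an action; any non-success status is always logged, never hidden. */
static int demo_send(uint32_t action)
{
	uint32_t status = demo_transport(action);

	if (status != DEMO_STATUS_SUCCESS) {
		fprintf(stderr,
			"MMIO: GuC action %#x failed with error %d %#x\n",
			(unsigned int)action, -DEMO_EIO, (unsigned int)status);
		demo_errors_logged++;
		return -DEMO_EIO;
	}

	return 0;
}
```

With a guard like this, a negative test that provokes a failure on purpose
will always leave an error line behind, which is why such tests fit better
outside the selftests than behind a "quiet" flag.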

GuC negative testing shall be done by the fw team.

Michal

>
> Daniele
>
>>>>       igt@drv_selftest@live_hangcheck:
>>>>         fi-skl-gvtdvm:      PASS -> DMESG-FAIL
>>>>
>>> <7> [464.966238] [drm:guc_fw_xfer [i915]] GuC status 0x20
>>> <3> [464.966361] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
>>>
>>> This looks like the GuC is stuck very early in the boot flow (even before
>>> the RSA check). On SKL there are known issues that could cause this; we
>>> should reset the GuC and retry, but we don't. It looks like we indirectly
>>> stopped applying WaEnableuKernelHeaderValidFix and
>>> WaEnableGuCBootHashCheckNotSet, because intel_guc_fw_upload never
>>> returns -EAGAIN. Michal?
>>>
>>> Thanks,
>>> Daniele
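
For readers following the thread, the reset-and-retry flow described above
can be sketched as follows. This is a self-contained illustration, not the
driver code: the demo_* functions, the fake upload that fails a given number
of times, and the limit of 3 attempts are all assumptions for this example.
The only thing taken from the discussion is that the retry is keyed on the
upload returning -EAGAIN.

```c
#include <errno.h>

/* Assumed retry budget for this sketch; not taken from the driver. */
#define DEMO_MAX_ATTEMPTS 3

/* Counts resets so the behaviour can be observed from outside. */
static int demo_resets;

/* Hypothetical reset hook; a real one would reset the microcontroller. */
static void demo_reset(void)
{
	demo_resets++;
}

/* Hypothetical upload: returns -EAGAIN while *fails_left is positive. */
static int demo_fw_upload(int *fails_left)
{
	if (*fails_left > 0) {
		(*fails_left)--;
		return -EAGAIN;
	}

	return 0;
}

/* Retry the upload, resetting before each new attempt, only on -EAGAIN. */
static int demo_fw_upload_with_retry(int *fails_left)
{
	int ret = -EAGAIN;

	for (int attempt = 0; attempt < DEMO_MAX_ATTEMPTS; attempt++) {
		if (attempt > 0)
			demo_reset();

		ret = demo_fw_upload(fails_left);
		if (ret != -EAGAIN)
			break;	/* success or a non-retryable error */
	}

	return ret;
}
```

If the upload path never returns -EAGAIN, a loop like this runs exactly once
and the reset is never applied, which matches the symptom described above.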
>>>
>>>> == Known issues ==
>>>>
>>>>     Here are the changes found in Patchwork_10464 that come from known issues:
>>>>
>>>>     === IGT changes ===
>>>>
>>>>       ==== Issues hit ====
>>>>
>>>>       igt@drv_selftest@live_guc:
>>>>         {fi-apl-guc}:       NOTRUN -> DMESG-WARN (fdo#107258)
>>>>
>>>>       igt@gem_exec_suspend@basic-s4-devices:
>>>>         fi-blb-e6850:       PASS -> INCOMPLETE (fdo#107718)
>>>>
>>>>       igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
>>>>         fi-snb-2520m:       PASS -> DMESG-FAIL (fdo#103713)
>>>>
>>>>       igt@kms_setmode@basic-clone-single-crtc:
>>>>         fi-snb-2520m:       PASS -> DMESG-WARN (fdo#103713)
>>>>
>>>>       igt@pm_backlight@basic-brightness:
>>>>         fi-snb-2520m:       PASS -> INCOMPLETE (fdo#103713)
>>>>
>>>>
>>>>       ==== Possible fixes ====
>>>>
>>>>       igt@drv_selftest@live_gem:
>>>>         {fi-apl-guc}:       INCOMPLETE (fdo#106693) -> PASS
>>>>
>>>>       igt@kms_frontbuffer_tracking@basic:
>>>>         fi-byt-clapper:     FAIL (fdo#103167) -> PASS
>>>>
>>>>
>>>>     {name}: This element is suppressed. This means it is ignored when computing
>>>>             the status of the difference (SUCCESS, WARNING, or FAILURE).
>>>>
>>>>     fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
>>>>     fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
>>>>     fdo#106693 https://bugs.freedesktop.org/show_bug.cgi?id=106693
>>>>     fdo#107258 https://bugs.freedesktop.org/show_bug.cgi?id=107258
>>>>     fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718
>>>>
>>>>
>>>> == Participating hosts (53 -> 47) ==
>>>>
>>>>     Missing    (6): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600
>>>>
>>>>
>>>> == Build changes ==
>>>>
>>>>       * Linux: CI_DRM_4984 -> Patchwork_10464
>>>>
>>>>     CI_DRM_4984: 90b59df999a13a6405f8d7ece08a69120a9b361a @ git://anongit.freedesktop.org/gfx-ci/linux
>>>>     IGT_4678: 9310a1265ceabeec736bdf0a76e1e0357c76c0b1 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
>>>>     Patchwork_10464: c88fb110ee8261c636d63f4f6d9fa9440891b3a6 @ git://anongit.freedesktop.org/gfx-ci/linux
>>>>
>>>>
>>>> == Linux commits ==
>>>>
>>>> c88fb110ee82 HAX enable GuC for CI
>>>> 4454d4d05ce3 drm/i915/guc: fix GuC suspend/resume
>>>>
>>>> == Logs ==
>>>>
>>>> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10464/issues.html
>>>>

