[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Steer MCR reads to lowest potential slice/subslice

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Jun 14 08:48:52 UTC 2021


On 12/06/2021 00:42, Matt Roper wrote:
> On Fri, Jun 11, 2021 at 05:35:53AM +0000, Patchwork wrote:
>> == Series Details ==
>>
>> Series: drm/i915: Steer MCR reads to lowest potential slice/subslice
>> URL   : https://patchwork.freedesktop.org/series/91367/
>> State : failure
>>
>> == Summary ==
>>
>> CI Bug Log - changes from CI_DRM_10206_full -> Patchwork_20340_full
>> ====================================================
>>
>> Summary
>> -------
>>
>>    **FAILURE**
>>
>>    Serious unknown changes coming with Patchwork_20340_full absolutely need to be
>>    verified manually.
>>    
>>    If you think the reported changes have nothing to do with the changes
>>    introduced in Patchwork_20340_full, please notify your bug team to allow them
>>    to document this new failure mode, which will reduce false positives in CI.
>>
>>    
>>
>> Possible new issues
>> -------------------
>>
>>    Here are the unknown changes that may have been introduced in Patchwork_20340_full:
>>
>> ### IGT changes ###
>>
>> #### Possible regressions ####
>>
>>    * igt@gem_ctx_persistence@legacy-engines-mixed-process@bsd:
>>      - shard-iclb:         NOTRUN -> [DMESG-WARN][1]
>>     [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20340/shard-iclb6/igt@gem_ctx_persistence@legacy-engines-mixed-process@bsd.html
>>
>>    * igt@i915_selftest@perf@engine_cs:
>>      - shard-iclb:         [PASS][2] -> [DMESG-WARN][3] +36 similar issues
>>     [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10206/shard-iclb8/igt@i915_selftest@perf@engine_cs.html
>>     [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20340/shard-iclb7/igt@i915_selftest@perf@engine_cs.html
> 
> Steering to the minconfig does seem to have successfully fixed the issue
> on EHL/JSL according to the BAT results.  However the code changes
> uncovered a similar issue on ICL.  From experimenting on ICL, it appears
> that if you don't steer to the minconfig, you can sometimes get random
> garbage (rather than 0's) when render power gating is enabled.  CI
> wasn't flagging a workaround warning on ICL all along only because we
> were reading back random garbage that just happened to have a '1' in the
> relevant bit.
> 
> So the problem now is that the fls() -> ffs() conversion didn't actually
> get us to the minconfig on this ICL system.  Since there are two types
> of multicast registers on gen11 (subslice multicast and l3bank
> multicast), we currently pick our subslice target by &'ing those two
> masks together.  Unfortunately the minconfig subslice may not also be a
> suitable l3bank, so even using ffs() instead of fls() on the
> intersection will give us a "bad" steering ID.
> 
> It looks like there will be cases where we can't just always use the
> same steering value for both the subslice multicast registers and the
> l3bank multicast registers; we'll probably want to steer to the
> minconfig subslice by default and then explicitly re-steer to a valid
> l3bank in cases where we can't find a suitable value for both.  I
> already have some patches that do something similar for steering on
> upcoming platforms, so I'll get those reorganized so that we can use
> them on these platforms as well.
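
To make the problem described above concrete, here is a minimal sketch in C.
It is not the i915 code: the helper names (lowest_bit, highest_bit,
pick_steering), the struct, and the example masks are all hypothetical, and
it assumes the "minconfig" subslice is simply the lowest set bit of the
subslice mask. It shows why the lowest bit of the intersection can miss the
minconfig, and what steering the two register types separately might look
like.

```c
#include <stdint.h>

/* 0-based index of the lowest set bit, or -1 for an empty mask.
 * (The kernel's ffs() is 1-based; this helper is illustrative only.) */
static int lowest_bit(uint32_t mask)
{
    for (int i = 0; i < 32; i++)
        if (mask & (1u << i))
            return i;
    return -1;
}

/* 0-based index of the highest set bit, or -1 for an empty mask. */
static int highest_bit(uint32_t mask)
{
    for (int i = 31; i >= 0; i--)
        if (mask & (1u << i))
            return i;
    return -1;
}

/* Hypothetical result type: one steering target per multicast type. */
struct steering {
    int subslice;      /* default target for subslice multicast reads */
    int l3bank;        /* target for l3bank multicast reads */
    int needs_resteer; /* nonzero when l3bank reads need a different target */
};

/* Steer to the lowest subslice by default (assumed to be the minconfig).
 * Reuse that value for l3bank reads only if it is also a valid l3bank;
 * otherwise fall back to the lowest valid l3bank and flag a re-steer. */
static struct steering pick_steering(uint32_t subslice_mask,
                                     uint32_t l3bank_mask)
{
    struct steering s;

    s.subslice = lowest_bit(subslice_mask);
    if (s.subslice >= 0 && (l3bank_mask & (1u << s.subslice)))
        s.l3bank = s.subslice;
    else
        s.l3bank = lowest_bit(l3bank_mask);
    s.needs_resteer = (s.l3bank != s.subslice);
    return s;
}
```

With made-up masks subslice_mask = 0x0F and l3bank_mask = 0xCC, the
intersection is 0x0C, so the highest-bit choice gives 3 and the lowest-bit
choice gives 2, while the lowest subslice is 0. Neither choice on the
intersection lands on subslice 0. pick_steering(0x0F, 0xCC) instead returns
subslice 0 and l3bank 2 with needs_resteer set.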

Kudos for figuring this all out! This fls() vs ffs() issue, unexplained
until now, has been annoying us for years.

Regards,

Tvrtko

