[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Set PROBE_PREFER_ASYNCHRONOUS

Matthew Auld matthew.william.auld at gmail.com
Fri Nov 4 15:20:55 UTC 2022


On Thu, 3 Nov 2022 at 00:14, Brian Norris <briannorris at chromium.org> wrote:
>
> On Wed, Nov 02, 2022 at 12:18:37PM +0000, Matthew Auld wrote:
> > On Tue, 1 Nov 2022 at 21:58, Brian Norris <briannorris at chromium.org> wrote:
> > >
> > > On Fri, Oct 28, 2022 at 5:24 PM Patchwork
> > > <patchwork at emeril.freedesktop.org> wrote:
> > > >
> > > > Patch Details
> > > > Series:drm/i915: Set PROBE_PREFER_ASYNCHRONOUS
> > > > URL:https://patchwork.freedesktop.org/series/110277/
> > > > State:failure
> > > > Details:https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
> > > >
> > > > CI Bug Log - changes from CI_DRM_12317 -> Patchwork_110277v1
> > > >
> > > > Summary
> > > >
> > > > FAILURE
> > > >
> > > > Serious unknown changes coming with Patchwork_110277v1 absolutely need to be
> > > > verified manually.
> > > >
> > > > If you think the reported changes have nothing to do with the changes
> > > > introduced in Patchwork_110277v1, please notify your bug team to allow them
> > > > to document this new failure mode, which will reduce false positives in CI.
> > > >
> > > > External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
> > >
> > > For the record, I have almost zero idea what to do with this. From
> > > what I can tell, most (all?) of these failures are flaky(?) already
> > > and are probably not related to my change.
> >
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_110277v1/index.html
> >
> > According to that link, this change appears to break every platform
> > when running the live selftests (looking at the purple squares).
> > Running the selftests normally involves loading and unloading the
> > module. Looking at the logs there is scary stuff like:
> >
> [...]
>
> Ah, thanks. I'm not sure what made me think the tests were failing the
> same way on drm-tip, but maybe just chalk that up to my unfamiliarity
> with this particular dashboard... (There are a few isolated failure
> and/or flakes on drm-tip, but they don't look like this.)
>
> Anyway, I think I managed to run some of these tests on my own platforms
> [1], and I don't reproduce those failures. I do see other failures
> (crashes) though, like in i915_gem_mman_live_selftests/igt_mmap, where
> igt_mmap_offset() (selftest-only code) -> vm_mmap() assumes we have a
> valid |current->mm|. But that's borrowing the modprobe process's memory
> map, and with async probe, the selftest sequence happens in a kernel
> worker instead (and current->mm is NULL). So that clearly won't work.

Semi related:
https://lore.kernel.org/intel-gfx/20221104134703.3770b371@maurocar-mobl2/T/#m888972bb1ffb0a913e3db8b4099dffdc2ec7a0dc

Sounds like a similar issue when trying to convert the live selftests
over to kunit.

>
> I suppose I could disable async probe when built as a module (I believe
> it doesn't really have any value, since the module load task just waits
> for the async task anyway). I'm not familiar enough with MM to know what
> the vm_mmap() alternatives are, but this particular bit of code does
> feel odd.
>
> Additionally, I think this implies that live_selftests will break if
> i915 is built-in (i.e., =y, not =m), as we'll again run in a
> kernel-thread context at boot time. But I would hope nobody is trying to
> run them that way? I guess this gets even hairier, because even if the
> driver is built into the kernel, it's possible to kick them off from a
> process context by tweaking the module parameters later, and then
> re-binding the device... So all in all, this bug leaves an ugly
> situation, with or without my patch.
>
> I'm still curious about the reported failures, but maybe they require
> some particular sequence of tests? I also don't have the full
> igt-gpu-tools set running, so maybe they do something a little
> differently than my steps in [1]?
>
> Brian
>
> [1] I have a GLk system, if it matters. I figured I can run some of
> these with any one of the following:
>
>   modprobe i915 live_selftests=1
>   modprobe i915 live_selftests=1 igt__20__live_workarounds=Y
>   modprobe i915 live_selftests=1 igt__19__live_uncore=Y
>   modprobe i915 live_selftests=1 igt__18__live_sanitycheck=Y
>   ...


More information about the Intel-gfx mailing list