renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
Michel Dänzer
michel.daenzer at mailbox.org
Tue Jan 3 18:04:00 UTC 2023
On 12/21/22 23:02, Brian Norris wrote:
> Hi Mark, Sean, (and dri-devel)
>
> On Wed, Dec 14, 2022 at 07:04:37PM -0800, Brian Norris wrote:
>> On Tue, Dec 13, 2022 at 04:51:11PM +0000, Mark Brown wrote:
>>> On Tue, Dec 13, 2022 at 05:56:30AM -0800, KernelCI bot wrote:
>>>
>>> The KernelCI bisection bot found regressions in at least two KMS tests
>>> in the Renesas tree on rk3399-gru-kevin just after the Renesas tree
>>> merged up mainline:
>>>
>>> igt-kms-rockchip.kms_vblank.pipe-A-wait-forked
>>> igt-kms-rockchip.kms_vblank.pipe-A-query-busy
>>>
>>> which it bisected to ca871659ec16 ("drm/bridge: analogix_dp: Support
>>> PSR-exit to disable transition"). I'm not *100%* sure I trust the
>>> bisection but it sure is suspicous that two separate bisects for related
>>> issues landed on the same commit.
>
> ...
>
>>> | IGT-Version: 1.26-gf8a4a0b (aarch64) (Linux: 6.1.0 aarch64)
>>> | <14>[ 35.444448] [IGT] drm_read: starting subtest short-buffer-wakeup
>>> | Starting subtest: short-buffer-wakeup
>>> |
>>> | (| drm_read:350) CRITICAL: Test assertion failure function generate_event, file ../tests/drm_read.c:65:
>>> | (drm_read:350) CRITICAL: <14>[ 36.155642] [IGT] drm_read: exiting, ret=98
>>> | Failed assertion: kmstest_get_vblank(fd, pipe, DRM_VBLANK_EVENT)
>>> |
>>> | (drm_read:350) CRITICAL: Last errno: 22, Invalid argument
>>> | Stack trace:
>>> |
>>> | #0 ../lib/igt_core.c:1933 __igt_fail_assert()
>>> | #1 [<unknown>+0xd5362770]
>>> | #2 [<unknown>+0xd536193c]
>>> | #3 [__libc_start_main+0xe8]
>>> | #4 [<unknown>+0xd5361974]
>>> | #5 [<unknown<6>[ 36.162851] Console: switching to colour frame buffer device 300x100>+0xd5361974]
>>> | Subtest short-buffer-wakeup failed.
>
> ...
>
>> I'll look some more, but this might be a difference of test setup, such
>> that my setup has the issue before and after that commit, but your setup
>> sees a regression.
>
> I believe this is the case; things are broken both before and after, but
> depending on test sequencing, etc., they might have seemed less broken
> before.
>
> When we're failing, we're getting EINVAL from drm_vblank_get(). That
> essentially means that vblank has been disabled (drm_crtc_vblank_off()).
> As it happens, this is exactly what we do when we enter PSR (Panel Self
> Refresh).
>
> Now, these test cases (especially 'kms_vblank.pipe-A-wait-forked') seem
> to display a static image, and then wait for a bunch of vblank events.
> This is exactly the sort of case where we should enter PSR, and so we're
> likely to disable vblank events. Thus, if everything is working right,
> we will often hit EINVAL in ioctl(DRM_IOCTL_WAIT_VBLANK) ... ->
> drm_vblank_get(), and fail the test.
>
> As to why this appears to be a regression: like mentioned in my previous
> mail, these tests tend to hose the Analogix PSR state before my patch
> set. When PSR is hosed, we tend to stay with PSR disabled, and so
> drm_vblank_get() doesn't return EINVAL, and the test is happy. (Never
> mind that the display is likely non-functional.) [0]
>
> So how to fix this?
>
> 1. ignore it; it's "just" a test failure, IIUC [1]
> 2. change the test, to skip this test if the device has PSR
> 3. leave vblank enabled even in the presence of PSR
> 4. otherwise modify the vblank ioctl interface, such that one can "wait"
> for a vblank that won't come (because the display is in self-refresh
> / there are no new frames or vblanks)
FWIW, a couple more alternatives:
5. Go/stay out of PSR while vblank interrupts are enabled (probably want to make sure vblankoffdelay is set up such that vblank interrupts are disabled ASAP)
6. Use fallback timers for vblank events while in PSR (there might be some infrastructure for this, since some drivers don't have real vblank interrupts)
> [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI
> contract in the presence of PSR, but I believe the upstream PSR
> support has always worked this way. I could imagine
> DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here
> though.
Yeah, that's pretty likely to cause issues with existing real-world user space.
--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer
More information about the dri-devel
mailing list