[Intel-gfx] Possible 4.5 i915 Skylake regression
Andy Lutomirski
luto at amacapital.net
Sat Mar 12 02:16:22 UTC 2016
On Mon, Feb 22, 2016 at 7:13 PM, Andy Lutomirski <luto at amacapital.net> wrote:
> On Wed, Feb 17, 2016 at 5:36 PM, Andy Lutomirski <luto at amacapital.net> wrote:
>> On Wed, Feb 17, 2016 at 8:18 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
>>> On Tue, Feb 16, 2016 at 09:26:35AM -0800, Andy Lutomirski wrote:
>>>> On Tue, Feb 16, 2016 at 9:12 AM, Andy Lutomirski <luto at amacapital.net> wrote:
>>>> > On Tue, Feb 16, 2016 at 8:12 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
>>>> >> On Mon, Feb 15, 2016 at 06:58:33AM -0800, Andy Lutomirski wrote:
>>>> >>> On Sun, Feb 14, 2016 at 6:59 PM, Andy Lutomirski <luto at kernel.org> wrote:
>>>> >>> > Hi-
>>>> >>> >
>>>> >>> > On 4.5-rc3 on a Dell XPS 13 9350 (Skylake i915, no nvidia on this
>>>> >>> > model), shortly after resume, I saw a single black flash on the
>>>> >>> > screen. The log said:
>>>> >>> >
>>>> >>> > [Feb13 07:05] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
>>>> >>> >
>>>> >>> > I haven't seen this on 4.4.
>>>> >>> >
>>>> >>> > I'd be happy to dig up debugging info, but I don't know what would be
>>>> >>> > useful. I have no i915 module options set.
>>>> >>>
>>>> >>> It's flashing quite frequently now, although I seem to get the
>>>> >>> underrun warning only once per resume.
>>>> >>
>>>> >> We shut up the warning irq source to avoid hijacking an entire cpu core
>>>> >> ;-)
>>>> >>
>>>> >> There's a fix from Matt right after 4.5-rc4 in Linus' branch. I'm hoping
>>>> >> that should help.
>>>> >
>>>> > Do you mean:
>>>> >
>>>> > commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
>>>> > Author: Matt Roper <matthew.d.roper at intel.com>
>>>> > Date: Mon Feb 8 11:05:28 2016 -0800
>>>> >
>>>> > drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2)
>>>> >
>>>> > If so, it didn't help. I'm currently doing a full rebuild just in
>>>> > case I messed something up, though.
>>>> >
>>>>
>>>> Definitely not fixed. It seems to be okay after a reboot until the
>>>> first suspend/resume.
>>>>
>>>> This happened after resuming. Five cents says it's the root cause.
>>>
>>> That's interesting, but doesn't ring a bell unfortunately. Can you try to
>>> attempt a bisect?
>>
>> I probably can, but it's very slow. Is there a reasonably
>> straightforward way to instrument the watermark computation to see
>> what's going wrong? I'm reasonably confident that the bug is in the
>> resume code or in something that only happens on resume, since I still
>> haven't seen underruns after rebooting before suspending.
>>
>
> With some instrumentation applied, I got this:
>
> [ 369.471064] skl_update_wm(crtc-0): computed update
> [ 369.471072] skl_update_other_pipe_wm(crtc-0): no change
> [ 369.471075] skl_write_wm_values...
> [ 369.471078] CRTC crtc-0 pipe A
> [ 369.471083] wm_linetime = 121
> [ 369.471086] plane_wm level 0 plane 0 = 2147500036
> [ 369.471090] plane_wm level 0 plane 1 = 0
> [ 369.471094] plane_wm level 0 cursor = 2147500036
> [ 369.471097] plane_wm level 1 plane 0 = 2147516439
> [ 369.471101] plane_wm level 1 plane 1 = 0
> [ 369.471104] plane_wm level 1 cursor = 2147516439
> [ 369.471108] plane_wm level 2 plane 0 = 2147516448
> [ 369.471111] plane_wm level 2 plane 1 = 0
> [ 369.471115] plane_wm level 2 cursor = 0
> [ 369.471118] plane_wm level 3 plane 0 = 2147532837
> [ 369.471121] plane_wm level 3 plane 1 = 0
> [ 369.471125] plane_wm level 3 cursor = 0
> [ 369.471128] plane_wm level 4 plane 0 = 2147565639
> [ 369.471131] plane_wm level 4 plane 1 = 0
> [ 369.471135] plane_wm level 4 cursor = 0
> [ 369.471138] plane_wm level 5 plane 0 = 2147582038
> [ 369.471141] plane_wm level 5 plane 1 = 0
> [ 369.471145] plane_wm level 5 cursor = 0
> [ 369.471148] plane_wm level 6 plane 0 = 2147582044
> [ 369.471151] plane_wm level 6 plane 1 = 0
> [ 369.471155] plane_wm level 6 cursor = 0
> [ 369.471158] plane_wm level 7 plane 0 = 2147598443
> [ 369.471161] plane_wm level 7 plane 1 = 0
> [ 369.471164] plane_wm level 7 cursor = 0
> [ 369.471168] wm_trans plane 0 = 0
> [ 369.471171] wm_trans plane 1 = 0
> [ 369.471174] wm_trans cursor = 0
> [ 369.471182] CRTC crtc-1 pipe B
> [ 369.471184] clean
> [ 369.471186] CRTC crtc-2 pipe C
> [ 369.471189] clean
> [ 369.471226] skl_update_wm(crtc-0): no update
> [ 372.068755] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
> [ 373.399396] skl_update_wm(crtc-0): computed update
> [ 373.399408] skl_update_other_pipe_wm(crtc-0): no change
> [ 373.399413] skl_write_wm_values...
> [ 373.399418] CRTC crtc-0 pipe A
> [ 373.399426] wm_linetime = 121
> [ 373.399431] plane_wm level 0 plane 0 = 2147500036
> [ 373.399438] plane_wm level 0 plane 1 = 0
> [ 373.399443] plane_wm level 0 cursor = 16388
> [ 373.399449] plane_wm level 1 plane 0 = 2147516439
> [ 373.399455] plane_wm level 1 plane 1 = 0
> [ 373.399460] plane_wm level 1 cursor = 32791
> [ 373.399465] plane_wm level 2 plane 0 = 2147516448
> [ 373.399471] plane_wm level 2 plane 1 = 0
> [ 373.399476] plane_wm level 2 cursor = 0
> [ 373.399481] plane_wm level 3 plane 0 = 2147532837
> [ 373.399486] plane_wm level 3 plane 1 = 0
> [ 373.399491] plane_wm level 3 cursor = 0
> [ 373.399497] plane_wm level 4 plane 0 = 2147565639
> [ 373.399502] plane_wm level 4 plane 1 = 0
> [ 373.399557] plane_wm level 4 cursor = 0
> [ 373.399562] plane_wm level 5 plane 0 = 2147582038
> [ 373.399568] plane_wm level 5 plane 1 = 0
> [ 373.399573] plane_wm level 5 cursor = 0
> [ 373.399578] plane_wm level 6 plane 0 = 2147582044
> [ 373.399591] plane_wm level 6 plane 1 = 0
> [ 373.399596] plane_wm level 6 cursor = 0
> [ 373.399602] plane_wm level 7 plane 0 = 2147598443
> [ 373.399607] plane_wm level 7 plane 1 = 0
> [ 373.399612] plane_wm level 7 cursor = 0
> [ 373.399617] wm_trans plane 0 = 0
> [ 373.399623] wm_trans plane 1 = 0
> [ 373.399627] wm_trans cursor = 0
> [ 373.399638] CRTC crtc-1 pipe B
> [ 373.399642] clean
> [ 373.399646] CRTC crtc-2 pipe C
> [ 373.399650] clean
>
> The diff between those two dumps was:
>
> --- a.txt 2016-02-22 18:56:32.613058614 -0800
> +++ b.txt 2016-02-22 18:56:49.219079057 -0800
> @@ -3,10 +3,10 @@
> wm_linetime = 121
> plane_wm level 0 plane 0 = 2147500036
> plane_wm level 0 plane 1 = 0
> - plane_wm level 0 cursor = 2147500036
> + plane_wm level 0 cursor = 16388
> plane_wm level 1 plane 0 = 2147516439
> plane_wm level 1 plane 1 = 0
> - plane_wm level 1 cursor = 2147516439
> + plane_wm level 1 cursor = 32791
> plane_wm level 2 plane 0 = 2147516448
> plane_wm level 2 plane 1 = 0
> plane_wm level 2 cursor = 0
>
>
> The code that generated this is attached.
>
> Is the fix from Matt that you mentioned this one:
>
> commit e2e407dc093f530b771ee8bf8fe1be41e3cea8b3
> Author: Matt Roper <matthew.d.roper at intel.com>
> Date: Mon Feb 8 11:05:28 2016 -0800
>
> drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2)
>
> On brief inspection, it doesn't look like that patch will have any
> effect on Skylake.
>
Re-ping.
This is still broken AFAICT, and it's also not fixed in drm-intel-nightly.
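
For anyone reading the dumps quoted above, here is a rough sketch of how the raw plane_wm values can be split into fields. The bit layout (enable in bit 31, lines in bits 18:14, blocks in bits 9:0) is my assumption from memory of the SKL PLANE_WM definitions in i915_reg.h and is not shown anywhere in this thread; decode_wm is just a hypothetical helper name. Under that assumption, the two cursor values that changed between the dumps differ only in the enable bit:

```python
# Assumed SKL PLANE_WM / CUR_WM field layout (recalled from i915_reg.h,
# not taken from this thread; verify against the kernel source).
PLANE_WM_EN = 1 << 31
PLANE_WM_LINES_SHIFT = 14
PLANE_WM_LINES_MASK = 0x1F
PLANE_WM_BLOCKS_MASK = 0x3FF


def decode_wm(value):
    """Split a raw watermark register value into its assumed fields."""
    return {
        "enabled": bool(value & PLANE_WM_EN),
        "lines": (value >> PLANE_WM_LINES_SHIFT) & PLANE_WM_LINES_MASK,
        "blocks": value & PLANE_WM_BLOCKS_MASK,
    }


# Cursor values for levels 0 and 1 from the first and second dump.
samples = [
    ("level 0 cursor, first dump", 2147500036),
    ("level 0 cursor, second dump", 16388),
    ("level 1 cursor, first dump", 2147516439),
    ("level 1 cursor, second dump", 32791),
]

for label, value in samples:
    print(f"{label}: 0x{value:08x} {decode_wm(value)}")
```

If that layout is right, the lines and blocks fields are identical in both dumps and only the enable bit is cleared in the second one.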