[Bug 103339] [BAT] igt at pm_rpm@basic-pci-d3-state - incomplete - timeout/system hang

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Fri Oct 27 06:54:06 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=103339

--- Comment #3 from Marta Löfstedt <marta.lofstedt at intel.com> ---
(In reply to Rodrigo Vivi from comment #2)
> Was this one time thing randomly or is this happening always now?
> 
It is random.

Here is a description of how you can retrieve this information yourself, so
you don't have to wait for a reply in the future.

From the Intel GFX CI top page: https://intel-gfx-ci.01.org/
The CNL machine is in farm1, running the IGT fast-feedback testlist. To get
a visual overview of the current issues, check out "DRM-Tip - Fast: issues":
https://intel-gfx-ci.01.org/tree/drm-tip/
We are looking for fi-cnl-y. When I checked just now, the last 5 runs were
green, so yes, this issue appears to be random. If you click the fi-cnl-y
label, you will see a longer history of the results for fi-cnl-y. Here I see 5
purple spots indicating incomplete test results. If you click those, you get
to the piglit-generated result pages:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3218/fi-cnl-y/igt@gem_ctx_switch@basic-default.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3271/fi-cnl-y/igt@gem_exec_flush@basic-batch-kernel-default-uc.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3259/fi-cnl-y/igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3283/fi-cnl-y/igt@kms_pipe_crc_basic@read-crc-pipe-b.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3258/fi-cnl-y/igt@pm_rpm@basic-pci-d3-state.html

The first 2 have pstore-generated panic/oops logs. In this case the panic was
generated by a BUG_ON; that issue is filed as bug 102035 and not on this bug.

The last 3 don't have any logs, so either the machine ended up in a state in
which none of our watchdog systems could trigger (i.e. a system hang), or the
run was timed out externally by the Jenkins system. When I find such issues I
file a new bug for each machine, and when I get a new occurrence in cibuglog I
open the bug and add the new information. However, in this case it looks
like I have missed
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3259/fi-cnl-y/igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence.html.
But if we look at the dmesg from that run:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3259/fi-cnl-y/dmesg0.log the
last message is: <2>[  471.997559] watchdog: watchdog0: watchdog did not stop!
We believe this is due to a flaw in our softdog daemon, owatch, and it is filed
as bug 102332.
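
The manual steps above (pick a result page, find the dmesg for that run, look
at its last message) could be scripted roughly as follows. This is only a
sketch: the URL layout is inferred from the links in this comment, the helper
names are made up for illustration, and the "index" parameter assumes other
dmesg files follow the dmesg0.log naming. Nothing here is an official CI tool.

```python
from urllib.parse import urlsplit


def parse_result_url(url):
    """Split a CI result-page URL into build, machine, test and subtest.

    Assumed path layout (inferred from the links in this comment):
    /tree/drm-tip/<build>/<machine>/igt@<test>@<subtest>.html
    """
    parts = urlsplit(url).path.strip("/").split("/")
    build, machine, page = parts[-3], parts[-2], parts[-1]
    name = page[:-len(".html")] if page.endswith(".html") else page
    _, test, subtest = name.split("@", 2)
    return {"build": build, "machine": machine, "test": test, "subtest": subtest}


def dmesg_url(result_url, index=0):
    """Derive the dmesg log URL that sits next to a result page."""
    base = result_url.rsplit("/", 1)[0]
    return f"{base}/dmesg{index}.log"


def last_dmesg_message(dmesg_text):
    """Return the last non-empty line of a dmesg log, or '' if there is none."""
    lines = [line for line in dmesg_text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def watchdog_did_not_stop(dmesg_text):
    """True if the log ends with the 'watchdog did not stop!' message."""
    return "watchdog did not stop!" in last_dmesg_message(dmesg_text)
```

Fetching the log itself is left out on purpose; once you have the text of
dmesg0.log, watchdog_did_not_stop() tells you whether the run ended like
CI_DRM_3259 did.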

You can also get a lot of information from the cibuglog page:
https://intel-gfx-ci.01.org/cibuglog/
If you are interested in fi-cnl-y, find a link for it in the "Affected machines"
column:
https://intel-gfx-ci.01.org/cibuglog/index.html%3Faction_failures_history=-1&failures_machine=fi-cnl-y.html
and you will see all issues that have occurred since CI_DRM_3121.
Note that once a failing test has been reported on a machine and a bug,
subsequent results from that test/machine combination are suppressed, both
from affecting pre-merge results and from new reporting to cibuglog, i.e. I
will not see whether, for example, an incomplete is reproduced or not. If you
look at the entry for this bug in cibuglog, it currently says 2 / 23 runs (9 %)
in the "Failure rate" column. In this case that is true; this issue has only
happened twice. However, if one of the reported tests had failed instead of
ending as incomplete, there would be 3 / 23 runs. Martin is currently fixing
this so that the reason a test failed is taken into account.
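
For reference, the "Failure rate" figure is just failures divided by runs. A
minimal sketch of the arithmetic, assuming the percentage is rounded to the
nearest integer (I don't know how cibuglog actually rounds; this merely
reproduces the 9 % shown, and failure_rate is a made-up helper name):

```python
def failure_rate(failures, runs):
    """Format a failure rate in the style of the 'Failure rate' column.

    Assumption: the percentage is rounded to the nearest whole number.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    pct = round(100 * failures / runs)
    return f"{failures} / {runs} runs ({pct} %)"


print(failure_rate(2, 23))  # the current entry for this bug
print(failure_rate(3, 23))  # if the third occurrence had been counted
```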

> I believe it is the same I'm starting to see with testdisplay with multiple
> monitors connected.
> 
> do you have more than 1 monitor plugged on this cnl?
From our top page there is a link to our hardware description page:
https://intel-gfx-ci.01.org/tree/drm-tip/hardware.html

fi-cnl-y        Intel Cannonlake-Y RVP  Cannonlake              eDP, (DP, HDMI)

So, this machine only has an eDP panel connected. Parentheses indicate
connectors that are not connected.

> 
> if this is always happening now, do we have a good day / a good bisect point?
> 
> Thanks,
> Rodrigo.
