[Intel-gfx] ✗ Fi.CI.IGT: warning for series starting with [1/3] drm/i915: Check if the stolen memory "reserved" area is enabled or not

Fri Nov 3 11:14:28 UTC 2017

On 03/11/17 12:41, Ville Syrjälä wrote:
> On Fri, Nov 03, 2017 at 10:18:32AM +0200, Tomi Sarvela wrote:
>> On 02/11/17 19:08, Ville Syrjälä wrote:
>>> On Thu, Nov 02, 2017 at 04:34:26PM -0000, Patchwork wrote:
>>>>
>>>> Test kms_busy:
>>>>           Subgroup extended-modeset-hang-oldfb-with-reset-render-A:
>>>>                   dmesg-warn -> PASS       (shard-hsw)
>>>>           Subgroup extended-modeset-hang-newfb-with-reset-render-B:
>>>>                   pass       -> DMESG-WARN (shard-hsw)
>>>
>>> Hmm. The warn was there already AFAICS. I wonder why this is claiming
>>> things were passing?
>>
>> The sharded result for the run is 'warning'. What do you mean?
> 
> This should read 'DMESG-WARN -> DMESG-WARN' rather than 'pass ->
> DMESG-WARN'. If I click the link to open up the results in the browser
> this test isn't shown on the shards.html, but I can see it as
> orange->orange in shards-all.html.

I'll see what has happened there. Clearly a bug.

>>> Also shard-glkb didn't seem to get any results from this run. No idea
>>> why, nor why this summary fails to mention that fact.
>>
>> One GLK died, and 4 GLKs take nearly 90 minutes to run shards. It's not
>> reasonable to wait for them on every run and queue everything else, so
>> Patchwork/Trybot runs are skipped on GLK. Anything can be run and added
>> to results manually, if needed.
> 
> If they aren't being run, then maybe they shouldn't be part of the
> shards.html for pw/trybot runs? Would save me having to wonder why I'm
> getting empty results. Or we should have some kind of indication why we
> didn't get any results for a particular machine (ie. whether it was
> expected or not).

Because of people, we can't always be sure whether a missing machine is
expected or not. There is no hard reservation system, and machines are
taken out of the farm for debugging, BIOS upgrades, that kind of stuff.

Dropping a shard from shards.html if there are no comparable results
(missing either the baseline or the patchset run) would probably be a
good thing.
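
A minimal sketch of that filtering, to make the idea concrete. This is
not the actual CI code: the function name, the dict layout and the
sample data are all made up for illustration.

```python
def comparable_shards(baseline, patchset):
    """Return the shard names that have results in BOTH runs.

    baseline / patchset: hypothetical dicts mapping a shard name to a
    dict of {test name: status}. A shard with no entry, or with an
    empty result dict, is treated as "no results".
    """
    return sorted(
        name
        for name in baseline.keys() & patchset.keys()
        if baseline[name] and patchset[name]
    )


# Illustrative data only: glkb has a baseline but no patchset results.
baseline = {
    "shard-hsw": {"some-test": "pass"},
    "shard-glkb": {"some-test": "pass"},
}
patchset = {
    "shard-hsw": {"some-test": "pass"},
    "shard-glkb": {},
}

# Only shards present in both runs would be shown in shards.html.
shown = comparable_shards(baseline, patchset)
```

With this input only shard-hsw is kept, so an unrun machine would not
show up as an empty column.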

>> As a comparison, SNB/HSW/APL all run shards in ~35 minutes. KBL is the
>> only one that gets rebooted between shards (due to leaking context), and
>> takes around 50 minutes.
>>
>>> Oh and BTW the boot/dmesg links from the shard results don't seem to
>>> work very well. Sometimes it just gets you an empty log and you have
>>> to manually find a file that has some actual content in it.
>>
>> If there is no bootlog for a run, the host has not booted between runs.
>> It's maybe not intuitively clear. I've tried to communicate that the
>> shard runs are shifting to conditional reboots (booted only if hung).
> 
> I don't know how that relates to the several dmesg/boot logs I see
> in the directory. And why the web links often seem to point to the
> empty files instead of ones with actual content in them.

I could link to the original bootlog, but that doesn't answer the
question "what happened just before this shard was run".

Sometimes there's leftover dmesg between shardruns, which gets recorded
in boot.log. The earlier name was "dmesg_before_run.log" or something
like that.

> Also the files are numbered in some way. Whether there's any significance
> to those numbers I can't really tell.

Shards are fed to testhosts from 0 to 34, in linear order. Filenames
carry that run number. On any one host, a smaller number has been run
before a larger number. On shardruns the number is used to choose the
shard testlist in our IGT package (./shards/x0000).
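
As a rough sketch of that mapping: the only name given above is
./shards/x0000, so the "x" plus four zero-padded digits pattern is an
assumption extrapolated from that one example, and the helper itself
is hypothetical.

```python
from pathlib import PurePosixPath

NUM_SHARDS = 35  # shards are numbered 0 to 34


def shard_testlist(run_number, root="shards"):
    """Map a shard run number to its testlist path.

    Assumption: testlists are named 'x' + four zero-padded digits,
    guessed from the single example name x0000.
    """
    if not 0 <= run_number < NUM_SHARDS:
        raise ValueError(f"shard number out of range: {run_number}")
    return PurePosixPath(root) / f"x{run_number:04d}"


first = str(shard_testlist(0))   # "shards/x0000"
last = str(shard_testlist(34))   # "shards/x0034", if the pattern holds
```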

We could also do several runs with the same kernel and testlist, but
cibuglog can't yet handle the statistics for this kind of result.

>>> After going through the dmesgs for all the other machines we have in ci,
>>> it doesn't look like there were any other changes in the amount of stolen
>>> memory we detect (well, couldn't check shard-glkb due to lack of results).
>>
>> There is GLK in Farm1, so you can check the bootlogs from fast-feedback
>> run if that part is what you're interested in. All the other gens too.
> 
> I did check all the BAT runs. But I have no idea if the glks there are
> the same board as what we have as glkb in the shard. So not sure if I
> actually checked all the N machine types we have, or just N-1.

Shards have two different boards, and one of those is in Farm1 too.
Farm1 has a different eDP panel, though.

Tomi
-- 
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo