[Bug 109088] [CI][DRMTIP]igt at kms_chv_cursor_fail@pipe-c-128x128-top-edge - Fail - Failed assertion: igt_ioctl((fd), ((((2U|1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0xB2)) << 0) | ((((sizeof(struct drm_mode_create_dumb)))) << ((0+8)+8)))), (&create)) == 0

Mon Sep 9 20:56:16 UTC 2019

https://bugs.freedesktop.org/show_bug.cgi?id=109088

Matt Roper <matthew.d.roper at intel.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |martin.peres at free.fr,
                   |                            |matthew.d.roper at intel.com

--- Comment #8 from Matt Roper <matthew.d.roper at intel.com> ---
Ultimately the failure here originates on the test side rather than the driver
side.  The immediate cause of failure is pretty simple: IGT's fb creation
routine is calling DRM_IOCTL_MODE_CREATE_DUMB asking for a framebuffer with
dimensions 0x0, which is illegal and rejected by the kernel.  However, why
that's happening is a bit more complicated.  A common pattern in our IGT tests
when we decide to utilize a specific output is to call igt_output_get_mode() to
get the size of the display and then immediately create a framebuffer of that
size (e.g., via igt_create_color_fb).  However, if the output is considered
disconnected by the time we get to this point in the test, the output's mode
list and default mode will have been zeroed out, so mode->hdisplay and
mode->vdisplay will be 0, which ultimately leads to the failure here during fb
creation.
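
To make the sequence concrete, here is a rough standalone model of it.  It does
not call IGT or the kernel: every sketch_* name below is a hypothetical
stand-in, and the only behavior carried over from the real code is that a dumb
buffer request with zero width or height is rejected with -EINVAL.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two mode fields that matter here. */
struct sketch_mode {
	uint16_t hdisplay;
	uint16_t vdisplay;
};

/* Hypothetical stand-in for an output whose mode is cleared on disconnect. */
struct sketch_output {
	int connected;
	struct sketch_mode mode;
};

/* Models a connector update that marks the output disconnected. */
void sketch_disconnect(struct sketch_output *output)
{
	output->connected = 0;
	memset(&output->mode, 0, sizeof(output->mode));
}

/* Models the kernel rejecting a zero-sized dumb buffer request. */
int sketch_create_dumb(uint32_t width, uint32_t height)
{
	if (width == 0 || height == 0)
		return -EINVAL;
	return 0;
}

/* The pattern from the tests: read the mode, then create an fb of that size. */
int sketch_create_fb_for_output(const struct sketch_output *output)
{
	const struct sketch_mode *mode = &output->mode;

	return sketch_create_dumb(mode->hdisplay, mode->vdisplay);
}
```

With a connected output the creation call succeeds; once sketch_disconnect()
has run, the same call sees 0x0 and fails, which mirrors the assertion in the
bug title.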

There are a lot of different tests that follow this pattern (and thus are
susceptible to this bug), and in the dmesg logs I looked at I always saw
something that was updating connector state during the test -- either hotplug
events or temporary failures to get DPCD responses from DP++ adapters.  I
assume nobody is actually plugging/unplugging the cables while these tests are
running, so this may just be random fallout of faulty or loose cables on the CI
machines.

We probably need to write our tests more defensively to avoid situations like
this.  We know that 0x0 isn't a valid framebuffer size that we actually want to
create, so one solution might be to just force a test skip in
create_bo_for_fb() any time we see a 0x0 size requested, under the assumption
that we only get to this point because the calling test lost its monitor but
didn't notice.  That seems like kind of an indirect solution to the problem
here, so I'm not super happy with it.  This really feels like more of an
igt_kms wrapper design issue (which is something that I'm still not 100%
familiar with despite having worked with KMS IGTs for a long time).  Finally,
we could write more defensive logic into each individual test to account for
hot-unplugged monitors and such at the point we go to create fbs.
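
As a rough illustration of the check that the first and last options share,
something along these lines could sit in front of fb creation.  This is only a
standalone sketch: the guard_* names are hypothetical, and a real version would
use IGT's own skip mechanism rather than returning an enum.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two mode fields that matter here. */
struct guard_mode {
	uint16_t hdisplay;
	uint16_t vdisplay;
};

/* Hypothetical outcome type; a real test would skip instead. */
enum guard_result {
	GUARD_FB_CREATED,
	GUARD_SKIP,
};

/* A mode is only usable for fb creation if both dimensions are non-zero. */
bool guard_mode_is_usable(const struct guard_mode *mode)
{
	return mode != NULL && mode->hdisplay != 0 && mode->vdisplay != 0;
}

/* Check the mode before creating an fb instead of trusting it blindly. */
enum guard_result guard_create_fb_checked(const struct guard_mode *mode)
{
	if (!guard_mode_is_usable(mode))
		return GUARD_SKIP; /* assume the output went away mid-test */

	/* A real test would create the framebuffer here. */
	return GUARD_FB_CREATED;
}
```

Whether this check belongs in create_bo_for_fb(), in the igt_kms wrapper, or in
each test is exactly the open question; the sketch only shows the condition
itself.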

I think it would be good to have someone on the CI team (or an expert on the
igt_kms wrapper library) make a judgment call on what the proper path forward
is on this.  @Martin, any thoughts?
