<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><span class="vcard"><a class="email" href="mailto:arkadiusz.hiler@intel.com" title="Arek Hiler <arkadiusz.hiler@intel.com>"> <span class="fn">Arek Hiler</span></a>
</span> changed
          <a class="bz_bug_link 
          bz_status_REOPENED "
   title="REOPENED - [CI] igt@kms_flip@* - fail - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=105702">bug 105702</a>
          <br>
             <table border="1" cellspacing="0" cellpadding="8">
          <tr>
            <th>What</th>
            <th>Removed</th>
            <th>Added</th>
          </tr>

         <tr>
           <td style="text-align:right;">CC</td>
           <td>
                
           </td>
           <td>arkadiusz.hiler@intel.com
           </td>
         </tr></table>
      <p>
        <div>
            <b><a class="bz_bug_link 
          bz_status_REOPENED "
   title="REOPENED - [CI] igt@kms_flip@* - fail - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=105702#c11">Comment # 11</a>
              on <a class="bz_bug_link 
          bz_status_REOPENED "
   title="REOPENED - [CI] igt@kms_flip@* - fail - Failed assertion: (drmWaitVBlank(drm_fd, &wait)) == 0"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=105702">bug 105702</a>
              from <span class="vcard"><a class="email" href="mailto:arkadiusz.hiler@intel.com" title="Arek Hiler <arkadiusz.hiler@intel.com>"> <span class="fn">Arek Hiler</span></a>
</span></b>
<pre>This is an old one that we haven't seen for 3.5 months. Prior to that the
reproduction rate varied quite a bit. Shortly before it disappeared it was seen
about once every two weeks. This means we can close the bug after 20 weeks
(~= 5 months) without a reproduction, so around mid August 2019.

As for what happens, the code explains it pretty well (from
kms_flip.c/calibrate_ts()):

        memset(&wait, 0, sizeof(wait));
        wait.request.type = kmstest_get_vbl_flag(crtc_idx);
        wait.request.type |= DRM_VBLANK_RELATIVE | DRM_VBLANK_NEXTONMISS;
        do_or_die(drmWaitVBlank(drm_fd, &wait));

        last_seq = wait.reply.sequence;
        last_timestamp = wait.reply.tval_sec;
        last_timestamp *= 1000000;
        last_timestamp += wait.reply.tval_usec;

        memset(&wait, 0, sizeof(wait));
        wait.request.type = kmstest_get_vbl_flag(crtc_idx);
        wait.request.type |= DRM_VBLANK_ABSOLUTE | DRM_VBLANK_EVENT;
        wait.request.sequence = last_seq;
        for (n = 0; n < CALIBRATE_TS_STEPS; n++) {
                drmVBlank check = {};

                ++wait.request.sequence;
                do_or_die(drmWaitVBlank(drm_fd, &wait));

                /* Double check that haven't already missed the vblank */
                check.request.type = kmstest_get_vbl_flag(crtc_idx);
                check.request.type |= DRM_VBLANK_RELATIVE;
                do_or_die(drmWaitVBlank(drm_fd, &check));

                igt_assert(!igt_vblank_after(check.reply.sequence,
                                             wait.request.sequence));
        }

So we wait for the beginning of the next vblank to get its sequence number
(NEXTONMISS), then for each of the CALIBRATE_TS_STEPS iterations we wait for the
very next vblank (DRM_VBLANK_ABSOLUTE, ++wait.request.sequence) and double-check
that we have not already missed it.
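
The double check relies on comparing 32-bit vblank sequence numbers in a way
that survives counter wraparound. A minimal sketch of that idea, assuming the
usual signed-difference trick; seq_after() and missed_vblank() are hypothetical
names used for illustration, not the IGT helpers themselves:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if sequence a is strictly after b, treating
 * the 32-bit counter as circular. Assumes the common two's complement
 * behaviour when converting the unsigned difference to int32_t. */
static inline bool seq_after(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) > 0;
}

/* Hypothetical helper mirroring the assert in the loop: the wait was
 * missed if the current sequence is already past the requested one. */
static inline bool missed_vblank(uint32_t current_seq, uint32_t requested_seq)
{
        return seq_after(current_seq, requested_seq);
}
```

With these, missed_vblank(101, 100) is true and missed_vblank(100, 100) is
false, and the comparison keeps working when the counter wraps from 0xffffffff
to 0.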

drmWaitVBlank(drm_fd, &wait) seems to be failing with errno set to EBUSY:
    do {
       ret = ioctl(fd, DRM_IOCTL_WAIT_VBLANK, vbl);
       vbl->request.type &= ~DRM_VBLANK_RELATIVE;
       if (ret && errno == EINTR) {
           clock_gettime(CLOCK_MONOTONIC, &cur);
           /* Timeout after 1s */
           if (cur.tv_sec > timeout.tv_sec + 1 ||
               (cur.tv_sec == timeout.tv_sec && cur.tv_nsec >=
                timeout.tv_nsec)) {
                   errno = EBUSY;
                   ret = -1;
                   break;
           }
       }
    } while (ret && errno == EINTR);

out:
    return ret;
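
To make the timeout condition easier to reason about, here is a minimal sketch
that isolates it in a pure function. It copies the comparison quoted above as
is; wait_timed_out() is a hypothetical name, and the sketch assumes that
"timeout" holds the deadline (time of entry plus one second), which is set up
outside the quoted snippet.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: same comparison as in the quoted retry loop.
 * Assumption: *timeout is the deadline (entry time + 1s). */
static inline bool wait_timed_out(const struct timespec *cur,
                                  const struct timespec *timeout)
{
        return cur->tv_sec > timeout->tv_sec + 1 ||
               (cur->tv_sec == timeout->tv_sec &&
                cur->tv_nsec >= timeout->tv_nsec);
}
```

For a deadline of 10s + 500000000ns, a current time of 10s + 499999999ns has
not timed out, while 10s + 500000000ns has. Note that in the quoted loop this
check only runs when the ioctl was interrupted (EINTR) and is about to be
retried.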


Seems like we are missing the vblank with the given seq and then bailing out
after 1s with EBUSY, which sounds rather serious and may point to a bug in the
kernel, as this amount of code should not take too long even with unfavorable
scheduling. Let's see whether it reappears, but hopefully it was "accidentally
fixed."</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the QA Contact for the bug.</li>
          <li>You are on the CC list for the bug.</li>
          <li>You are the assignee for the bug.</li>
      </ul>
    </body>
</html>