[Bug 110394] New: Skylake GPU HANG while gstreamer H264 vaapi encoding from MJPEG vaapi decode on drm-tip

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Wed Apr 10 17:41:17 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=110394

            Bug ID: 110394
           Summary: Skylake GPU HANG while gstreamer H264 vaapi encoding
                    from MJPEG vaapi decode on drm-tip
           Product: DRI
           Version: DRI git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: andy.nicholas at shield.ai
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 143919
  --> https://bugs.freedesktop.org/attachment.cgi?id=143919&action=edit
GPU crash info plus dmesg log

Similar to Bug #110297 which I filed.

The Skylake GPU hangs while encoding a video stream to H.264 using VAAPI. The
input is an MJPEG stream read from a file and decoded with VAAPI. We run test
loops that transcode this stream over and over, thousands of times. This GPU
hang happened on iteration 1026.
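
For anyone trying to reproduce this, the kind of test loop described above can
be sketched roughly as follows. This is not our actual harness; the function
name run_transcode_loop and its parameters are hypothetical, and the transcode
command itself is passed in by the caller rather than assumed here.

```python
import subprocess


def run_transcode_loop(cmd, iterations, timeout_s=None):
    """Run `cmd` up to `iterations` times.

    Returns the 1-based iteration number of the first run that exits
    non-zero or exceeds `timeout_s`, or None if every run succeeds.
    """
    for i in range(1, iterations + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # A pipeline that never finishes is treated as a failure too.
            return i
        if result.returncode != 0:
            return i
    return None
```

Calling run_transcode_loop(["gst-launch-1.0", ...], 5000, timeout_s=120) with
the real pipeline arguments filled in would report the first failing
iteration; the timeout value is only an example.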

Running on Intel Compute Stick STK2mV64CC. We have locked the minimum and
maximum clock speeds of the GPU to 500 MHz in an attempt to avoid this issue.
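
One way such a frequency lock can be applied is through the i915 sysfs files
gt_min_freq_mhz and gt_max_freq_mhz. The sketch below assumes those files live
under /sys/class/drm/card0 and that the script runs as root; the function name
lock_gpu_frequency is hypothetical, and this is not necessarily how the lock
was set on the machine in this report.

```python
from pathlib import Path


def lock_gpu_frequency(card_dir, mhz):
    """Write the same value to the i915 min and max frequency files.

    The write order is chosen so that min never exceeds max at any
    intermediate step, since the driver rejects such a setting.
    Returns the (min, max) values read back afterwards.
    """
    card = Path(card_dir)
    min_file = card / "gt_min_freq_mhz"
    max_file = card / "gt_max_freq_mhz"

    current_max = int(max_file.read_text().strip())
    if mhz > current_max:
        order = (max_file, min_file)  # raise the ceiling first
    else:
        order = (min_file, max_file)  # move the floor first, then the ceiling

    for f in order:
        f.write_text(f"{mhz}\n")

    return int(min_file.read_text().strip()), int(max_file.read_text().strip())
```

For example, lock_gpu_frequency("/sys/class/drm/card0", 500). The values read
back may differ from the request if the hardware rounds to a supported step.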

We are running this test because one of our products needs to have this exact
configuration: read an MJPEG stream from a V4L camera and transcode into H264.
This configuration needs to be super stable. Crashing once in 1026 iterations
is not considered "stable".

Using Ubuntu 18.04 with a drm-tip kernel from about three weeks ago, which
corresponds to 5.1-rc1.

Using GStreamer 1.14.1:

shield@tobeprovisioned1804:~$ gst-launch-1.0 --version
gst-launch-1.0 version 1.14.1
GStreamer 1.14.1
https://launchpad.net/distros/ubuntu/+source/gstreamer1.0

The full GPU hang log and dmesg are attached. This is related to Bug #110297,
which I previously filed.

Especially concerning is that the machine remains usable after this crash, but
the GPU seems dead. We would like a way to detect that the GPU has died and to
trigger a kernel panic so that the machine eventually reboots. Modifying the
kernel is A-OK to avoid this issue, so if Intel doesn't have a mechanism then I
will try to add something myself.

Leaving the machine in this "half dead" state is bad. We can't use the
gstreamer process termination as the "reboot the machine" trigger as we may
have other, less severe, bugs where we simply want to restart the gstreamer
process.
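
As a starting point for discussion, a userspace watchdog along the following
lines might cover the "detect and panic" case without tying it to the gstreamer
process. It is only a sketch under several assumptions: that the i915 error
state is readable at /sys/class/drm/card0/error, that the file reports
"No error state collected" when nothing has been captured, and that writing
"c" to /proc/sysrq-trigger as root is an acceptable way to force a panic. The
function names are hypothetical.

```python
import time
from pathlib import Path

# Assumed text of the i915 sysfs error file when no hang has been recorded.
NO_ERROR_MARKER = "No error state collected"


def gpu_error_recorded(error_path):
    """Return True if the error file holds a captured GPU error state."""
    try:
        text = Path(error_path).read_text(errors="replace").strip()
    except OSError:
        # File missing or unreadable: treat as "nothing recorded".
        return False
    return bool(text) and not text.startswith(NO_ERROR_MARKER)


def trigger_panic(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to crash via the sysrq 'c' command (requires root)."""
    Path(trigger_path).write_text("c")


def watch(error_path="/sys/class/drm/card0/error",
          trigger_path="/proc/sysrq-trigger",
          poll_s=5.0, max_polls=None):
    """Poll the error file; panic once an error state shows up.

    Returns True if a panic was triggered, False if `max_polls` ran out.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if gpu_error_recorded(error_path):
            trigger_panic(trigger_path)
            return True
        polls += 1
        time.sleep(poll_s)
    return False
```

Note that an error state is also captured for hangs the driver recovers from,
so this check is conservative and would reboot in cases where a reset
succeeded. For the panic to end in a reboot, the kernel.panic sysctl has to be
set to a non-zero timeout.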

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
You are the QA Contact for the bug.