[Bug 110297] New: GPU HANG when transcoding to H.264 using VAAPI on drm-tip

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Apr 1 07:17:43 UTC 2019


https://bugs.freedesktop.org/show_bug.cgi?id=110297

            Bug ID: 110297
           Summary: GPU HANG when transcoding to H.264 using VAAPI on
                    drm-tip
           Product: DRI
           Version: DRI git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: andy.nicholas at shield.ai
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Created attachment 143827
  --> https://bugs.freedesktop.org/attachment.cgi?id=143827&action=edit
GPU hang after transcoding with VAAPI

Hi, I used the drm-tip kernel to reproduce a serious problem we have when
transcoding video on an Intel Compute Stick (STK2MV64CC). Our product
experiments are always transcoding, so a GPU hang or crash is exceptionally
bad for us. Our testing group sporadically reports kernel crashes, so I set up
a test rig to reproduce the issue.

I reproduced the problem after running approximately 2000 transcodes of a
1920x1080 MP4 (Big Buck Bunny) from H.264 back to H.264 using GStreamer on
Ubuntu 18.04.2, with a DRM-TIP kernel from Kernel 5.1-rc6 (about 2 weeks
ago). I'm assuming the issue is reproducible and will continue to try to
reproduce it. In the meantime, I'm filing the bug now since this is urgent
for me.


[96339.653213] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on vcs0, vecs0
[96339.653215] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[96339.653216] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[96339.653217] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[96339.653218] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[96339.653220] [drm] GPU crash dump saved to /sys/class/drm/card0/error

The full dmesg output and the error state from /sys/class/drm/card0/error are
enclosed. The script I used to reproduce the bug is enclosed as well.
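
For anyone repeating this test, the crash dump named in the dmesg output can be
saved right after a hang with something like the sketch below. The function
name save_gpu_error_state and the output file naming are made up for
illustration. The check for "No error state collected" assumes that is the
placeholder text i915 returns when no hang has been recorded, and reading the
node normally requires root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the i915 error state to a timestamped file.
# $1: error node (defaults to the path printed in the dmesg output)
# $2: output directory (defaults to the current directory)
save_gpu_error_state() {
    local node="${1:-/sys/class/drm/card0/error}"
    local outdir="${2:-.}"
    local out
    out="$outdir/i915-error-$(date +%Y%m%d-%H%M%S).txt"

    # Nothing to save if the node is missing or unreadable.
    [ -r "$node" ] || return 1

    # Assumption: this placeholder text means no hang has been recorded.
    if grep -q "No error state collected" "$node"; then
        return 1
    fi

    # Copy the dump and print the name of the file that was written.
    cat "$node" > "$out" && echo "$out"
}
```

Calling save_gpu_error_state with no arguments reads the default node and
writes the copy into the current directory.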



My DRM-TIP kernel is from:

commit 00cb3798a5d008c3f824fe7c89c663dba66155c3 (HEAD -> drm-tip, origin/drm-tip, origin/HEAD)
Author: Rodrigo Vivi <rodrigo.vivi at intel.com>
Date:   Fri Mar 22 12:52:43 2019 -0700


These config switches were ADDED to DRM-TIP so I could boot from eMMC,
configure for lower kernel latency, and see serial output when the GPU goes
bonkers:

CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_FTDI_SIO=y
CONFIG_USB_PL2303=y
CONFIG_FRAME_POINTER=y
CONFIG_LATENCYTOP=y
CONFIG_MMC=y
CONFIG_MMC_BLOCK=y
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_SDHCI=y
CONFIG_MMC_SDHCI_PCI=y
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=y
CONFIG_DEBUG_INFO=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KEXEC_FILE=y
CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
CONFIG_KEXEC_JUMP=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
CONFIG_DRM_I915_DEBUG=y
CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y
CONFIG_USB_RTL8152=y
CONFIG_USB_NET_DRIVERS=y
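
To confirm that a built kernel really carries the =y switches above, a small
check along these lines may help. It is only a sketch: the function name
missing_config_options is made up, and the config file location in the usage
note is an assumption (Ubuntu usually installs /boot/config-<version>, but a
hand-built kernel may only have .config in its build tree).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every listed option that is not set to =y
# in the given kernel config file. Returns non-zero if any are missing.
# $1: path to the config file, remaining arguments: option names
missing_config_options() {
    local config="$1"; shift
    local opt missing=0
    for opt in "$@"; do
        # -x matches the whole line, -F treats the pattern as a fixed string
        if ! grep -qxF "${opt}=y" "$config"; then
            echo "$opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Example: missing_config_options "/boot/config-$(uname -r)" CONFIG_DRM_I915_DEBUG CONFIG_PREEMPT
prints nothing and returns 0 when both options are enabled.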


The transcoding loop is just this:

#!/usr/bin/env bash

set -ex

tcount=0
while true; do
    echo "Transcode: iteration $tcount"

    # remove old output (the same path the filesink writes to below)
    rm -f /tmp/gst-output.mp4

    # transcode big-buck-bunny.mp4 using gstreamer; the trailing
    # backslashes keep the pipeline on one logical line
    time gst-launch-1.0 filesrc location=big-buck-bunny.mp4 ! \
        qtdemux ! queue ! vaapidecodebin ! vaapih264enc ! qtmux ! \
        filesink location=/tmp/gst-output.mp4

    tcount=$((tcount+1))
done
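
A variant of this loop that stops by itself once the kernel logs a hang could
look like the sketch below, so the iteration count at the time of the hang is
known. This is not the script used for this report. The names gpu_hang_logged,
run_until_hang, and KLOG_CMD are made up for illustration, and the grep pattern
is taken from the "GPU HANG: ecode" line in the dmesg output above. Because
dmesg keeps older messages, a hang logged earlier in the same boot would stop
the loop after the first iteration.

```shell
#!/usr/bin/env bash
# Succeed if the kernel log text on stdin contains an i915 hang report.
gpu_hang_logged() {
    grep -q 'GPU HANG: ecode'
}

# Run the given command repeatedly until it fails or a hang is logged.
# KLOG_CMD is a hypothetical override for the log reader (default: dmesg).
run_until_hang() {
    local count=0
    while "$@"; do
        count=$((count + 1))
        if ${KLOG_CMD:-dmesg} | gpu_hang_logged; then
            echo "GPU hang logged after $count iterations"
            return 0
        fi
    done
    echo "command failed after $count successful iterations" >&2
    return 1
}
```

The gst-launch-1.0 pipeline can be wrapped in a shell function and passed as
the command, for example: run_until_hang transcode_once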
