[Bug 110297] New: GPU HANG when transcoding to H.264 using VAAPI on drm-tip
bugzilla-daemon at freedesktop.org
Mon Apr 1 07:17:43 UTC 2019
https://bugs.freedesktop.org/show_bug.cgi?id=110297
Bug ID: 110297
Summary: GPU HANG when transcoding to H.264 using VAAPI on
drm-tip
Product: DRI
Version: DRI git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/Intel
Assignee: intel-gfx-bugs at lists.freedesktop.org
Reporter: andy.nicholas at shield.ai
QA Contact: intel-gfx-bugs at lists.freedesktop.org
CC: intel-gfx-bugs at lists.freedesktop.org
Created attachment 143827
--> https://bugs.freedesktop.org/attachment.cgi?id=143827&action=edit
GPU hang after transcoding with VAAPI
Hi, I used the drm-tip kernel to reproduce a serious problem we have when
transcoding video on an Intel Compute Stick (STK2MV64CC). For our product
experiments we are always transcoding, so a GPU hang or crash is
exceptionally bad for us. We have sporadic reports from our testing group of
kernel crashes, so I set up a test rig to reproduce the issue.
I reproduced the problem after running approximately 2000 transcodes of a
1920x1080 mp4 (Big Buck Bunny) from H.264 back to H.264 using GStreamer on
Ubuntu 18.04.2, but the kernel was drm-tip from kernel 5.1-rc6 (about 2 weeks
ago). I'm assuming the issue is reproducible and will continue to try to
reproduce it. In the meantime, I'm filing the bug since time is urgent for
me.
[96339.653213] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on vcs0, vecs0
[96339.653215] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[96339.653216] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[96339.653217] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[96339.653218] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[96339.653220] [drm] GPU crash dump saved to /sys/class/drm/card0/error
The full dmesg output and the log from /sys/class/drm/card0 are enclosed, as
is the script I used to reproduce the bug.
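For anyone else reproducing this, here is a minimal sketch of how the two logs could be saved right after a hang. The function name and output file names are hypothetical (they are not taken from the attachment), and the default error node is simply the path printed in the dmesg output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy the i915 error state and the kernel log into a
# directory. The second argument is optional and defaults to the path that
# the driver reported in dmesg.
collect_gpu_hang_artifacts() {
    local out_dir="$1"
    local error_node="${2:-/sys/class/drm/card0/error}"

    mkdir -p "$out_dir" || return 1

    # The error node is a virtual file, so stream it with cat.
    cat "$error_node" > "$out_dir/i915-error-state.txt" || return 1

    # dmesg may need root (kernel.dmesg_restrict=1); leave a marker if it fails.
    dmesg > "$out_dir/dmesg.txt" 2>/dev/null \
        || echo "dmesg unavailable" > "$out_dir/dmesg.txt"
}

# Example (reading the error node typically requires root):
#   collect_gpu_hang_artifacts /tmp/gpu-hang-logs
```

Saving the error state soon after the hang is probably wise, since it lives in kernel memory and is lost on reboot.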
My DRM-TIP kernel is from:
commit 00cb3798a5d008c3f824fe7c89c663dba66155c3 (HEAD -> drm-tip,
origin/drm-tip, origin/HEAD)
Author: Rodrigo Vivi <rodrigo.vivi at intel.com>
Date: Fri Mar 22 12:52:43 2019 -0700
These config switches were ADDED to DRM-TIP so I could boot from eMMC and
configure for lower kernel latency and see serial output when the GPU goes
bonkers:
CONFIG_USB_SERIAL=y
CONFIG_USB_SERIAL_CONSOLE=y
CONFIG_USB_SERIAL_FTDI_SIO=y
CONFIG_USB_PL2303=y
CONFIG_FRAME_POINTER=y
CONFIG_LATENCYTOP=y
CONFIG_MMC=y
CONFIG_MMC_BLOCK=y
CONFIG_MMC_BLOCK_MINORS=8
CONFIG_MMC_SDHCI=y
CONFIG_MMC_SDHCI_PCI=y
CONFIG_MMC_RICOH_MMC=y
CONFIG_MMC_SDHCI_ACPI=y
CONFIG_DEBUG_INFO=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KEXEC_FILE=y
CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
CONFIG_KEXEC_JUMP=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
CONFIG_DRM_I915_DEBUG=y
CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y
CONFIG_USB_RTL8152=y
CONFIG_USB_NET_DRIVERS=y
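To confirm that a test kernel was really built with switches like these, something along the following lines could be used. This is only a sketch: the function name is made up, and the config file location in the example (/boot/config-$(uname -r)) is an assumption that holds on Ubuntu-style installs but not necessarily for a hand-built kernel.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "CONFIG_X=value" argument that does not
# appear as an exact line in the given kernel config file.
missing_config_options() {
    local config_file="$1"
    shift
    local opt
    for opt in "$@"; do
        # -x matches the whole line, -F treats the option as a fixed string.
        grep -qxF "$opt" "$config_file" || echo "$opt"
    done
}

# Example (config path is an assumption, see above):
#   missing_config_options "/boot/config-$(uname -r)" \
#       CONFIG_DRM_I915_DEBUG=y CONFIG_PREEMPT=y CONFIG_MMC_BLOCK_MINORS=8
```

An empty result means every listed option is present with the expected value.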
The transcoding loop is just this:
#!/usr/bin/env bash
set -ex
tcount=0
while true; do
    echo "Transcode: iteration $tcount"
    # remove old output (same path the filesink writes to below)
    rm -f /tmp/gst-output.mp4
    # transcode big-buck-bunny.mp4 using gstreamer
    time gst-launch-1.0 filesrc location=big-buck-bunny.mp4 ! \
        qtdemux ! queue ! vaapidecodebin ! vaapih264enc ! qtmux ! filesink \
        location=/tmp/gst-output.mp4
    tcount=$((tcount+1))
done
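When a loop like the one above runs unattended, it may help to stop at the first hang so the iteration count and logs are preserved. The sketch below is not part of the attached script; the function name is hypothetical and the match string is taken from the "GPU HANG: ecode" line in the dmesg output quoted earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) if the kernel log text on
# stdin contains an i915 hang report, otherwise return 1.
log_has_gpu_hang() {
    grep -q 'GPU HANG: ecode'
}

# Example use inside the transcode loop, after each gst-launch-1.0 run:
#   if dmesg | log_has_gpu_hang; then
#       echo "GPU hang detected after iteration $tcount"
#       exit 1
#   fi
```

Because the check reads stdin, it works equally well on a saved dmesg file or on serial console output.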