[Bug 104899] New: GPU HANG after entering a match in Team Fortress 2. fedora 27

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Feb 1 09:43:31 UTC 2018


https://bugs.freedesktop.org/show_bug.cgi?id=104899

            Bug ID: 104899
           Summary: GPU HANG after entering a match in Team Fortress 2.
                    fedora 27
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: Linux (All)
            Status: NEW
          Severity: blocker
          Priority: medium
         Component: DRM/Intel
          Assignee: intel-gfx-bugs at lists.freedesktop.org
          Reporter: avinoash at gmail.com
        QA Contact: intel-gfx-bugs at lists.freedesktop.org
                CC: intel-gfx-bugs at lists.freedesktop.org

Steps to reproduce the issue:
Boot the PC => launch Steam => start Team Fortress 2 => enter a match => the GPU hangs.
These steps trigger the issue every time.
The game is unplayable: it freezes the PC until the game crashes to the desktop
after a minute.


The journal messages that brought me here:
kernel: [drm] GPU HANG: ecode 9:0:0x85dffffb, in hl2_linux [3347], reason: Hang
on rcs0, action: reset
kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack,
including userspace.
kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against
DRI -> DRM/Intel
kernel: [drm] drm/i915 developers can then reassign to the right component if
it's not a kernel issue.
kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please
always attach it.
kernel: [drm] GPU crash dump saved to /sys/class/drm/card1/error
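
A minimal sketch of how lines like these can be picked out of the kernel log. It keeps the hang report itself and the line naming the crash dump location. The function name `filter_gpu_hang` is hypothetical and not part of any tool.

```shell
# Print the i915 hang report and the crash dump location
# from kernel log text supplied on standard input.
filter_gpu_hang() {
    grep -E 'GPU HANG:|GPU crash dump saved to'
}
```

Typical use would be `journalctl -k -b | filter_gpu_hang`, which reads the kernel messages of the current boot.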


i915 platform:
~]$ sudo lshw -c video
  *-display                 
       description: VGA compatible controller
       product: Skylake GT2 [HD Graphics 520]
       vendor: Intel Corporation
       physical id: 2
       bus info: pci at 0000:00:02.0
       version: 07
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
       configuration: driver=i915 latency=0
       resources: irq:130 memory:e1000000-e1ffffff memory:c0000000-cfffffff
ioport:f000(size=64) memory:c0000-dffff
  *-display
       description: Display controller
       product: Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci at 0000:01:00.0
       version: 81
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:129 memory:d0000000-dfffffff memory:e0000000-e01fffff
ioport:e000(size=256) memory:e0200000-e023ffff memory:e0240000-e025ffff

~]$ modinfo i915 #excluding aliases and signature from the output
filename:      
/lib/modules/4.14.14-300.fc27.x86_64/kernel/drivers/gpu/drm/i915/i915.ko.xz
license:        GPL and additional rights
description:    Intel Graphics
author:         Intel Corporation
author:         Tungsten Graphics, Inc.
firmware:       i915/bxt_dmc_ver1_07.bin
firmware:       i915/skl_dmc_ver1_26.bin
firmware:       i915/kbl_dmc_ver1_01.bin
firmware:       i915/kbl_guc_ver9_14.bin
firmware:       i915/bxt_guc_ver8_7.bin
firmware:       i915/skl_guc_ver6_1.bin
firmware:       i915/kbl_huc_ver02_00_1810.bin
firmware:       i915/bxt_huc_ver01_07_1398.bin
firmware:       i915/skl_huc_ver01_07_1398.bin
depends:        drm_kms_helper,drm,video,i2c-algo-bit
intree:         Y
name:           i915
vermagic:       4.14.14-300.fc27.x86_64 SMP mod_unload 
sig_id:         PKCS#7
signer:         
sig_key:        
sig_hashalgo:   md4
parm:           modeset:Use kernel modesetting [KMS] (0=disable, 1=on, -1=force
vga console preference [default]) (int)
parm:           panel_ignore_lid:Override lid status (0=autodetect,
1=autodetect disabled [default], -1=force lid closed, -2=force lid open) (int)
parm:           semaphores:Use semaphores for inter-ring sync (default: -1 (use
per-chip defaults)) (int)
parm:           enable_rc6:Enable power-saving render C-state 6. Different
stages can be selected via bitmask values (0 = disable; 1 = enable rc6; 2 =
enable deep rc6; 4 = enable deepest rc6). For example, 3 would enable rc6 and
deep rc6, and 7 would enable everything. default: -1 (use per-chip default)
(int)
parm:           enable_dc:Enable power-saving display C-states. (-1=auto
[default]; 0=disable; 1=up to DC5; 2=up to DC6) (int)
parm:           enable_fbc:Enable frame buffer compression for power savings
(default: -1 (use per-chip default)) (int)
parm:           lvds_channel_mode:Specify LVDS channel mode (0=probe BIOS
[default], 1=single-channel, 2=dual-channel) (int)
parm:           lvds_use_ssc:Use Spread Spectrum Clock with panels [LVDS/eDP]
(default: auto from VBT) (int)
parm:           vbt_sdvo_panel_type:Override/Ignore selection of SDVO panel
mode in the VBT (-2=ignore, -1=auto [default], index in VBT BIOS table) (int)
parm:           reset:Attempt GPU resets (0=disabled, 1=full gpu reset,
2=engine reset [default]) (int)
parm:           vbt_firmware:Load VBT from specified file under /lib/firmware
(charp)
parm:           error_capture:Record the GPU state following a hang. This
information in /sys/class/drm/card<N>/error is vital for triaging and debugging
hangs. (bool)
parm:           enable_hangcheck:Periodically check GPU activity for detecting
hangs. WARNING: Disabling this can cause system wide hangs. (default: true)
(bool)
parm:           enable_ppgtt:Override PPGTT usage. (-1=auto [default],
0=disabled, 1=aliasing, 2=full, 3=full with extended address space) (int)
parm:           enable_execlists:Override execlists usage. (-1=auto [default],
0=disabled, 1=enabled) (int)
parm:           enable_psr:Enable PSR (0=disabled, 1=enabled - link mode chosen
per-platform, 2=force link-standby mode, 3=force link-off mode) Default: -1
(use per-chip default) (int)
parm:           alpha_support:Enable alpha quality driver support for latest
hardware. See also CONFIG_DRM_I915_ALPHA_SUPPORT. (bool)
parm:           disable_power_well:Disable display power wells when possible
(-1=auto [default], 0=power wells always on, 1=power wells disabled when
possible) (int)
parm:           enable_ips:Enable IPS (default: true) (int)
parm:           fastboot:Try to skip unnecessary mode sets at boot time
(default: false) (bool)
parm:           prefault_disable:Disable page prefaulting for
pread/pwrite/reloc (default:false). For developers only. (bool)
parm:           load_detect_test:Force-enable the VGA load detect code for
testing (default:false). For developers only. (bool)
parm:           force_reset_modeset_test:Force a modeset during gpu reset for
testing (default:false). For developers only. (bool)
parm:           invert_brightness:Invert backlight brightness (-1 force normal,
0 machine defaults, 1 force inversion), please report PCI device ID, subsystem
vendor and subsystem device ID to dri-devel at lists.freedesktop.org, if your
machine needs it. It will then be included in an upcoming module version. (int)
parm:           disable_display:Disable display (default: false) (bool)
parm:           enable_cmd_parser:Enable command parsing (true=enabled
[default], false=disabled) (bool)
parm:           use_mmio_flip:use MMIO flips (-1=never, 0=driver discretion
[default], 1=always) (int)
parm:           mmio_debug:Enable the MMIO debug code for the first N failures
(default: off). This may negatively affect performance. (int)
parm:           verbose_state_checks:Enable verbose logs (ie. WARN_ON()) in
case of unexpected hw state conditions. (bool)
parm:           nuclear_pageflip:Force enable atomic functionality on platforms
that don't have full support yet. (bool)
parm:           edp_vswing:Ignore/Override vswing pre-emph table selection from
VBT (0=use value from vbt [default], 1=low power swing(200mV),2=default
swing(400mV)) (int)
parm:           enable_guc_loading:Enable GuC firmware loading (-1=auto,
0=never [default], 1=if available, 2=required) (int)
parm:           enable_guc_submission:Enable GuC submission (-1=auto, 0=never
[default], 1=if available, 2=required) (int)
parm:           guc_log_level:GuC firmware logging level (-1:disabled
(default), 0-3:enabled) (int)
parm:           guc_firmware_path:GuC firmware path to use instead of the
default one (charp)
parm:           huc_firmware_path:HuC firmware path to use instead of the
default one (charp)
parm:           enable_dp_mst:Enable multi-stream transport (MST) for new
DisplayPort sinks. (default: true) (bool)
parm:           inject_load_failure:Force an error after a number of failure
check points (0:disabled (default), N:force failure at the Nth failure check
point) (uint)
parm:           enable_dpcd_backlight:Enable support for DPCD backlight control
(default:false) (bool)
parm:           enable_gvt:Enable support for Intel GVT-g graphics
virtualization host support(default:false) (bool)


system architecture:
~]$ uname -m
x86_64


kernel version:
~]$ uname -r
4.14.14-300.fc27.x86_64


Linux distribution:
~]$ cat /etc/fedora-release 
Fedora release 27 (Twenty Seven)


Machine or mother board model:
~]$ dmidecode -t 2
# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: Dell Inc.
        Product Name: 0DVKGM
        Version: A00
        Serial Number: /F89MFC2/CN129636620011/
        Asset Tag: Not Specified
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: Not Specified
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0


Display connector:
I have three monitors, each connected to my docking station via a different
connector: VGA, DVI, DP.
The GPU hang also happens when I use only the built-in display, without the
docking station.


A full dmesg with debug information:
attached.
(I don't think it has the debug information you are looking for.
I was not sure how to produce it. What is the "kernel command line"?)
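
On the "kernel command line" question: it is the list of boot parameters the kernel was started with, and a running system exposes it in /proc/cmdline. Below is a minimal sketch for checking whether DRM debug logging was requested at boot. The function name `has_drm_debug` is hypothetical, and the parameter values shown in the comment are an assumption based on commonly given i915 debugging advice, not something stated in this report.

```shell
# Return success if the given kernel command line enables DRM debug output.
# $1: file containing the command line; defaults to /proc/cmdline,
#     where a running kernel exposes its boot parameters.
has_drm_debug() {
    grep -q 'drm\.debug=' "${1:-/proc/cmdline}"
}

# Assumed values, not taken from this report: debug logging is usually
# requested by appending something like
#   drm.debug=0x1e log_buf_len=4M
# to the boot parameters in the bootloader entry and rebooting.
```

After booting with such parameters and reproducing the hang, `has_drm_debug && dmesg > dmesg.txt` would save a log that includes the DRM debug messages.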

GPU crash dump:
attached.
(sorry for not bz2'ing the file...)
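
For later attachments, a minimal sketch of compressing the dump before uploading, assuming bzip2 is installed and the dump is still at the path the kernel printed in the journal. The function name `save_gpu_dump` and the output file name are hypothetical.

```shell
# Stream an i915 error state into a bzip2-compressed file.
# $1: source; defaults to the path the kernel printed in the journal.
# $2: destination file (hypothetical default name).
save_gpu_dump() {
    src="${1:-/sys/class/drm/card1/error}"
    dst="${2:-gpu-error.bz2}"
    bzip2 -c < "$src" > "$dst"
}
```

Reading the file under /sys typically requires root, so the function would need to be run from a root shell.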
