Is my Radeon HD 6970M dying? Hangs & init problems

Rafał Miłecki zajec5 at gmail.com
Mon Feb 9 15:06:44 PST 2015


My Samsung NP700G7A-S01PL notebook had been stable for more than 2 years.
I used kernels 3.11, 3.17, 3.18, 3.19 (since rc1) and many more without
problems. The first hang happened on 2015-02-08 (23:30) with 3.19-rc5,
which I had been using for 3 weeks.

So what I'm seeing are two possibly related problems:

1) Random hangs
I don't have to be doing anything unusual: a single display, no UVD,
just writing some code in kate. Then it randomly happens. My screen
goes all white, or shows green or blue vertical lines. I can't use or
access my machine, and the sound loops (the last second of audio).
Sometimes it happens after hours, sometimes after 30 minutes, sometimes
after a few minutes. So far I've had 5-7 hangs like this.

2) Init problems
Unfortunately rebooting does not always help. Even a cold boot (removing
power & battery, keeping the power button pressed for a few seconds)
doesn't help.
a) First I get UVD init errors:
*ERROR* UVD not responding, trying to reset the VCPU!!!
b) Then the machine hangs after displaying "pitch is 7680"
I've tracked it down to somewhere near register_framebuffer
(see attached bad.txt)
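
For reference, a minimal Python sketch of how the UVD retries in a
dmesg-style log such as the attached bad.txt could be summarized. The
function name uvd_retry_summary and the shortened sample are illustrative
assumptions, not part of any driver tooling:

```python
import re

# Matches dmesg-style lines: "[   36.423302] message text"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def uvd_retry_summary(log_text):
    """Return (reset_attempts, gave_up, seconds_spent) for UVD init errors."""
    stamps = []
    gave_up = False
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip wrapped or non-timestamped lines
        ts, msg = float(m.group(1)), m.group(2)
        if "UVD not responding" not in msg:
            continue
        stamps.append(ts)
        if "giving up" in msg:
            gave_up = True
    # The final "giving up" line is not itself a reset attempt.
    attempts = len(stamps) - (1 if gave_up else 0)
    spent = stamps[-1] - stamps[0] if stamps else 0.0
    return attempts, gave_up, spent


# Shortened sample taken from the attached log (only three of the lines).
sample = """\
[   36.423302] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   37.447064] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   45.657321] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, giving up!!!
"""

print(uvd_retry_summary(sample))
```

Run against the full attached log, this kind of summary makes it easy to
compare boots: how many reset attempts were made, whether the driver gave
up, and how long init was stalled on UVD.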

As long as I don't use radeon (booting with "nomodeset") it is stable.

I tested my RAM with MemTest86 (one pass, took 1 hour): no errors, and
the CPU temperature didn't exceed 70 degrees.

This evening, as a last hope, I installed fglrx. It hangs my machine
as well, with the following messages:
[   36.472526] console [netcon0] enabled
[   36.473106] netconsole: network logging started
[   48.192215] fglrx_pci 0000:01:00.0: irq 56 for MSI/MSI-X
[   48.192726] <6>[fglrx] Firegl kernel thread PID: 1481
[   48.192833] <6>[fglrx] Firegl kernel thread PID: 1482
[   48.192954] <6>[fglrx] Firegl kernel thread PID: 1483
[   48.193077] <6>[fglrx] IRQ 56 Enabled
[   48.240118] <6>[fglrx] Reserved FB block: Shared offset:0, size:1000000
[   48.240122] <6>[fglrx] Reserved FB block: Unshared offset:3fab4000, size:4000
[   48.240124] <6>[fglrx] Reserved FB block: Unshared offset:3fab8000, size:548000
[   48.240126] <6>[fglrx] Reserved FB block: Unshared offset:7fff3000, size:d000
However, if I drop fglrx.ko and just use the Xorg driver fglrx_drv.so,
it is stable.

Any ideas? Is the GPU on my motherboard just dying? :|

-- 
Rafał
-------------- next part --------------
[   35.114273] [drm] radeon kernel modesetting enabled.
[   35.114709] [drm] initializing kernel modesetting (BARTS 0x1002:0x6720 0x144D:0xC0AD).
[   35.114871] [drm] register mmio base: 0xF7E20000
[   35.115012] [drm] register mmio size: 131072
[   35.115195] ATOM BIOS: Samsung
[   35.115465] radeon 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
[   35.115693] radeon 0000:01:00.0: GTT: 1024M 0x0000000080000000 - 0x00000000BFFFFFFF
[   35.115908] [drm] Detected VRAM RAM=2048M, BAR=1024M
[   35.116043] [drm] RAM width 256bits DDR
[   35.116262] [TTM] Zone  kernel: Available graphics memory: 4079754 kiB
[   35.116406] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[   35.116542] [TTM] Initializing pool allocator
[   35.116693] [TTM] Initializing DMA pool allocator
[   35.116845] [drm] radeon: 2048M of VRAM memory ready
[   35.116987] [drm] radeon: 1024M of GTT memory ready.
[   35.117134] [drm] Loading BARTS Microcode
[   35.197992] [drm] Internal thermal controller with fan control
[   35.198758] [drm] radeon: power management initialized
[   35.206487] [drm] GART: num cpu pages 262144, num gpu pages 262144
[   35.207542] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[   35.213991] [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
[   35.214171] radeon 0000:01:00.0: WB enabled
[   35.214254] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88026ad30c00
[   35.214364] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88026ad30c0c
[   35.214892] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90007a32118
[   35.215003] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   35.215083] [drm] Driver supports precise vblank timestamp query.
[   35.215163] radeon 0000:01:00.0: radeon: MSI limited to 32-bit
[   35.215281] radeon 0000:01:00.0: radeon: using MSI.
[   35.215382] [drm] radeon: irq initialized.
[   35.232059] [drm] ring test on 0 succeeded in 1 usecs
[   35.232145] [drm] ring test on 3 succeeded in 2 usecs
[   36.423302] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   37.447064] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   38.143477] nf_conntrack: automatic helper assignment is deprecated and it will be removed soon. Use the iptables CT target to attach helpers instead.
[   38.470826] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   39.494585] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   40.518338] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   41.542094] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   42.565857] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   43.589615] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   44.613374] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   45.637128] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, trying to reset the VCPU!!!
[   45.657321] [drm:uvd_v1_0_start [radeon]] *ERROR* UVD not responding, giving up!!!
[   45.657481] [drm:evergreen_startup [radeon]] *ERROR* radeon: error initializing UVD (-1).
[   45.657886] [drm] ib test on ring 0 succeeded in 0 usecs
[   45.657986] [drm] ib test on ring 3 succeeded in 0 usecs
[   45.660001] [drm] radeon atom DIG backlight initialized
[   45.660086] [drm] Radeon Display Connectors
[   45.660165] [drm] Connector 0:
[   45.660242] [drm]   eDP-1
[   45.660321] [drm]   HPD2
[   45.660399] [drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[   45.660510] [drm]   Encoders:
[   45.660594] [drm]     LCD1: INTERNAL_UNIPHY1
[   45.660674] [drm] Connector 1:
[   45.660760] [drm]   DP-1
[   45.660839] [drm]   HPD3
[   45.660920] [drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[   45.661027] [drm]   Encoders:
[   45.661104] [drm]     DFP1: INTERNAL_UNIPHY2
[   45.661182] [drm] Connector 2:
[   45.661258] [drm]   HDMI-A-1
[   45.661340] [drm]   HPD1
[   45.661416] [drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[   45.661526] [drm]   Encoders:
[   45.661612] [drm]     DFP2: INTERNAL_UNIPHY2
[   45.661696] [drm] Connector 3:
[   45.661778] [drm]   VGA-1
[   45.661859] [drm]   DDC: 0x64d8 0x64d8 0x64dc 0x64dc 0x64e0 0x64e0 0x64e4 0x64e4
[   45.661970] [drm]   Encoders:
[   45.662048] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[   47.260949] [drm] fb mappable at 0x80475000
[   47.261041] [drm] vram apper at 0x80000000
[   47.261130] [drm] size 8294400
[   47.261209] [drm] fb depth is 24
[   47.261294] [drm]    pitch is 7680
[   47.261376] [drm_fb_helper_single_fb_probe:1050] calling register_framebuffer
[   47.261601] [do_register_framebuffer:1678] calling console_lock
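
As a sanity check on the sizes reported in this log, here is a small
Python sketch of the arithmetic. It assumes 4 bytes per pixel (depth 24
padded to 32 bpp) and 4 KiB GART pages; both are assumptions, not values
printed in the log.

```python
# Values copied from the log lines above.
FB_SIZE_BYTES = 8294400      # "[drm] size 8294400"
PITCH_BYTES = 7680           # "[drm]    pitch is 7680"
GART_PAGES = 262144          # "num gpu pages 262144"
VRAM_LAST_ADDR = 0x7FFFFFFF  # end of the reported VRAM range

# Assumptions (not stated in the log).
BYTES_PER_PIXEL = 4          # depth 24 stored as 32 bpp
PAGE_SIZE = 4096             # 4 KiB GART pages

MIB = 1024 * 1024

# Pitch is bytes per scanline; size is pitch times the number of scanlines.
width, width_rem = divmod(PITCH_BYTES, BYTES_PER_PIXEL)
height, height_rem = divmod(FB_SIZE_BYTES, PITCH_BYTES)
print(f"{width}x{height}", width_rem, height_rem)  # prints "1920x1080 0 0"

gart_mib = GART_PAGES * PAGE_SIZE // MIB
vram_mib = (VRAM_LAST_ADDR + 1) // MIB
print(gart_mib, vram_mib)  # prints "1024 2048"
```

Under those assumptions the numbers are self-consistent: a 1920x1080
framebuffer, 1024M of GTT and 2048M of VRAM, matching what the driver
reports. So the values printed just before the hang look sane.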

