[Intel-gfx] GPU hang with high media workload on BSW

Tang, Jun jun.tang at intel.com
Fri Jul 1 02:24:04 UTC 2016


Hi Guys,

Thanks in advance for the help!

I'm hitting a GPU hang on BSW while running multi-channel H264 video decoding with VPP composition and display, plus one channel of H264 encoding.
The render ring gets stuck, as shown below:
[58503.223700] [drm] stuck on render ring
[58503.246340] [drm] GPU HANG: ecode 8:0:0x7f1d7e3d, in Challenge [3259], reason: Ring hung, action: reset

I have attached /sys/class/drm/card0/error. I suspect the hang is caused by incorrect render ring buffer content:
At line 32959 of that file, the ring buffer holds 18800001 (MI_BATCH_BUFFER_START_GEN8), but the next DWORD is 00100002.
Since MI_BATCH_BUFFER_START_GEN8 should be followed by the batch buffer address, I think the ring buffer content is not correct.

00000000 :  18800001
00000004 :  00100002
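
For reference, the header value in this dump, and the 0x7a000004 that appears in the debug print further down, can be reproduced from the command encodings. This is only a minimal sketch, assuming the MI_INSTR, MI_BATCH_BUFFER_START_GEN8 and GFX_OP_PIPE_CONTROL definitions as found in i915_reg.h; cmd_type, mi_opcode and describe_header are hypothetical helper names used for illustration, not driver functions.

```c
#include <stdint.h>

/* Encodings assumed to match the macros in i915_reg.h. */
#define MI_INSTR(opcode, flags)     (((uint32_t)(opcode) << 23) | (flags))
#define MI_BATCH_BUFFER_START_GEN8  MI_INSTR(0x31, 1)
#define GFX_OP_PIPE_CONTROL(len) \
	((0x3u << 29) | (0x3u << 27) | (0x2u << 24) | ((len) - 2))

/* Bits 31:29 of a command header: 0 for MI commands, 3 for render commands. */
static inline uint32_t cmd_type(uint32_t header)
{
	return header >> 29;
}

/* For MI commands, bits 28:23 hold the opcode. */
static inline uint32_t mi_opcode(uint32_t header)
{
	return (header >> 23) & 0x3f;
}

/* Hypothetical helper: name the two header values seen in this report. */
static inline const char *describe_header(uint32_t header)
{
	if (header == MI_BATCH_BUFFER_START_GEN8)
		return "MI_BATCH_BUFFER_START_GEN8";
	if (header == GFX_OP_PIPE_CONTROL(6))
		return "GFX_OP_PIPE_CONTROL(6)";
	return "unknown";
}
```

Under these assumptions, describe_header(0x18800001) names the batch buffer start command and describe_header(0x7a000004) names a 6-DWORD PIPE_CONTROL, which is consistent with the value being emitted from gen8_emit_pipe_control.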

To find out when the ring buffer gets programmed incorrectly, I added code to gen8_emit_pipe_control that reads back the first DWORD of the ring buffer after intel_ring_emit whenever the ring buffer tail is zero.
The result: when tail is 0, the first DWORD read back is sometimes different from the data intel_ring_emit has just written, and a GPU hang may follow right after.

Here is the output of my print:
[ 3409.067402] rcs b:0x18800001 d:0x7a000004 t:0

'b' - ioread32 (ringbuf->virtual_start)
'd' - the value intel_ring_emit wants to write
't' - the value of tail
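
The logic of this check can be sketched in user space as follows. This is a simplified model under stated assumptions, not the actual debug patch: struct mock_ring, mock_ring_emit, mock_emit_checked and readback_mismatch are hypothetical names, the ring is plain memory rather than a write-combining mapping, and the tail wrap is folded into the emit helper for brevity.

```c
#include <stdint.h>
#include <stdio.h>

#define RING_DWORDS 8	/* hypothetical, tiny ring for illustration */

struct mock_ring {
	uint32_t buf[RING_DWORDS];	/* stands in for ringbuf->virtual_start */
	uint32_t tail;			/* byte offset of the next write */
};

/* Model of an emit: store one DWORD at tail, then advance (and wrap) tail. */
static inline void mock_ring_emit(struct mock_ring *ring, uint32_t data)
{
	ring->buf[ring->tail / 4] = data;
	ring->tail = (ring->tail + 4) % (RING_DWORDS * 4);
}

/*
 * Report a mismatch only for writes made at tail == 0, using the same
 * "b/d/t" line format as the print in this report. Returns 1 on mismatch.
 */
static inline int readback_mismatch(uint32_t back, uint32_t data, uint32_t tail)
{
	if (tail != 0 || back == data)
		return 0;
	printf("rcs b:0x%08x d:0x%08x t:%u\n",
	       (unsigned int)back, (unsigned int)data, (unsigned int)tail);
	return 1;
}

/* Emit one DWORD, then read the first DWORD of the ring back and compare. */
static inline int mock_emit_checked(struct mock_ring *ring, uint32_t data)
{
	uint32_t tail_before = ring->tail;
	uint32_t back;

	mock_ring_emit(ring, data);
	back = *(volatile uint32_t *)&ring->buf[0];
	return readback_mismatch(back, data, tail_before);
}
```

With ordinary memory the read-back always matches, so mock_emit_checked never reports anything in this model; calling readback_mismatch(0x18800001, 0x7a000004, 0) directly reproduces the line shown in the output.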

I'm aware that ringbuf->virtual_start is mapped write-combining (WC), so the read may force a write-combine buffer flush and will be slow.
But I don't know why the value read back differs from the value intel_ring_emit has just written.

I also have another question: after the CPU writes to the WC ring buffer, how is the WC buffer flushed before the GPU starts to read the ring buffer?

Thanks a lot!
-James

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 20160622_stuck.txt
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20160701/990f2900/attachment-0001.txt>

