[Mesa-dev] GPU lockup CP stall when calling clBuildProgram on Cayman
Tom Stellard
tom at stellard.net
Mon Jan 13 10:00:52 PST 2014
On Thu, Jan 09, 2014 at 02:57:20PM +0000, christophe choquet wrote:
> Hi,
>
> I am using kernel 3.12.6-gentoo and Mesa 10.0.1. On every second call to clBuildProgram, the GPU is reset after 10 seconds.
> This also happens on Debian unstable with Mesa 9.2. The first hello_world run works, the next one hangs, the third works, and so on.
>
> Apart from the hang on this particular OpenCL call, everything works fine. I tried commenting out the DMA flushing code in r600/r600_hw_context.c, but this issue does not look like the one that was discovered on R600 HW.
>
> After the hang, opencl_examples/hello_world returns the correct value (unless the machine hangs completely, which sometimes happens). The get-global-id test program behaves the same way.
>
This is likely the same issue as https://bugs.freedesktop.org/show_bug.cgi?id=73418
Are you running the OpenCL programs with or without X? Can you reply in the comments of the bug?
Thanks,
Tom
> Here is my config & logs:
> lspci:
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cayman PRO [Radeon HD 6950]
>
> dmesg:
> [ 826.250105] radeon 0000:01:00.0: GPU lockup CP stall for more than 10000msec
> [ 826.250110] radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000037bc last fence id 0x00000000000037ba)
> [ 826.250118] [drm] Disabling audio 0 support
> [ 826.257466] radeon 0000:01:00.0: Saved 111 dwords of commands on ring 0.
> [ 826.257496] radeon 0000:01:00.0: GPU softreset: 0x00000008
> [ 826.257498] radeon 0000:01:00.0: GRBM_STATUS = 0xB0001828
> [ 826.257500] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000003
> [ 826.257502] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000003
> [ 826.257504] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
> [ 826.257526] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
> [ 826.257528] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
> [ 826.257529] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x40000000
> [ 826.257531] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00010006
> [ 826.257533] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80228647
> [ 826.257535] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> [ 826.257537] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
> [ 826.257539] radeon 0000:01:00.0: VM_CONTEXT0_PROTECTION_FAULT_ADDR 0x00000000
> [ 826.257541] radeon 0000:01:00.0: VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x00000000
> [ 826.257542] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
> [ 826.257544] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
> [ 826.264350] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00004001
> [ 826.264403] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
> [ 826.265558] radeon 0000:01:00.0: GRBM_STATUS = 0x00001828
> [ 826.265560] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000003
> [ 826.265561] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000003
> [ 826.265563] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0
> [ 826.265585] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000
> [ 826.265587] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
> [ 826.265589] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
> [ 826.265590] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
> [ 826.265592] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000
> [ 826.265594] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> [ 826.265596] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57
> [ 826.265623] radeon 0000:01:00.0: GPU reset succeeded, trying to resume
> [ 826.283559] [drm] PCIE gen 2 link speeds already enabled
> [ 826.285981] [drm] PCIE GART of 1024M enabled (table at 0x0000000000273000).
> [ 826.286049] radeon 0000:01:00.0: WB enabled
> [ 826.286051] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff8800cbaa3c00
> ......
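The lockup line in the log above carries two fence values: the one the driver was waiting for (0x37bc) and the last fence id it reports (0x37ba). A minimal sketch for pulling both out of such a line, so the gap can be compared across hangs, might look like the following. The helper name parse_lockup_fences is hypothetical and not part of any driver or tool; the format string is taken only from the dmesg line quoted here.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a radeon or Mesa API): extracts the two fence
 * values from a "GPU lockup (waiting for ... last fence id ...)" line.
 * Returns 0 on success, -1 if the line does not match that pattern. */
static int parse_lockup_fences(const char *line, uint64_t *waiting, uint64_t *last)
{
    /* Skip the timestamp and device prefix, which vary between lines. */
    const char *p = strstr(line, "waiting for 0x");
    if (p == NULL)
        return -1;

    /* Both values are printed as zero-padded hexadecimal numbers. */
    if (sscanf(p, "waiting for 0x%" SCNx64 " last fence id 0x%" SCNx64,
               waiting, last) != 2)
        return -1;

    return 0;
}
```

For the line quoted above, the two values differ by 2 (0x37bc - 0x37ba).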
>
>
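The log prints the status registers twice, once before and once after the soft reset, so comparing the two dumps shows which status bits the reset cleared. A small sketch of that comparison follows; changed_bits is a hypothetical helper. It only reports bit positions, and the meaning of each bit still has to be looked up in the driver's register definitions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: prints every bit that differs between two dumps of
 * the same 32-bit register and returns the XOR mask of the two values. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
    uint32_t diff = before ^ after;

    for (int bit = 31; bit >= 0; bit--) {
        if (diff & (UINT32_C(1) << bit)) {
            printf("bit %d: %u -> %u\n", bit,
                   (unsigned)((before >> bit) & 1u),
                   (unsigned)((after >> bit) & 1u));
        }
    }
    return diff;
}
```

With the values from the log, changed_bits(0xB0001828, 0x00001828) for GRBM_STATUS returns 0xB0000000, that is, bits 31, 29 and 28 were set before the reset and clear afterwards. SRBM_STATUS (0x200000C0 in both dumps) yields 0.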
> The hello_world.c program hangs on every second call at this line:
> error = clBuildProgram(program,
>                        1,          /* Number of devices */
>                        &device_id,
>                        NULL,       /* options */
>                        NULL,       /* callback function when compile is complete */
>                        NULL);      /* user data for callback */
>
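When reporting this in the bug, it may help to log what clBuildProgram actually returned on the calls that hang, since the report says the program still produces the correct value afterwards. The sketch below maps the return value to a name without needing the OpenCL headers. build_error_name is a hypothetical helper, the numeric values are the ones defined in the Khronos cl.h header, and the list covers only a handful of codes relevant to a program build.

```c
#include <stdio.h>

/* Hypothetical helper: returns a printable name for a few cl_int values
 * that clBuildProgram can return. Numeric values are those from cl.h. */
static const char *build_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -33: return "CL_INVALID_DEVICE";
    case -42: return "CL_INVALID_BINARY";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    case -59: return "CL_INVALID_OPERATION";
    default:  return "unlisted error code";
    }
}

/* Prints the result of a build call in a form that is easy to paste
 * into a bug report. */
static void report_build_result(int err)
{
    fprintf(stderr, "clBuildProgram returned %d (%s)\n",
            err, build_error_name(err));
}
```

In hello_world.c this would be called as report_build_result(error) right after the clBuildProgram call quoted above.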
>
> Thanks for your help,
> Regards
>