[Nouveau] NV50 compute support questions

Samuel Pitoiset samuel.pitoiset at gmail.com
Fri Dec 4 01:21:54 PST 2015



On 12/04/2015 10:12 AM, Hans de Goede wrote:
> Hi,
>
> On 04-12-15 09:54, Samuel Pitoiset wrote:
>>
>>
>> On 12/04/2015 09:45 AM, Hans de Goede wrote:
>
> <snip>
>
>>>> Please give this branch a shot:
>>>> http://cgit.freedesktop.org/~hakzsam/mesa/log/?h=nvf0_compute
>>>>
>>>> It fixes the initialization of the compute state and allows me to
>>>> launch 'test_input_global' (i.e. ./compute 8) on my GK208 without
>>>> any errors in dmesg. That's a good start but more patches are coming. :-)
>>>
>>> This branch indeed works somewhat better, but things still hang on the
>>> test_system_values compute test for me (this is the first test executed;
>>> I did not try the others). So this seems to need more work.
>>
>> What about test_input_global? test_system_values doesn't work on my
>> side but it doesn't hang the GPU.
>
> Yes that one works.
>
>> Could you please provide the dmesg log?
>
> [    2.786631] nouveau 0000:01:00.0: NVIDIA GK208B (b06070b1)
> [    2.914291] nouveau 0000:01:00.0: bios: version 80.28.79.00.0b
> [    2.937909] nouveau 0000:01:00.0: priv: HUB0: 086014 ffffffff (1f70820c)
> [    2.937953] nouveau 0000:01:00.0: fb: 1024 MiB DDR3
> [    3.623202] [TTM] Zone  kernel: Available graphics memory: 2010556 kiB
> [    3.623205] [TTM] Initializing pool allocator
> [    3.623241] [TTM] Initializing DMA pool allocator
> [    3.623440] nouveau 0000:01:00.0: DRM: VRAM: 1024 MiB
> [    3.623442] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
> [    3.623447] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
> [    3.623449] nouveau 0000:01:00.0: DRM: DCB version 4.0
> [    3.623451] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
> [    3.623454] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
> [    3.623456] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022f10 00000000
> [    3.623458] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
> [    3.623460] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
> [    3.623462] nouveau 0000:01:00.0: DRM: DCB conn 02: 00000200
> [    3.627283] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [    3.627285] [drm] Driver supports precise vblank timestamp query.
> [    3.671871] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
> [    3.889940] nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb:
> 0x60000, bo ffff880119050000
> [    3.890952] fbcon: nouveaufb (fb0) is primary device
> [    4.132343] Console: switching to colour frame buffer device 240x67
> [    4.134930] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
> [    4.141094] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0
> on minor 0
>
> <snip>
>
> [ 1713.421460] nouveau 0000:01:00.0: gr: TRAP ch 6 [003fa32000
> compute[21117]]
> [ 1713.421471] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global
> 00000000 [] warp 3000e [MEM_OUT_OF_BOUNDS]
> [ 1713.441248] nouveau 0000:01:00.0: gr: TRAP ch 6 [003fa32000
> compute[21117]]
> [ 1713.441260] nouveau 0000:01:00.0: gr: GPC0/TPC0/MP trap: global
> 00000004 [MULTIPLE_WARP_ERRORS] warp 20005 [MISALIGNED_PC]
> [ 1713.441265] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global
> 00000004 [MULTIPLE_WARP_ERRORS] warp 20005 [MISALIGNED_PC]
> [ 1717.773839] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1717.773848] nouveau 0000:01:00.0: fifo: sw engine fault on channel 2,
> recovering...
> [ 1719.776529] nouveau 0000:01:00.0: fifo: runlist 0 update timeout
> [ 1722.068923] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1726.363660] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1730.658395] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1734.951720] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1739.241861] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1743.532005] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1747.826728] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1752.121462] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1756.416200] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1760.710930] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1765.005663] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1769.300396] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1773.595135] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1777.889863] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1782.184598] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1786.479328] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1789.730020] nouveau 0000:01:00.0: compute[21117]: failed to idle
> channel 6 [compute[21117]]
> [ 1790.774060] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1791.729963] nouveau 0000:01:00.0: timeout at
> drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c:47/gk104_fifo_gpfifo_kick()!
>
> [ 1791.729966] nouveau 0000:01:00.0: fifo: channel 6 [compute[21117]]
> kick timeout
> [ 1791.729973] nouveau: compute[21117]:00000000:0000a06f: detach gr
> failed, -16
> [ 1791.731401] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0d []
> [ 1793.731275] nouveau 0000:01:00.0: timeout at
> drivers/gpu/drm/nouveau/nvkm/engine/fifo/gpfifogk104.c:47/gk104_fifo_gpfifo_kick()!
>
> [ 1793.731279] nouveau 0000:01:00.0: fifo: channel 6 [compute[21117]]
> kick timeout
> [ 1793.731281] nouveau: compute[21117]:00000000:0000a06f: detach sw
> failed, -16
> [ 1796.026118] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1800.320809] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1804.615446] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1808.731016] nouveau 0000:01:00.0: compute[21117]: failed to idle
> channel 6 [compute[21117]]
> [ 1808.738716] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0d []
> [ 1810.738093] nouveau 0000:01:00.0: fifo: runlist 0 update timeout
> [ 1810.738106] nouveau 0000:01:00.0: fifo: BIND_ERROR 03
> [UNBIND_WHILE_RUNNING]
> [ 1815.032747] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1819.327395] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
> [ 1823.622036] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
>
> <last line keeps repeating at approx. 4 sec intervals>

Thanks, I'll have a look and try to provide a fix for your GK208b.

>
>>> I've ordered a GT 740 (GK107) card, which should arrive soon, and
>>> I'll be using that so I can (hopefully) focus on the llvm tgsi bits
>>> again.
>>
>> Yeah, GK107 will do the job. :-)
>
> Good, as I said it should arrive soon.
>
>>>> Btw, according to the trace you sent me, you have a GK208b (NV106).
>>>
>>> Right, sorry I thought the differences between GK208 and GK208b would
>>> not matter.
>>
>> I don't know exactly what the differences between these two chipsets
>> are, but since test_system_values hangs your GPU and not mine, I think
>> there are some.
>
> Ok.
>
> Regards,
>
> Hans

-- 
-Samuel

