[Nouveau] GT 730 freeze: how to diagnose / debug?

Vincent Vanackere vincent.vanackere at gmail.com
Mon May 8 11:50:58 UTC 2017


On 07/05/2017 23:50, Ilia Mirkin wrote:
> You have two issues:
> 
> (a) nouveau's GL driver messed something up, causing a read fault error
> (b) nouveau's kernel driver tried to recover. It failed.
> 
> Solution to #1: None, really. You can try updating mesa, and hope it
> helps. Not sure what version you're on.

Here are my package versions:

ii  libegl1-mesa:amd64              17.0.3-1ubuntu1                amd64        free implementation of the EGL API -- runtime
ii  libegl1-mesa-dev:amd64          17.0.3-1ubuntu1                amd64        free implementation of the EGL API -- development files
ii  libgl1-mesa-dev:amd64           17.0.3-1ubuntu1                amd64        free implementation of the OpenGL API -- GLX development files
ii  libgl1-mesa-dri:amd64           17.0.3-1ubuntu1                amd64        free implementation of the OpenGL API -- DRI modules
ii  libgl1-mesa-glx:amd64           17.0.3-1ubuntu1                amd64        free implementation of the OpenGL API -- GLX runtime
ii  libglapi-mesa:amd64             17.0.3-1ubuntu1                amd64        free implementation of the GL API -- shared library
ii  libgles2-mesa:amd64             17.0.3-1ubuntu1                amd64        free implementation of the OpenGL|ES 2.x API -- runtime
ii  libglu1-mesa:amd64              9.0.0-2.1build1                amd64        Mesa OpenGL utility library (GLU)
ii  libglu1-mesa-dev:amd64          9.0.0-2.1build1                amd64        Mesa OpenGL utility library -- development files
ii  libwayland-egl1-mesa:amd64      17.0.3-1ubuntu1                amd64        implementation of the Wayland EGL platform -- runtime
ii  mesa-common-dev:amd64           17.0.3-1ubuntu1                amd64        Developer documentation for Mesa
ii  mesa-utils                      8.3.0-4                        amd64        Miscellaneous Mesa GL utilities
ii  mesa-vdpau-drivers:amd64        17.0.3-1ubuntu1                amd64        Mesa VDPAU video acceleration drivers


I'll try compiling a newer version from git to see if it helps...

> Solution to #2: Ben Skeggs will hopefully have something clever to
> say. The recovery logic was recently beefed up considerably, so the
> fact that you even got that far is already a good start.
> 
> If you're looking for a stable experience with Xorg, I recommend using
> xf86-video-nouveau -- it's been extensively battle-tested, and is
> quite simple logic; I also recommend against anything that uses GL on
> an ongoing basis (which, sadly, everyone thinks is the coolest thing
> to do these days). If you're looking for a stable experience with a
> GL-based Wayland compositor, you'll have to wait until either the
> nouveau GL driver is perfect or nouveau kernel module can properly
> recover from any screwups the GL driver makes.

I'm not expecting the GL driver to be perfect ;-)
However, it would be nice if the kernel module could recover a bit better from bad commands sent by the GL driver (I have also had some hard lockups where I could not even connect over ssh).

> You can also remove nouveau_dri.so entirely, which is a big hammer
> against these types of issues (removes all GL-based acceleration), or
> you can run certain key pieces of software with
> LIBGL_ALWAYS_SOFTWARE=1, which will force a CPU-based GL
> implementation.

Thanks for the hint, I'll try this workaround too!
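
For reference, a minimal sketch of the second workaround: launching a single program with LIBGL_ALWAYS_SOFTWARE=1 set only for that process. The helper names software_gl_env and run_with_software_gl are hypothetical, chosen for illustration; the only thing taken from the discussion above is the environment variable itself.

```python
import os
import subprocess


def software_gl_env(base_env=None):
    """Return a copy of the environment with software GL rendering forced.

    The caller's mapping is never modified; by default os.environ is copied.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def run_with_software_gl(command):
    """Run `command` (a list of arguments) with software GL; return its exit code."""
    return subprocess.run(command, env=software_gl_env()).returncode
```

For example, run_with_software_gl(["glxinfo"]) (glxinfo ships in the mesa-utils package listed above) should then typically report a software renderer rather than the nouveau one. The variable only affects the launched process and its children, so the rest of the session keeps hardware acceleration.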

Please let me know if I can do anything to help improve the driver's stability (such as dumping the card's registers or enabling some traces).
Alternatively, if you know of a fanless graphics card model with proper Linux support that can drive two monitors at 2560x1440, I'm interested ;-)
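
As a starting point for collecting traces, here is a rough sketch that pulls the "fifo: ... fault" lines out of kernel log text and splits them into fields. The regular expression is modelled only on the single fault line in the trace quoted below, so the exact format on other kernel versions is an assumption; parse_fifo_faults and the field names are hypothetical.

```python
import re

# Modelled on the one "fifo: read fault" line in the quoted trace; other
# kernel versions may format this message differently (assumption).
FAULT_RE = re.compile(
    r"fifo: (?P<access>\w+) fault at (?P<address>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+)"
)


def parse_fifo_faults(log_text):
    """Return one dict of named fields per fifo fault line in `log_text`."""
    faults = []
    for line in log_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            faults.append(match.groupdict())
    return faults
```

The function expects unwrapped log lines, as printed by dmesg itself; the copy quoted in this mail is wrapped by the mail client and would not match as-is. Counting the results per engine, client and reason over several freezes would show whether the fault is always the same one.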

Regards

> Cheers,
> 
>   -ilia
> 
> 
> 2017-05-07 16:03 GMT-04:00 Vincent Vanackere <vincent.vanackere at gmail.com>:
>> Hi,
>>
>>  I own an Asus GT730-SL-2GD3-BRK, trying to drive two monitors at 2560x1440
>> resolution. Using gnome-shell with either Xorg or wayland I get screen
>> freezes very frequently. Those freezes usually require a reboot to get
>> working graphics (below a sample trace that I got yesterday).
>>  I am running Ubuntu 17.04 with the latest kernels available, I also tested
>> various more recent kernels including the latest drm tree at
>> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next but the problem
>> always occurs.
>>  When a freeze occurs, the computer is still reachable through ssh but the
>> only action I found so far to get graphics back is to restart the computer.
>>   I am willing to run diagnostic programs or test any patch if it would
>> help. I'm also not excluding the possibility that I may have some faulty
>> hardware so any hardware-health-test advice would be welcome...
>>
>> Regards,
>>
>> Vincent Vanackère
>>
>> [    1.199135] nouveau 0000:01:00.0: NVIDIA GK208B (b06070b1)
>> [    1.319930] nouveau 0000:01:00.0: bios: version 80.28.92.00.10
>> [    1.322095] nouveau 0000:01:00.0: fb: 2048 MiB DDR3
>> [    2.620362] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
>> [    2.620362] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
>> [    2.620364] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
>> [    2.620378] nouveau 0000:01:00.0: DRM: DCB version 4.0
>> [    2.620379] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
>> [    2.620380] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
>> [    2.620380] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022f10 00000000
>> [    2.620381] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
>> [    2.620381] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
>> [    2.620382] nouveau 0000:01:00.0: DRM: DCB conn 02: 00000200
>> [    2.666199] nouveau 0000:01:00.0: hwmon_device_register() is deprecated.
>> Please convert the driver to use hwmon_device_register_with_info().
>> [    2.717519] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
>> [    2.992994] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x60000,
>> bo ffff8cd1499f8000
>> [    3.025200] fbcon: nouveaufb (fb0) is primary device
>> [    3.253561] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
>> [    3.268163] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on
>> minor 0
>> [ 2150.225651] nouveau 0000:01:00.0: fifo: read fault at 0006710000 engine
>> 00 [GR] client 02 [GPC0/PE_0] reason 02 [PTE] on channel 31 [007e8cb000
>> Xwayland[3019]]
>> [ 2150.225662] nouveau 0000:01:00.0: fifo: channel 31: killed
>> [ 2150.225663] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
>> [ 2150.225666] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
>> [ 2150.225669] nouveau 0000:01:00.0: Xwayland[3019]: channel 31 killed!
>> [ 2296.863975] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2296.863990]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2296.864032]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2296.864047]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2296.864118] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2296.864138]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2296.864153]  ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2296.864175]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2296.864189]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2417.699641] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2417.699656]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2417.699688]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2417.699705]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2417.699785] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2417.699808]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2417.699825]  ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2417.699851]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2417.699867]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2538.535424] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2538.535439]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
>> [ 2538.535469]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2538.535485]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>> [ 2538.535555] Workqueue: events_unbound nv50_disp_atomic_commit_work
>> [nouveau]
>> [ 2538.535576]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
>> [ 2538.535591]  ? nv84_fence_read+0x2e/0x30 [nouveau]
>> [ 2538.535614]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
>> [ 2538.535628]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
>>
>>
