[Nouveau] GT 730 freeze: how to diagnose / debug?

Vincent Vanackere vincent.vanackere at gmail.com
Tue May 9 07:50:51 UTC 2017


Some additional data:
- putting LIBGL_ALWAYS_SOFTWARE=1 in /etc/environment does indeed make the
system work (for my current usage, the slowness is an acceptable price for
stability)
- I still get lock-ups when using mesa from git (17.2~git1705081930.25d2
from this repository:
https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers)
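
For testing one program at a time instead of the whole session, the same
variable can be set per process. This is only a minimal sketch: the helper
names software_gl_env and run_with_software_gl are hypothetical, and the
default command assumes glxinfo (from mesa-utils) is installed.

```python
import os
import subprocess
import sys


def software_gl_env(base=None):
    """Return a copy of the environment with software rendering forced."""
    env = dict(os.environ if base is None else base)
    # Same variable as in /etc/environment, but scoped to one child process.
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def run_with_software_gl(argv):
    """Run one command with LIBGL_ALWAYS_SOFTWARE=1 and return its exit code."""
    return subprocess.run(argv, env=software_gl_env()).returncode


if __name__ == "__main__":
    # With no arguments, fall back to glxinfo so the active renderer can be
    # checked by eye (assumption: glxinfo is on the PATH).
    sys.exit(run_with_software_gl(sys.argv[1:] or ["glxinfo", "-B"]))
```

If the variable takes effect, the renderer string reported by glxinfo should
name a software rasterizer rather than the nouveau driver.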

I have another question (perhaps Ben Skeggs could also offer some advice?):
I see there are many more Mesa variables that can be set (
https://www.mesa3d.org/envvars.html). Are there other variables that I
could set in order to either partially enable hardware acceleration or
(better) get a diagnostic of what the driver is doing that causes
the graphics card to hang?
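
To make any such diagnostics easier to compare between runs, the fifo fault
lines from the kernel log (like the one in the trace quoted further down)
could be extracted with a small script. This is a rough sketch: the pattern
is modelled only on the single fault line in this thread, so other kernel
versions may format the message differently, and the name parse_fault is
hypothetical.

```python
import re
import sys

# Assumption: the message layout matches the one fault line quoted in this
# thread ("fifo: read fault at ... engine .. [..] client .. [..] reason ..").
FAULT_RE = re.compile(
    r"fifo: (?P<kind>read|write) fault at (?P<addr>[0-9a-f]+)\s+"
    r"engine [0-9a-f]+ \[(?P<engine>[^\]]*)\]\s+"
    r"client [0-9a-f]+ \[(?P<client>[^\]]*)\]\s+"
    r"reason [0-9a-f]+ \[(?P<reason>[^\]]*)\]\s+"
    r"on channel (?P<channel>\d+)"
)


def parse_fault(line):
    """Return the fault fields as a dict, or None if the line is not a fault."""
    match = FAULT_RE.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Usage (unwrapped kernel log on stdin): dmesg | python3 this_script.py
    for log_line in sys.stdin:
        fault = parse_fault(log_line)
        if fault:
            print(
                "{kind} fault at 0x{addr}: engine={engine} client={client} "
                "reason={reason} channel={channel}".format(**fault)
            )
```

The script expects each log message on a single line, as dmesg prints it;
the copy of the trace in this mail has been re-wrapped and would need to be
joined first.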

Thanks for your help !

Vincent

2017-05-08 13:50 GMT+02:00 Vincent Vanackere <vincent.vanackere at gmail.com>:

> On 07/05/2017 23:50, Ilia Mirkin wrote:
> > You have two issues:
> >
> > (a) nouveau's GL driver messed something up, causing a read fault error
> > (b) nouveau's kernel driver tried to recover. It failed.
> >
> > Solution to (a): None, really. You can try updating mesa, and hope it
> > helps. Not sure what version you're on.
>
> Here are my package versions:
>
> ii  libegl1-mesa:amd64          17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- runtime
> ii  libegl1-mesa-dev:amd64      17.0.3-1ubuntu1  amd64  free implementation of the EGL API -- development files
> ii  libgl1-mesa-dev:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX development files
> ii  libgl1-mesa-dri:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- DRI modules
> ii  libgl1-mesa-glx:amd64       17.0.3-1ubuntu1  amd64  free implementation of the OpenGL API -- GLX runtime
> ii  libglapi-mesa:amd64         17.0.3-1ubuntu1  amd64  free implementation of the GL API -- shared library
> ii  libgles2-mesa:amd64         17.0.3-1ubuntu1  amd64  free implementation of the OpenGL|ES 2.x API -- runtime
> ii  libglu1-mesa:amd64          9.0.0-2.1build1  amd64  Mesa OpenGL utility library (GLU)
> ii  libglu1-mesa-dev:amd64      9.0.0-2.1build1  amd64  Mesa OpenGL utility library -- development files
> ii  libwayland-egl1-mesa:amd64  17.0.3-1ubuntu1  amd64  implementation of the Wayland EGL platform -- runtime
> ii  mesa-common-dev:amd64       17.0.3-1ubuntu1  amd64  Developer documentation for Mesa
> ii  mesa-utils                  8.3.0-4          amd64  Miscellaneous Mesa GL utilities
> ii  mesa-vdpau-drivers:amd64    17.0.3-1ubuntu1  amd64  Mesa VDPAU video acceleration drivers
>
>
> I'll try compiling a newer version from git to see if it helps...
>
> > Solution to (b): Ben Skeggs will hopefully have something clever to
> > say. The recovery logic was recently beefed up considerably, so the
> > fact that you even got that far is already a good start.
> >
> > If you're looking for a stable experience with Xorg, I recommend using
> > xf86-video-nouveau -- it's been extensively battle-tested, and is
> > quite simple logic; I also recommend against anything that uses GL on
> > an ongoing basis (which, sadly, everyone thinks is the coolest thing
> > to do these days). If you're looking for a stable experience with a
> > GL-based Wayland compositor, you'll have to wait until either the
> > nouveau GL driver is perfect or nouveau kernel module can properly
> > recover from any screwups the GL driver makes.
>
> I'm not expecting the GL driver to be perfect ;-)
> However it would be nice if the kernel module could recover at least a bit
> better from bad commands from the GL driver (indeed I've had some hard
> lockups too where I could not even connect from ssh).
>
> > You can also remove nouveau_dri.so entirely, which is a big hammer
> > against these types of issues (removes all GL-based acceleration), or
> > you can run certain key pieces of software with
> > LIBGL_ALWAYS_SOFTWARE=1, which will force a CPU-based GL
> > implementation.
>
> Thanks for the hint, I'll try this workaround too !
>
> Please let me know if I can do anything to improve the driver's
> stability (like dumping the card's registers or enabling some traces?).
> Alternatively, if you know of a fanless graphics card model that would be
> able to drive 2 monitors at 2560x1440 with proper Linux support, I'm
> interested ;-)
>
> Regards
>
> > Cheers,
> >
> >   -ilia
> >
> >
> > 2017-05-07 16:03 GMT-04:00 Vincent Vanackere <vincent.vanackere at gmail.com>:
> >> Hi,
> >>
> >> I own an Asus GT730-SL-2GD3-BRK, trying to drive two monitors at
> >> 2560x1440 resolution. Using gnome-shell with either Xorg or Wayland I
> >> get screen freezes very frequently. Those freezes usually require a
> >> reboot to get working graphics (a sample trace that I got yesterday is
> >> below).
> >> I am running Ubuntu 17.04 with the latest kernels available. I also
> >> tested various more recent kernels, including the latest drm tree at
> >> https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next but the
> >> problem always occurs.
> >> When a freeze occurs, the computer is still reachable through ssh, but
> >> the only action I have found so far to get graphics back is to restart
> >> the computer.
> >> I am willing to run diagnostic programs or test any patch if it would
> >> help. I'm also not excluding the possibility that I may have some faulty
> >> hardware, so any hardware health test advice would be welcome...
> >>
> >> Regards,
> >>
> >> Vincent Vanackère
> >>
> >> [    1.199135] nouveau 0000:01:00.0: NVIDIA GK208B (b06070b1)
> >> [    1.319930] nouveau 0000:01:00.0: bios: version 80.28.92.00.10
> >> [    1.322095] nouveau 0000:01:00.0: fb: 2048 MiB DDR3
> >> [    2.620362] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
> >> [    2.620362] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
> >> [    2.620364] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
> >> [    2.620378] nouveau 0000:01:00.0: DRM: DCB version 4.0
> >> [    2.620379] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
> >> [    2.620380] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011f62 00020010
> >> [    2.620380] nouveau 0000:01:00.0: DRM: DCB outp 02: 02022f10 00000000
> >> [    2.620381] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
> >> [    2.620381] nouveau 0000:01:00.0: DRM: DCB conn 01: 00002161
> >> [    2.620382] nouveau 0000:01:00.0: DRM: DCB conn 02: 00000200
> >> [    2.666199] nouveau 0000:01:00.0: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
> >> [    2.717519] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
> >> [    2.992994] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0x60000, bo ffff8cd1499f8000
> >> [    3.025200] fbcon: nouveaufb (fb0) is primary device
> >> [    3.253561] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
> >> [    3.268163] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
> >> [ 2150.225651] nouveau 0000:01:00.0: fifo: read fault at 0006710000 engine 00 [GR] client 02 [GPC0/PE_0] reason 02 [PTE] on channel 31 [007e8cb000 Xwayland[3019]]
> >> [ 2150.225662] nouveau 0000:01:00.0: fifo: channel 31: killed
> >> [ 2150.225663] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
> >> [ 2150.225666] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
> >> [ 2150.225669] nouveau 0000:01:00.0: Xwayland[3019]: channel 31 killed!
> >> [ 2296.863975] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2296.863990]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
> >> [ 2296.864032]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2296.864047]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
> >> [ 2296.864118] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2296.864138]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
> >> [ 2296.864153]  ? nv84_fence_read+0x2e/0x30 [nouveau]
> >> [ 2296.864175]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2296.864189]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
> >> [ 2417.699641] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2417.699656]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
> >> [ 2417.699688]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2417.699705]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
> >> [ 2417.699785] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2417.699808]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
> >> [ 2417.699825]  ? nv84_fence_read+0x2e/0x30 [nouveau]
> >> [ 2417.699851]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2417.699867]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
> >> [ 2538.535424] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2538.535439]  ? nvkm_ioctl_ntfy_get+0x69/0xb0 [nouveau]
> >> [ 2538.535469]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2538.535485]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
> >> [ 2538.535555] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
> >> [ 2538.535576]  ? nouveau_bo_rd32+0x2a/0x30 [nouveau]
> >> [ 2538.535591]  ? nv84_fence_read+0x2e/0x30 [nouveau]
> >> [ 2538.535614]  nv50_disp_atomic_commit_tail+0x55/0x3a00 [nouveau]
> >> [ 2538.535628]  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]