[Nouveau] 4.16-rc1: UBSAN warning in nouveau/nvkm/subdev/therm/base.c + oops in nvkm_therm_clkgate_fini

Lyude Paul lyude at redhat.com
Wed Feb 14 19:11:04 UTC 2018


Actually, this was already brought up to me. There's a fix from NVIDIA on the
mailing list, which I reviewed a little while ago, that we should pull in:

https://patchwork.freedesktop.org/patch/203205/

Would you guys mind confirming that this patch fixes your issues?

On Wed, 2018-02-14 at 18:41 +0100, Pierre Moreau wrote:
> On 2018-02-14 — 09:36, Ilia Mirkin wrote:
> > On Wed, Feb 14, 2018 at 9:35 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> > > On Wed, Feb 14, 2018 at 9:29 AM, Meelis Roos <mroos at linux.ee> wrote:
> > > > > This is 4.16-rc1 + today's git on a lowly P4 with NV5, worked fine in
> > > > > 4.15:
> > > > 
> > > > NV5 in another PC (secondary card in x86-64) made the system crash on
> > > > boot, in nvkm_therm_clkgate_fini.
> > > 
> > > Mind booting with nouveau.debug=trace? That should hopefully tell us
> > > more exactly which thing is dying. If you have a cross-compile/distcc
> > > setup handy, a bisect may be even more useful.
> > 
> > Erm, sorry, nevermind. You even said it -- nvkm_therm_clkgate_fini is
> > somehow mis-hooked up for NV5 now. A bisect result would still make
> > the culprit a lot more obvious.
> 
> CC’ing Lyude Paul as she hooked up the clockgating support.
> 
> Looking at the code, only NV40+ have a therm engine. Therefore, shouldn’t
> nvkm_therm_clkgate_enable(), nvkm_therm_clkgate_fini() and
> nvkm_therm_clkgate_oneinit() all check that therm is not NULL, on top of
> their check for the clkgate_* hooks being there? Or instead, maybe have the
> check in nvkm_device_init()?
> 
> Pierre
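
For reference, a minimal sketch of the kind of check Pierre describes above.
The struct layouts and the names therm_clkgate_fini(), example_clkgate_fini()
and example_func are hypothetical stand-ins for illustration only. This is
not the actual nouveau code, nor the patch linked above. The idea is just
that the wrapper returns early when there is no therm at all (as on pre-NV40
hardware, per Pierre's reading) or when the chipset provides no clkgate hook:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (assumed layout). */
struct therm;

struct therm_func {
	/* Optional hook; NULL on chipsets without clockgating support. */
	void (*clkgate_fini)(struct therm *therm, bool suspend);
};

struct therm {
	const struct therm_func *func;
	int fini_calls; /* bookkeeping for this example only */
};

/*
 * Guarded wrapper: bail out when the device has no therm at all, or when
 * the hook is missing, before dereferencing anything.
 */
static void therm_clkgate_fini(struct therm *therm, bool suspend)
{
	if (!therm || !therm->func || !therm->func->clkgate_fini)
		return;

	therm->func->clkgate_fini(therm, suspend);
}

/* Example hook that only records that it was called. */
static void example_clkgate_fini(struct therm *therm, bool suspend)
{
	(void)suspend;
	therm->fini_calls++;
}

static const struct therm_func example_func = {
	.clkgate_fini = example_clkgate_fini,
};
```

With this shape, calling therm_clkgate_fini(NULL, false) is a no-op rather
than a NULL dereference, while a therm whose func points at example_func
still gets its hook called.
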
-- 
Cheers,
	Lyude Paul

