[Nouveau] [bug report] drm/nouveau/gsp/r535: add support for booting GSP-RM

Dan Carpenter dan.carpenter at linaro.org
Wed Nov 8 08:56:26 UTC 2023


On Tue, Nov 07, 2023 at 08:00:23PM +0100, Danilo Krummrich wrote:
> On 11/7/23 15:34, Dan Carpenter wrote:
> > Hello Ben Skeggs,
> > 
> > The patch 176fdcbddfd2: "drm/nouveau/gsp/r535: add support for
> > booting GSP-RM" from Sep 19, 2023 (linux-next), leads to the
> > following Smatch static checker warning:
> > 
> > 	drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:1016 r535_gsp_rpc_unloading_guest_driver()
> > 	warn: 'rpc' isn't an ERR_PTR
> > 
> > drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c
> >      1010 static int
> >      1011 r535_gsp_rpc_unloading_guest_driver(struct nvkm_gsp *gsp, bool suspend)
> >      1012 {
> >      1013         rpc_unloading_guest_driver_v1F_07 *rpc;
> >      1014
> >      1015         rpc = nvkm_gsp_rpc_get(gsp, NV_VGPU_MSG_FUNCTION_UNLOADING_GUEST_DRIVER, sizeof(*rpc));
> > 
> > nvkm_gsp_rpc_get() returns NULL on error.
> 
> There are also code paths where it can return an ERR_PTR. I think we
> need to check for IS_ERR_OR_NULL()...

When a function returns a mix of error pointers and NULL then generally
NULL means the feature is deliberately disabled.

int blink_leds(void)
{
	leds = get_leds();
	if (IS_ERR(leds)) {
		printk("warning!  LEDS are malfunctioning!\n");
		return PTR_ERR(leds);
	}
	if (!leds)
		return 0;  // <-- do nothing.  success!
	return leds->blink();
}

Normally, it's obvious from the context what the NULL means, but if not
it should be commented, "search_foo() returns error pointers if we had
an allocation error doing the search, NULL means the object wasn't
found, otherwise return a pointer to foo".

I have written a blog on this:
https://staticthinking.wordpress.com/2022/08/01/mixing-error-pointers-and-null/

There are a few places in the kernel where it's like "NULL is a bug, but
we just work around it.  Like we have 50 drivers, and you know some of
those guys are going to mess up, so let's just check for both."  We
have a similar check for EPROBE_DEFER where both negative and positive
values are treated the same even though negative error codes are
correct.  Another place where bugs are probably going to happen is when
we change the function from returning NULL to returning error pointers.
"People are going to do back ports and they're going to mess up so we
know we're going to introduce bugs.  Let's just work around it."

But generally it's better to fix the bugs instead of working around it.

Here it looks like we are just doing a random mix of error pointers and
NULL.

So let's take a look at this bug:
    drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:926
    r535_gsp_rpc_get_gsp_static_info() warn: 'rpc' can also be NULL

nvkm_gsp_rpc_rd() returns a mix.
nvkm_gsp_rpc_push() (r535_gsp_rpc_push()) returns a mix.
r535_gsp_rpc_send() returns a mix.
r535_gsp_msg_recv() returns a mix.

What does the NULL mean?  It needs to be commented.  If it's just bug
work-arounds then they should be done properly.

	if (IS_ERR_OR_NULL(p))
		return p ? ERR_CAST(p): -EIO;

regards,
dan carpenter



More information about the Nouveau mailing list