next/master boot bisection: Oops in nouveau driver on jetson-tk1

Lyude Paul lyude at redhat.com
Sat Dec 8 00:08:40 UTC 2018


uhhhhhhhhhhhhh
didn't we fix this weeks ago? with "drm/nouveau: tegra: Call
nouveau_drm_device_init()"


On Fri, 2018-12-07 at 23:31 +0000, Guillaume Tucker wrote:
> Please find below an automated bisection report for a kernel Oops
> seen during the initialisation of the nouveau GPU driver on
> jetson-tk1.
> 
> 
> All the LAVA test jobs for this bisection can be found here:
> 
>   
> http://lava.baylibre.com:10080/scheduler/alljobs?length=25&search=lava-bisect-staging-7366#table
> 
> 
> Here's the beginning of the Oops stack trace:
> 
> [    7.485361] [00000064] *pgd=f9e7b835
> [    7.485372] Internal error: Oops: 17 [#1] SMP ARM
> [    7.485376] Modules linked in: snd_soc_tegra_rt5640(+)
> snd_soc_tegra_utils snd_soc_rt5640(+) snd_soc_rl6231 snd_soc_tegra30_ahub
> snd_hda_tegra snd_soc_core snd_hda_codec snd_hda_core ac97_bus
> snd_pcm_dmaengine snd_pcm xhci_tegra(+) snd_timer snd soundcore nouveau(+)
> ttm tegra_devfreq tegra_wdt
> [    7.542227] CPU: 1 PID: 128 Comm: udevd Not tainted 4.20.0-rc5-next-
> 20181206 #44
> [    7.549603] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
> [    7.555859] PC is at drm_plane_register_all+0x18/0x50
> [    7.560899] LR is at drm_modeset_register_all+0xc/0x6c
> 
> 
> Full log:
> 
>   http://lava.baylibre.com:10080/scheduler/job/68628#L816
> 
> 
> The bisection was run from next-20181206 as this is where the
> issue was discovered on kernelci.org but the patch it found has
> already been merged in mainline.
> 
> Hope this helps!
> 
> Guillaume
> 
> 
> -----------8<------------------------8<-----------
> 
> 
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> Bisection result for next/master (next-20181206) on jetson-tk1
> 
>   Good:       84df9525b0c2 Linux 4.19
>   Bad:        4c92b7b3080d Add linux-next specific files for 20181206
>   Found:      cfea88a4d866 drm/nouveau: Start using new drm_dev
> initialization helpers
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       next
>   URL:        None
>   Branch:     master
>   Target:     jetson-tk1
>   Lab:        lab-baylibre
>   Config:     multi_v7_defconfig
>   Plan:       dmesg-nouveau
> 
> Breaking commit found:
> 
> ----------------------------------------------------------------------------
> ---
> commit cfea88a4d86632f28cf80be97079f131645b7869
> Author: Lyude Paul <lyude at redhat.com>
> Date:   Wed Aug 22 21:40:07 2018 -0400
> 
>     drm/nouveau: Start using new drm_dev initialization helpers
>     
>     Per the documentation in drm_get_pci_dev(), this function is deprecated
>     and shouldn't be used anymore. As it turns out, we're going to need to
>     stop using drm_get_pci_dev() anyway in order to allow us to turn off the
>     card before full system shutdowns, otherwise we'll hit race conditions
>     with userspace while trying to tear down the card on shutdown.
>     
>     So, start using drm_dev_get() and drm_dev_put(), and just turn our
>     load/unload callbacks into open coded init/fini() functions.
>     
>     Signed-off-by: Lyude Paul <lyude at redhat.com>
>     Cc: Karol Herbst <kherbst at redhat.com>
>     Signed-off-by: Ben Skeggs <bskeggs at redhat.com>
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 905956809d21..2b2baf6e0e0d 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -458,75 +458,8 @@ nouveau_accel_init(struct nouveau_drm *drm)
>  	nouveau_bo_move_init(drm);
>  }
>  
> -static int nouveau_drm_probe(struct pci_dev *pdev,
> -			     const struct pci_device_id *pent)
> -{
> -	struct nvkm_device *device;
> -	struct apertures_struct *aper;
> -	bool boot = false;
> -	int ret;
> -
> -	if (vga_switcheroo_client_probe_defer(pdev))
> -		return -EPROBE_DEFER;
> -
> -	/* We need to check that the chipset is supported before booting
> -	 * fbdev off the hardware, as there's no way to put it back.
> -	 */
> -	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0,
> &device);
> -	if (ret)
> -		return ret;
> -
> -	nvkm_device_del(&device);
> -
> -	/* Remove conflicting drivers (vesafb, efifb etc). */
> -	aper = alloc_apertures(3);
> -	if (!aper)
> -		return -ENOMEM;
> -
> -	aper->ranges[0].base = pci_resource_start(pdev, 1);
> -	aper->ranges[0].size = pci_resource_len(pdev, 1);
> -	aper->count = 1;
> -
> -	if (pci_resource_len(pdev, 2)) {
> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
> -		aper->count++;
> -	}
> -
> -	if (pci_resource_len(pdev, 3)) {
> -		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
> -		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
> -		aper->count++;
> -	}
> -
> -#ifdef CONFIG_X86
> -	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
> -#endif
> -	if (nouveau_modeset != 2)
> -		drm_fb_helper_remove_conflicting_framebuffers(aper,
> "nouveaufb", boot);
> -	kfree(aper);
> -
> -	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
> -				  true, true, ~0ULL, &device);
> -	if (ret)
> -		return ret;
> -
> -	pci_set_master(pdev);
> -
> -	if (nouveau_atomic)
> -		driver_pci.driver_features |= DRIVER_ATOMIC;
> -
> -	ret = drm_get_pci_dev(pdev, pent, &driver_pci);
> -	if (ret) {
> -		nvkm_device_del(&device);
> -		return ret;
> -	}
> -
> -	return 0;
> -}
> -
>  static int
> -nouveau_drm_load(struct drm_device *dev, unsigned long flags)
> +nouveau_drm_device_init(struct drm_device *dev)
>  {
>  	struct nouveau_drm *drm;
>  	int ret;
> @@ -613,7 +546,7 @@ nouveau_drm_load(struct drm_device *dev, unsigned long
> flags)
>  }
>  
>  static void
> -nouveau_drm_unload(struct drm_device *dev)
> +nouveau_drm_device_fini(struct drm_device *dev)
>  {
>  	struct nouveau_drm *drm = nouveau_drm(dev);
>  
> @@ -642,18 +575,116 @@ nouveau_drm_unload(struct drm_device *dev)
>  	kfree(drm);
>  }
>  
> +static int nouveau_drm_probe(struct pci_dev *pdev,
> +			     const struct pci_device_id *pent)
> +{
> +	struct nvkm_device *device;
> +	struct drm_device *drm_dev;
> +	struct apertures_struct *aper;
> +	bool boot = false;
> +	int ret;
> +
> +	if (vga_switcheroo_client_probe_defer(pdev))
> +		return -EPROBE_DEFER;
> +
> +	/* We need to check that the chipset is supported before booting
> +	 * fbdev off the hardware, as there's no way to put it back.
> +	 */
> +	ret = nvkm_device_pci_new(pdev, NULL, "error", true, false, 0,
> &device);
> +	if (ret)
> +		return ret;
> +
> +	nvkm_device_del(&device);
> +
> +	/* Remove conflicting drivers (vesafb, efifb etc). */
> +	aper = alloc_apertures(3);
> +	if (!aper)
> +		return -ENOMEM;
> +
> +	aper->ranges[0].base = pci_resource_start(pdev, 1);
> +	aper->ranges[0].size = pci_resource_len(pdev, 1);
> +	aper->count = 1;
> +
> +	if (pci_resource_len(pdev, 2)) {
> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 2);
> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 2);
> +		aper->count++;
> +	}
> +
> +	if (pci_resource_len(pdev, 3)) {
> +		aper->ranges[aper->count].base = pci_resource_start(pdev, 3);
> +		aper->ranges[aper->count].size = pci_resource_len(pdev, 3);
> +		aper->count++;
> +	}
> +
> +#ifdef CONFIG_X86
> +	boot = pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW;
> +#endif
> +	if (nouveau_modeset != 2)
> +		drm_fb_helper_remove_conflicting_framebuffers(aper,
> "nouveaufb", boot);
> +	kfree(aper);
> +
> +	ret = nvkm_device_pci_new(pdev, nouveau_config, nouveau_debug,
> +				  true, true, ~0ULL, &device);
> +	if (ret)
> +		return ret;
> +
> +	pci_set_master(pdev);
> +
> +	if (nouveau_atomic)
> +		driver_pci.driver_features |= DRIVER_ATOMIC;
> +
> +	drm_dev = drm_dev_alloc(&driver_pci, &pdev->dev);
> +	if (IS_ERR(drm_dev)) {
> +		ret = PTR_ERR(drm_dev);
> +		goto fail_nvkm;
> +	}
> +
> +	ret = pci_enable_device(pdev);
> +	if (ret)
> +		goto fail_drm;
> +
> +	drm_dev->pdev = pdev;
> +	pci_set_drvdata(pdev, drm_dev);
> +
> +	ret = nouveau_drm_device_init(drm_dev);
> +	if (ret)
> +		goto fail_pci;
> +
> +	ret = drm_dev_register(drm_dev, pent->driver_data);
> +	if (ret)
> +		goto fail_drm_dev_init;
> +
> +	return 0;
> +
> +fail_drm_dev_init:
> +	nouveau_drm_device_fini(drm_dev);
> +fail_pci:
> +	pci_disable_device(pdev);
> +fail_drm:
> +	drm_dev_put(drm_dev);
> +fail_nvkm:
> +	nvkm_device_del(&device);
> +	return ret;
> +}
> +
>  void
>  nouveau_drm_device_remove(struct drm_device *dev)
>  {
> +	struct pci_dev *pdev = dev->pdev;
>  	struct nouveau_drm *drm = nouveau_drm(dev);
>  	struct nvkm_client *client;
>  	struct nvkm_device *device;
>  
> +	drm_dev_unregister(dev);
> +
>  	dev->irq_enabled = false;
>  	client = nvxx_client(&drm->client.base);
>  	device = nvkm_device_find(client->device);
> -	drm_put_dev(dev);
>  
> +	nouveau_drm_device_fini(dev);
> +	pci_disable_device(pdev);
> +	drm_dev_put(dev);
>  	nvkm_device_del(&device);
>  }
>  
> @@ -1020,8 +1051,6 @@ driver_stub = {
>  		DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME | DRIVER_RENDER |
>  		DRIVER_KMS_LEGACY_CONTEXT,
>  
> -	.load = nouveau_drm_load,
> -	.unload = nouveau_drm_unload,
>  	.open = nouveau_drm_open,
>  	.postclose = nouveau_drm_postclose,
>  	.lastclose = nouveau_vga_lastclose,
> ----------------------------------------------------------------------------
> ---
> 
> 
> Git bisection log:
> 
> ----------------------------------------------------------------------------
> ---
> git bisect start
> # good: [84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d] Linux 4.19
> git bisect good 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d
> # bad: [4c92b7b3080d8281941ae81c51cd62bb49bdc3d4] Add linux-next specific
> files for 20181206
> git bisect bad 4c92b7b3080d8281941ae81c51cd62bb49bdc3d4
> # bad: [c38239b4be1ac7e4bcf5bbd971353bae51525b8f] Merge branch 'parisc-4.20-
> 2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect bad c38239b4be1ac7e4bcf5bbd971353bae51525b8f
> # good: [d49f8a52b15bf35db778035340d8a673149f9f93] Merge tag 'scsi-misc' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
> git bisect good d49f8a52b15bf35db778035340d8a673149f9f93
> # good: [ac747c0715f29c2be3848b719a1b7e65b07f7b21] Merge tag 'kbuild-v4.20'
> of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
> git bisect good ac747c0715f29c2be3848b719a1b7e65b07f7b21
> # bad: [46972c03ab667dc298cad0c9db517fb9b1521b5f] Merge tag 'drm-misc-next-
> fixes-2018-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-
> next
> git bisect bad 46972c03ab667dc298cad0c9db517fb9b1521b5f
> # good: [6ac99a328ee16d3f8cc253f1df62623cee3e9ea5] drm/exynos: mixer: Make
> plane alpha configurable
> git bisect good 6ac99a328ee16d3f8cc253f1df62623cee3e9ea5
> # good: [0957dc7097a3f462f6cedb45cf9b9785cc29e5bb] drm/amdgpu: revert "stop
> using gart_start as offset for the GTT domain"
> git bisect good 0957dc7097a3f462f6cedb45cf9b9785cc29e5bb
> # good: [2de0b0a158bf423208c3898522c8fa1c1078df48] Merge tag 'drm/tegra/for-
> 4.20-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
> git bisect good 2de0b0a158bf423208c3898522c8fa1c1078df48
> # good: [04b96b63c5640a305e30611def7a9c5fcd7a72cf] drm/msm/dpu: Remove
> unneeded checks in dpu_crtc.c
> git bisect good 04b96b63c5640a305e30611def7a9c5fcd7a72cf
> # good: [6952e3a1dffcb931cf8625aa01642b9afac2af61] Merge branch 'for-
> upstream/mali-dp' of git://linux-arm.org/linux-ld into drm-next
> git bisect good 6952e3a1dffcb931cf8625aa01642b9afac2af61
> # good: [62e681f7dcab746412dce22d4b75b32c5ea38cdb] Merge tag 'drm-msm-fixes-
> 2018-10-09' of git://people.freedesktop.org/~robclark/linux into drm-next
> git bisect good 62e681f7dcab746412dce22d4b75b32c5ea38cdb
> # bad: [7e6191d4360a2df6cf2a2613dcb79680cb943df8] Merge branch 'linux-4.20'
> of git://github.com/skeggsb/linux into drm-next
> git bisect bad 7e6191d4360a2df6cf2a2613dcb79680cb943df8
> # good: [c4cee69a4497d9c6ad8868d63568b30e50cac9e9] drm/nouveau: Fix
> potential memory leak in nouveau_drm_load()
> git bisect good c4cee69a4497d9c6ad8868d63568b30e50cac9e9
> # bad: [a971558c298755d2c07bc5508c65d689471763c8] drm/nouveau/disp: keep
> track of high-speed state, program into clock
> git bisect bad a971558c298755d2c07bc5508c65d689471763c8
> # bad: [4126b99e744b7a29746e201e2be6644d2edf3c56] drm/nouveau/disp: add a
> way to configure scrambling/tmds for hdmi 2.0
> git bisect bad 4126b99e744b7a29746e201e2be6644d2edf3c56
> # bad: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau: Start using
> new drm_dev initialization helpers
> git bisect bad cfea88a4d86632f28cf80be97079f131645b7869
> # first bad commit: [cfea88a4d86632f28cf80be97079f131645b7869] drm/nouveau:
> Start using new drm_dev initialization helpers
> ----------------------------------------------------------------------------
> ---
-- 
Cheers,
	Lyude Paul



More information about the dri-devel mailing list