[Nouveau] nouveau PUSHBUFFER_ERR on 5.9.0-rc2-next-20200824

Alexander Kapshuk alexander.kapshuk at gmail.com
Mon Aug 24 19:08:25 UTC 2020


Since upgrading to linux-next based on 5.9.0-rc1 and 5.9.0-rc2 I have
had my mouse pointer disappear soon after logging in, and I have
observed the system freezing temporarily when clicking on objects and
when typing text.
I have also found records of push buffer errors in dmesg output:
[ 6625.450394] nouveau 0000:01:00.0: disp: ERROR 1 [PUSHBUFFER_ERR] 02
[] chid 0 mthd 0000 data 00000400

I tried setting CONFIG_NOUVEAU_DEBUG=5 (tracing) to try and collect
further debug info, but nothing caught the eye.

The error message in question comes from nv50_disp_intr_error in
drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c:613,645.
And nv50_disp_intr_error is called from nv50_disp_intr in the
following while block:
drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c:647,658
void
nv50_disp_intr(struct nv50_disp *disp)
{
        struct nvkm_device *device = disp->base.engine.subdev.device;
        u32 intr0 = nvkm_rd32(device, 0x610020);
        u32 intr1 = nvkm_rd32(device, 0x610024);

        while (intr0 & 0x001f0000) {
                u32 chid = __ffs(intr0 & 0x001f0000) - 16;
                nv50_disp_intr_error(disp, chid);
                intr0 &= ~(0x00010000 << chid);
        }
...
}

Could this be in any way related to this series of commits?
commit 0a96099691c8cd1ac0744ef30b6846869dc2b566
Author: Ben Skeggs <bskeggs at redhat.com>
Date:   Tue Jul 21 11:34:07 2020 +1000

    drm/nouveau/kms/nv50-: implement proper push buffer control logic

    We had a, what was supposed to be temporary, hack in the KMS code where we'd
    completely drain an EVO/NVD channel's push buffer when wrapping to the start
    again, instead of treating it as a ring buffer.

    Let's fix that, finally.

    Signed-off-by: Ben Skeggs <bskeggs at redhat.com>

Here are my GPU details:
01:00.0 VGA compatible controller: NVIDIA Corporation GT216 [GeForce
210] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 8a93
        Kernel driver in use: nouveau

The last linux-next kernel I built where the problem reported does not
manifest itself is 5.8.0-rc6-next-20200720.

I would appreciate being given any pointers on how to further debug this.
Or is git bisect the only way to proceed with this?

Thanks.


More information about the Nouveau mailing list