[PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

Paneer Selvam, Arunpravin Arunpravin.PaneerSelvam at amd.com
Sat May 28 07:43:58 UTC 2022


[Public]

Hi,

After spending quite some time investigating this issue, I found that the freeze is not caused by the amdgpu part of the buddy allocator patch: the patch doesn't throw any issues when applied separately on top of the stable base of drm-next. Digging further, I found that the patch below seems to be the cause of the problem:

drm/ttm: rework bulk move handling v5
https://cgit.freedesktop.org/drm/drm/commit/?id=fee2ede155423b0f7a559050a39750b98fe9db69

When this patch is applied on top of the stable (working) version of drm-next, without the buddy allocator patch, we see the issues listed below, which occur randomly on every GravityMark run:

1. general protection fault at ttm_lru_bulk_move_tail()
2. NULL pointer dereference at ttm_lru_bulk_move_tail()
3. NULL pointer dereference at ttm_resource_init()

Regards,
Arun.
-----Original Message-----
From: Alex Deucher <alexdeucher at gmail.com> 
Sent: Monday, May 16, 2022 8:36 PM
To: Mike Lothian <mike at fireburn.co.uk>
Cc: Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam at amd.com>; Intel Graphics Development <intel-gfx at lists.freedesktop.org>; amd-gfx list <amd-gfx at lists.freedesktop.org>; Maling list - DRI developers <dri-devel at lists.freedesktop.org>; Deucher, Alexander <Alexander.Deucher at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Matthew Auld <matthew.auld at intel.com>
Subject: Re: [PATCH v12] drm/amdgpu: add drm buddy support to amdgpu

On Mon, May 16, 2022 at 8:40 AM Mike Lothian <mike at fireburn.co.uk> wrote:
>
> Hi
>
> The merge window for 5.19 will probably be opening next week, has 
> there been any progress with this bug?

It took a while to find a combination of GPUs that would repro the issue, but now that we can, it is still being investigated.

Alex

>
> Thanks
>
> Mike
>
> On Mon, 2 May 2022 at 17:31, Mike Lothian <mike at fireburn.co.uk> wrote:
> >
> > On Mon, 2 May 2022 at 16:54, Arunpravin Paneer Selvam 
> > <arunpravin.paneerselvam at amd.com> wrote:
> > >
> > >
> > >
> > > On 5/2/2022 8:41 PM, Mike Lothian wrote:
> > > > On Wed, 27 Apr 2022 at 12:55, Mike Lothian <mike at fireburn.co.uk> wrote:
> > > >> On Tue, 26 Apr 2022 at 17:36, Christian König <christian.koenig at amd.com> wrote:
> > > >>> Hi Mike,
> > > >>>
> > > >>> sounds like somehow stitching together the SG table for PRIME 
> > > >>> doesn't work any more with this patch.
> > > >>>
> > > >>> Can you try with P2P DMA disabled?
> > > >> -CONFIG_PCI_P2PDMA=y
> > > >> +# CONFIG_PCI_P2PDMA is not set
> > > >>
> > > >> If that's what you're meaning, then there's no difference, I'll 
> > > >> upload my dmesg to the gitlab issue
> > > >>
> > > >>> Apart from that can you take a look Arun?
> > > >>>
> > > >>> Thanks,
> > > >>> Christian.
> > > > Hi
> > > >
> > > > Have you had any success in replicating this?
> > > Hi Mike,
> > > I couldn't replicate on my Raven APU machine. I see you have 2 
> > > cards initialized, one is Renoir and the other is Navy Flounder. 
> > > Could you give some more details, are you running Gravity Mark on 
> > > Renoir and what is your system RAM configuration?
> > > >
> > > > Cheers
> > > >
> > > > Mike
> > >
> > Hi
> >
> > It's a PRIME laptop, it failed on the RENOIR too, it caused a 
> > lockup, but systemd managed to capture it, I'll attach it to the 
> > issue
> >
> > I've got 64GB RAM, the 6800M has 12GB VRAM
> >
> > Cheers
> >
> > Mike

