two KASANs in TTM logic

Christian König ckoenig.leichtzumerken at gmail.com
Fri Sep 7 08:59:11 UTC 2018


Hi Ray,

In the meantime, can we disable the feature once more in the kernel until
we have hammered out all possible corner cases?

As Tom figured out, commenting out the line that sets "bulk_moveable" to
true should be enough.
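
A minimal sketch of why that is sufficient, written as a toy user-space
model and not the actual amdgpu code. It assumes the flag only gates a
cached fast path; the names toy_vm, toy_move_to_lru_tail, toy_run and
TOY_ENABLE_BULK_MOVE are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-VM state; NOT the real amdgpu struct. */
struct toy_vm {
    bool bulk_moveable;  /* true: the cached bulk-move state may be reused */
    int bulk_moves;      /* how often the bulk (fast) path ran */
    int single_moves;    /* how often the per-BO (slow) path ran */
};

/* 0 models "commenting out" the assignment; 1 models the feature enabled. */
#ifndef TOY_ENABLE_BULK_MOVE
#define TOY_ENABLE_BULK_MOVE 0
#endif

/* Assumed shape of the logic: reuse the bulk state only if the flag is set. */
static void toy_move_to_lru_tail(struct toy_vm *vm)
{
    if (vm->bulk_moveable) {
        vm->bulk_moves++;        /* fast path: move everything in one go */
        return;
    }

    vm->single_moves++;          /* slow path: walk and move each BO */

#if TOY_ENABLE_BULK_MOVE
    vm->bulk_moveable = true;    /* the assignment that would be disabled */
#endif
}

/* Call the helper `calls` times on a fresh VM and return the counters. */
static struct toy_vm toy_run(int calls)
{
    struct toy_vm vm = { false, 0, 0 };

    assert(calls >= 0);
    for (int i = 0; i < calls; i++)
        toy_move_to_lru_tail(&vm);
    return vm;
}
```

With the assignment compiled out, the flag is never set, so every call
falls through to the per-BO path and the bulk path is never taken.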

Thanks,
Christian.

Am 07.09.2018 um 08:51 schrieb Huang, Ray:
> Hi Tom,
>
> Thanks for tracing this issue.  I am trying to reproduce it on amd-staging-drm-next with piglit.
> Could you tell me the steps and configuration needed to reproduce it?
>
> Thanks,
> Ray
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Tom St Denis
> Sent: Wednesday, September 5, 2018 9:27 PM
> To: Koenig, Christian <Christian.Koenig at amd.com>; Daenzer, Michel <Michel.Daenzer at amd.com>; amd-gfx at lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher at amd.com>
> Subject: Re: two KASANs in TTM logic
>
> Logs attached.
>
> Tom
>
>
>
> On 09/05/2018 08:02 AM, Christian König wrote:
>> Still not the slightest idea what is causing this and the patch
>> definitely fixes things a lot.
>>
>> Can you try to enable list debugging in your kernel?
>>
>> Thanks,
>> Christian.
>>
>> Am 04.09.2018 um 19:18 schrieb Tom St Denis:
>>> Sure:
>>>
>>> d2917f399e0b250f47d07da551a335843a24f835 is the first bad commit
>>> commit d2917f399e0b250f47d07da551a335843a24f835
>>> Author: Christian König <christian.koenig at amd.com>
>>> Date:   Thu Aug 30 10:04:53 2018 +0200
>>>
>>>      drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2
>>>
>>>      First step to fix the LRU corruption, we accidentially tried to move
>>>      things on the LRU after dropping the lock.
>>>
>>>      Signed-off-by: Christian König <christian.koenig at amd.com>
>>>      Tested-by: Michel Dänzer <michel.daenzer at amd.com>
>>>
>>> :040000 040000 ed5be1ad4da129c4154b2b43acf7ef349a470700 0008c4e2fb56512f41559618dd474c916fc09a37 M      drivers
>>>
>>>
>>> With the commit before that, I can run xonotic-glx and piglit on my
>>> Carrizo without a KASAN report.
>>>
>>> Tom
>>>
>>> On 09/04/2018 10:05 AM, Christian König wrote:
>>>> The first one should already be fixed.
>>>>
>>>> Not sure where the second comes from. Can you narrow that down further?
>>>>
>>>> Christian.
>>>>
>>>> Am 04.09.2018 um 15:46 schrieb Tom St Denis:
>>>>> First is caused by this commit while running a GL heavy application.
>>>>>
>>>>> d78c1fa0c9f815fe951fd57001acca3d35262a17 is the first bad commit
>>>>> commit d78c1fa0c9f815fe951fd57001acca3d35262a17
>>>>> Author: Michel Dänzer <michel.daenzer at amd.com>
>>>>> Date:   Wed Aug 29 11:59:38 2018 +0200
>>>>>
>>>>>      Revert "drm/amdgpu: move PD/PT bos on LRU again"
>>>>>
>>>>>      This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b.
>>>>>
>>>>>      It triggered various badness on my development machine when running
>>>>>      the piglit gpu profile with radeonsi on Bonaire, looks like memory
>>>>>      corruption due to insufficiently protected list manipulations.
>>>>>
>>>>>      Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>>      Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>
>>>>> :040000 040000 b7169f0cf0c7decec631751a9896a92badb67f9d 42ea58f43199d26fc0c7ddcc655e6d0964b81817 M      drivers
>>>>>
>>>>> The second is caused by something between that and the tip of the
>>>>> 4.19-rc1 amd-staging-drm-next (I haven't pinned it down yet) while
>>>>> loading GNOME.
>>>>>
>>>>> Tom
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx