[PATCH] drm/amdgpu: fix "Revert "drm/amdgpu: move PD/PT bos on LRU again""

Christian König ckoenig.leichtzumerken at gmail.com
Thu Aug 30 11:43:58 UTC 2018


On 30.08.2018 at 10:49, Michel Dänzer wrote:
> On 2018-08-30 10:08 a.m., Christian König wrote:
>> This reverts commit 1156da3d4034957e7927ea68007b981942f5cbd5.
>>
>> We should review reverts as well, because that one only added an incomplete
>> band-aid to the problem.
> Sorry about that. I didn't notice any issues with the same testing
> procedure that easily reproduced issues without the revert, so I thought
> it should be at least an improvement.
>
>
>> Correctly disable bulk moves until we have figured out why they corrupt
>> the lists.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 72f8c750e128..4a2d31e45c17 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -283,12 +283,15 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device *adev,
>>   	struct ttm_bo_global *glob = adev->mman.bdev.glob;
>>   	struct amdgpu_vm_bo_base *bo_base;
>>   
>> +	/* TODO: Fix list corruption caused by this */
>> +#if 0
>>   	if (vm->bulk_moveable) {
>>   		spin_lock(&glob->lru_lock);
>>   		ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>>   		spin_unlock(&glob->lru_lock);
>>   		return;
>>   	}
>> +#endif
> Code should be removed, not #if 0'd.
>
>
> Anyway, with this patch, the attached warning dumps appear in dmesg
> about 1000 times per second at the GDM login prompt, so I can't even
> attempt to run piglit. Something else is needed, I'm afraid.

AH! And that message shows perfectly what is going wrong here!

Ray tries to move the BOs on the LRU *AFTER* unlocking their reservation 
object.

That also perfectly explains why we get LRU corruption.
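
To illustrate the rule that is being violated, here is a toy userspace model.
It is only a sketch, not the TTM code: every toy_* name is made up for this
example, and the "reserved" flag merely stands in for holding the BO's
reservation object. The point it tries to show is that an LRU move is only
safe while the reservation is held, so a move attempted after unlocking has
to be refused (in the kernel it triggers a warning instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in TTM or amdgpu. */
struct toy_bo {
	struct toy_bo *prev, *next;	/* LRU links */
	bool reserved;			/* stands in for the held reservation object */
};

struct toy_lru {
	struct toy_bo head;		/* sentinel node; head.prev is the LRU tail */
};

static void toy_lru_init(struct toy_lru *lru)
{
	lru->head.prev = lru->head.next = &lru->head;
	lru->head.reserved = false;
}

static void toy_bo_init(struct toy_bo *bo)
{
	bo->prev = bo->next = bo;	/* self-linked means "not on any list" */
	bo->reserved = false;
}

/*
 * Move bo to the tail of the LRU.
 * Returns 0 on success, or -1 without touching the list when the caller
 * does not hold the reservation.
 */
static int toy_bo_move_to_lru_tail(struct toy_lru *lru, struct toy_bo *bo)
{
	assert(lru != NULL && bo != NULL);

	if (!bo->reserved)
		return -1;

	/* Unlink from the current position (harmless for a self-linked bo). */
	bo->prev->next = bo->next;
	bo->next->prev = bo->prev;

	/* Link in just before the sentinel, i.e. at the tail. */
	bo->prev = lru->head.prev;
	bo->next = &lru->head;
	lru->head.prev->next = bo;
	lru->head.prev = bo;
	return 0;
}
```

In this model the correct order is: set reserved, call
toy_bo_move_to_lru_tail(), then clear reserved. Doing the move after
clearing the flag is the pattern described above.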

Going to get that fixed in a minute,
Christian.

>
> In case it's relevant, note that my development machine has a secondary
> Turks card installed.


