[PATCH] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

Fri Jan 12 06:51:57 UTC 2024

Arun and I have been working on a patch to split up 
AMDGPU_VA_RESERVED_SIZE into two separate defines.

I assumed that Arun landed this by now, but that doesn't seem to be the 
case.

Going to take another look, but yeah the CSA area isn't large enough to 
fill the upper reserved area so moving the TMA/TBA there should work.

Regards,
Christian.

Am 11.01.24 um 23:41 schrieb Felix Kuehling:
> [+Christian]
>
> I'm looking into virtual address reservations in amdgpu and what's 
> reported by AMDGPU_INFO_DEV_INFO. So far I only found 
> AMDGPU_VA_RESERVED_SIZE, which is reserved both at the start of the 
> lower virtual address range and the end of the upper virtual address 
> range.
>
> The reservation size is currently 2MB. The upper reservation is used 
> by the CSA, which is currently 128KB. That leaves plenty of room for 
> the TMA/TBA above the CSA. I'll create a patch for that.
>
> Regards,
>  Felix
>
>
> On 2024-01-10 12:45, Marek Olšák wrote:
>> It looks like this would cause failures even with regular 64-bit
>> allocations because the virtual address range allocator in libdrm asks
>> the kernel what ranges of addresses are free, and the kernel doesn't
>> exclude the KFD allocation from that.
>>
>> Basically, no VM allocations can be done by the kernel outside the
>> ranges reserved for the kernel.
>>
>> Marek
>>
>> On Sat, Jan 6, 2024 at 1:48 AM Marek Olšák <maraeo at gmail.com> wrote:
>>> The 32-bit address space means the high 32 bits are constant and 
>>> predetermined and it's definitely somewhere in the upper range of 
>>> the address space. If ROCm or KFD occupy that space, even 
>>> accidentally, other UMDs that use libdrm for VA allocation won't be 
>>> able to start. The VA range allocator is in libdrm.
>>>
>>> Marek
>>>
>>> On Fri, Jan 5, 2024, 15:20 Felix Kuehling <felix.kuehling at amd.com> 
>>> wrote:
>>>> TBA/TMA were relocated to the upper half of the canonical address 
>>>> space.
>>>> I don't think that qualifies as 32-bit by definition. But maybe you're
>>>> using a different definition.
>>>>
>>>> That said, if Mesa manages its own virtual address space in user mode,
>>>> and KFD maps the TMA/TBA at an address that Mesa believes to be 
>>>> free, I
>>>> can see how that would lead to problems.
>>>>
>>>> That said, the fence refcount bug is another problem that may have 
>>>> been
>>>> exposed by the way that a crashing Mesa application shuts down.
>>>> Reverting Jay's patch certainly didn't fix that, but only hides the 
>>>> problem.
>>>>
>>>> Regards,
>>>>     Felix
>>>>
>>>>
>>>> On 2024-01-04 13:29, Marek Olšák wrote:
>>>>> Hi,
>>>>>
>>>>> I have received information that the original commit makes all 32-bit
>>>>> userspace VA allocations fail, so UMDs like Mesa can't even 
>>>>> initialize
>>>>> and they either crash or fail to load. If TBA/TMA was relocated to 
>>>>> the
>>>>> 32-bit address range, it would explain why UMDs can't allocate
>>>>> anything in that range.
>>>>>
>>>>> Marek
>>>>>
>>>>> On Wed, Jan 3, 2024 at 2:50 PM Jay Cornwall <jay.cornwall at amd.com> 
>>>>> wrote:
>>>>>> On 1/3/2024 12:58, Felix Kuehling wrote:
>>>>>>
>>>>>>> A segfault in Mesa seems to be a different issue from what's 
>>>>>>> mentioned
>>>>>>> in the commit message. I'd let Christian or Marek comment on
>>>>>>> compatibility with graphics UMDs. I'm not sure why this patch would
>>>>>>> affect them at all.
>>>>>> I was referencing this issue in OpenCL/OpenGL interop, which 
>>>>>> certainly looked related:
>>>>>>
>>>>>> [   91.769002] amdgpu 0000:0a:00.0: amdgpu: bo 000000009bba4692 
>>>>>> va 0x0800000000-0x08000001ff conflict with 0x0800000000-0x0800000002
>>>>>> [   91.769141] ocltst[2781]: segfault at b2 ip 00007f3fb90a7c39 
>>>>>> sp 00007ffd3c011ba0 error 4 in 
>>>>>> radeonsi_dri.so[7f3fb888e000+1196000] likely on CPU 15 (core 7, 
>>>>>> socket 0)
>>>>>>
>>>>>>> Looking at the logs in the tickets, it looks like a fence reference
>>>>>>> counting error. I don't see how Jay's patch could have caused 
>>>>>>> that. I
>>>>>>> made another change in that code recently that could make a 
>>>>>>> difference
>>>>>>> for this issue:
>>>>>>>
>>>>>>>       commit 8f08c5b24ced1be7eb49692e4816c1916233c79b
>>>>>>>       Author: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>>>>       Date:   Fri Oct 27 18:21:55 2023 -0400
>>>>>>>
>>>>>>>            drm/amdkfd: Run restore_workers on freezable WQs
>>>>>>>
>>>>>>>            Make restore workers freezable so we don't have to 
>>>>>>> explicitly
>>>>>>>       flush them
>>>>>>>            in suspend and GPU reset code paths, and we don't 
>>>>>>> accidentally
>>>>>>>       try to
>>>>>>>            restore BOs while the GPU is suspended. Not having to 
>>>>>>> flush
>>>>>>>       restore_work
>>>>>>>            also helps avoid lock/fence dependencies in the GPU 
>>>>>>> reset case
>>>>>>>       where we're
>>>>>>>            not allowed to wait for fences.
>>>>>>>
>>>>>>>            A side effect of this is, that we can now have multiple
>>>>>>>       concurrent threads
>>>>>>>            trying to signal the same eviction fence. Rework 
>>>>>>> eviction fence
>>>>>>>       signaling
>>>>>>>            and replacement to account for that.
>>>>>>>
>>>>>>>            The GPU reset path can no longer rely on 
>>>>>>> restore_process_worker
>>>>>>>       to resume
>>>>>>>            queues because evict/restore workers can run 
>>>>>>> independently of
>>>>>>>       it. Instead
>>>>>>>            call a new restore_process_helper directly.
>>>>>>>
>>>>>>>            This is an RFC and request for testing.
>>>>>>>
>>>>>>>            v2:
>>>>>>>            - Reworked eviction fence signaling
>>>>>>>            - Introduced restore_process_helper
>>>>>>>
>>>>>>>            v3:
>>>>>>>            - Handle unsignaled eviction fences in 
>>>>>>> restore_process_bos
>>>>>>>
>>>>>>>            Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>>>>            Acked-by: Christian König <christian.koenig at amd.com>
>>>>>>>            Tested-by: Emily Deng <Emily.Deng at amd.com>
>>>>>>>            Signed-off-by: Alex Deucher <alexander.deucher at amd.com>
>>>>>>>
>>>>>>>
>>>>>>> FWIW, I built a plain 6.6 kernel, and was not able to reproduce the
>>>>>>> crash with some simple tests.
>>>>>>>
>>>>>>> Regards,
>>>>>>>      Felix
>>>>>>>
>>>>>>>
>>>>>>>> So I agree, let's revert it.
>>>>>>>>
>>>>>>>> Reviewed-by: Jay Cornwall <jay.cornwall at amd.com>