[PATCH 1/8] drm/amdgpu: fix ATC handling for Ryzen

Wed Apr 3 21:41:38 UTC 2019

On 2019-04-03 1:24 p.m., Koenig, Christian wrote:
> Am 01.04.19 um 20:58 schrieb Kuehling, Felix:
>> On 2019-04-01 2:03 p.m., Christian König wrote:
>>> Am 01.04.19 um 19:59 schrieb Kuehling, Felix:
>>>> On 2019-04-01 7:23 a.m., Christian König wrote:
>>>>> Am 30.03.19 um 01:41 schrieb Kuehling, Felix:
>>>>>> Patches 1-3 are Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>> Thanks.
>>>>>
>>>>>> About the direct mode, that removes a bunch of synchronization, so it
>>>>>> must make some assumptions about the state of the page tables. What
>>>>>> makes that safe?
>>>>> Direct mode is only supposed to be used during page fault handling.
>>>>>
>>>>> E.g. we know that the page tables are in the correct place in this
>>>>> situation because the hardware is hammering on a PTE and waiting for
>>>>> it to become valid.
>>>> A fence could also indicate a concurrent modification of the page table.
>>>> For example a PTB may be allocated and initialized concurrently, not in
>>>> direct mode. Would direct mode need to wait for a fence that indicates
>>>> completion of the PTB initialization? Or do we have some way to ensure
>>>> such concurrent allocation and initialization of a PTB cannot happen?
>>> Yeah, that is a very good question I haven't solved yet either.
>>>
>>> My currently best idea is to separate the address space, e.g. use the
>>> lower address space for on demand paging and the higher with classic
>>> pre-filled page tables for the MM and display engines.
>> That may work for graphics, but doesn't work for KFD. I need the ability
>> to mix pre-filled page tables with HMM in the same SVM address space.
> Even after thinking for multiple days about it I can't of hand find a
> way to make this work.
>
>> That's why I was thinking that all page table updates for a given VM
>> would need to use the same method.
> Well what exactly do you mean with that? Essentially there are two methods:
>
> 1. Pre-fill the page tables before accessing them with the hardware.
>
> 2. Fill on demand with page faults.
>
> I don't think we can mix those two methods together in the same address
> range.

That's what I was hoping to do. For example an application could use 
"old" BO-based memory management APIs that pre-fill page tables with 
"new" HMM-based memory management APIs that rely on page faults. Those 
may be different libraries written in different languages running in the 
same application. E.g. a GPU BLAS implementation that's optimized and 
uses old-style memory allocations linked to an OpenMP application that 
relies on HMM.

If that's not possible, I'd need to emulate all the old memory APIs on 
top of HMM. I was hoping to avoid that.

Even when page faults are enabled, we want to be able to pre-fault stuff 
to avoid the performance it on the first access. Are you saying that 
won't be possible?

Regards,
   Felix

>
> E.g. we can say to use pre-fill for MM engines in the upper range and on
> demand filling in the lower range, but we can't mix them.
>
> Regards,
> Christian.
>
>> Regards,
>>      Felix
>>
>>> Christian.
>>>
>>>> Regards,
>>>>       Felix
>>>>
>>>>
>>>>> Christian.
>>>>>
>>>>>>      Is it safe to use direct-mode on a
>>>>>> per-page-table-update basis? Or do all page table updates have to go
>>>>>> through direct mode to avoid hazards? If yes, then maybe this
>>>>>> should be
>>>>>> a property of the VM rather than a parameter that gets passed to a
>>>>>> bunch
>>>>>> of function calls.
>>>>>>
>>>>>> Regards,
>>>>>>        Felix
>>>>>>
>>>>>> On 2019-03-29 6:45 a.m., Christian König wrote:
>>>>>>> Otherwise we don't correctly use translate further.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>> ---
>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++++++++-----
>>>>>>>       1 file changed, 8 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> index 3d221f044183..059d9802e713 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> @@ -767,14 +767,17 @@ static int amdgpu_vm_clear_bo(struct
>>>>>>> amdgpu_device *adev,
>>>>>>>              addr = 0;
>>>>>>>           if (ats_entries) {
>>>>>>> -        uint64_t ats_value;
>>>>>>> +        uint64_t value = 0, flags;
>>>>>>>       -        ats_value = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>> -        if (level != AMDGPU_VM_PTB)
>>>>>>> -            ats_value |= AMDGPU_PDE_PTE;
>>>>>>> +        flags = AMDGPU_PTE_DEFAULT_ATC;
>>>>>>> +        if (level != AMDGPU_VM_PTB) {
>>>>>>> +            /* Handle leaf PDEs as PTEs */
>>>>>>> +            flags |= AMDGPU_PDE_PTE;
>>>>>>> +            amdgpu_gmc_get_vm_pde(adev, level, &value, &flags);
>>>>>>> +        }
>>>>>>>                  r = vm->update_funcs->update(&params, bo, addr, 0,
>>>>>>> ats_entries,
>>>>>>> -                         0, ats_value);
>>>>>>> +                         value, flags);
>>>>>>>               if (r)
>>>>>>>                   return r;
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx