[PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

Koenig, Christian Christian.Koenig at amd.com
Wed Nov 7 07:55:23 UTC 2018


On 07.11.18 at 08:41, Zhang, Jerry(Junwei) wrote:
> On 11/7/18 3:29 PM, Koenig, Christian wrote:
>> Hi guys,
>>
>> this is necessary for recoverable page fault handling.
>>
>> When the normal SDMA queue is blocked because of a page fault the SDMA
>> firmware will switch to the paging queue so that we are able to handle
>> the fault.
> Thanks for your info.
>
> IIRC, the page queue has higher priority than the gfx queue (which we were
> using previously),
> so the PT update job on the page queue will always be scheduled first in HW.

I think so, but that is not its primary purpose. The key feature is 
that it still works even when the GFX or RLC queues are blocked because 
of fault handling.
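
A toy model may help show why scheduler priority alone does not cover this case. It is only a sketch: `struct toy_ring` and `pick_ring_for_pt_update()` are hypothetical names, and the selection rule is an illustration, not the SDMA firmware's or the driver's actual logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a hardware ring: it either makes progress or is stalled. */
struct toy_ring {
	const char *name;
	bool blocked;	/* e.g. stalled on an unresolved page fault */
};

/*
 * Hypothetical selection rule for a page table update: use the page
 * queue if there is one and it can run, otherwise fall back to the
 * gfx ring. Returns NULL when no ring can make progress.
 */
static const struct toy_ring *
pick_ring_for_pt_update(const struct toy_ring *gfx,
			const struct toy_ring *page)
{
	if (page && !page->blocked)
		return page;
	if (gfx && !gfx->blocked)
		return gfx;
	return NULL;
}
```

With only a blocked gfx ring the function returns NULL: the update that would resolve the fault has nowhere to run, no matter how high its software priority is. With a separate, unblocked page queue it can still be submitted.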

> And (not 100% sure) the page queue is designed for page migration?

Yes, well, it is designed for page table updates, whether during 
migration, fault handling, or for whatever other reason.

> Anyway, we can disable it for SRIOV because of the existing issues there.

It would be nice to have for normal PD/PT updates under SRIOV as well, 
but as a short term workaround we can probably disable it.

Regards,
Christian.

>
> Regards,
> Jerry
>
>>
>> In general it should work on all Vega (but not Raven) components and we
>> are going to need it when we enable recoverable page faults.
>>
>> The only case I can see where we don't immediately need it is SRIOV,
>> because the current planning is to not support recoverable page faults
>> there.
>>
>> Christian.
>>
>> On 07.11.18 at 08:21, Liu, Monk wrote:
>>> Hi team
>>>
>>> Why do we need this page_queue in amdgpu? Can anyone share some 
>>> background on why it was introduced to the kmd?
>>> According to my understanding, the gpu-scheduler already has a couple 
>>> of priority levels for contexts/entities, so the jobs the page_queue 
>>> is supposed to do (mapping/unmapping/moving) are already well 
>>> taken care of by "KERNEL" priority entities, and all other 
>>> context/entity SDMA jobs will be handled after the "KERNEL" jobs ...
>>>
>>> So there is no real benefit in introducing page_queue (or 
>>> rlc_queue) to amdgpu given the existing priority-aware 
>>> gpu-scheduler ... unless we are going to remove the "KERNEL" 
>>> priority and always do the mapping/unmapping in page_queue ...
>>>
>>> /Monk
>>>
>>> -----Original Message-----
>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of 
>>> Zhang, Jerry(Junwei)
>>> Sent: Wednesday, November 7, 2018 1:26 PM
>>> To: Huang, Trigger <Trigger.Huang at amd.com>; 
>>> amd-gfx at lists.freedesktop.org; Deucher, Alexander 
>>> <Alexander.Deucher at amd.com>; Koenig, Christian 
>>> <Christian.Koenig at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF
>>>
>>> On 11/7/18 1:15 PM, Trigger Huang wrote:
>>>> Currently, the SDMA page queue is not used under SR-IOV VF, and this
>>>> queue causes a ring test failure when the amdgpu module is reloaded.
>>>> So just disable it.
>>>>
>>>> Signed-off-by: Trigger Huang <Trigger.Huang at amd.com>
>>> Looks like we ran into several issues with it on vega.
>>> kfd also disabled it for vega10 for development (but I am not sure
>>> about the details of their issue).
>>>
>>> Thus, we may disable it for vega10 as well?
>>> Any comments? Alex, Christian, Felix.
>>>
>>> Regards,
>>> Jerry
>>>> ---
>>>>     drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
>>>>     1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>> index e39a09eb0f..4edc848 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>> @@ -1451,7 +1451,9 @@ static int sdma_v4_0_early_init(void *handle)
>>>>             adev->sdma.has_page_queue = false;
>>>>         } else {
>>>>             adev->sdma.num_instances = 2;
>>>> -        if (adev->asic_type != CHIP_VEGA20 &&
>>>> +        if ((adev->asic_type == CHIP_VEGA10) && amdgpu_sriov_vf((adev)))
>>>> +            adev->sdma.has_page_queue = false;
>>>> +        else if (adev->asic_type != CHIP_VEGA20 &&
>>>>                     adev->asic_type != CHIP_VEGA12)
>>>>                 adev->sdma.has_page_queue = true;
>>>>         }
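
To make the effect of the hunk easier to check, the resulting has_page_queue decision can be written as a small standalone function. This is only a sketch: `enum asic_type`, `sdma_has_page_queue()` and the `is_sriov_vf` parameter are hypothetical stand-ins for the kernel's types and for amdgpu_sriov_vf(adev). The Raven branch is inferred from the context lines and from the discussion in this thread, and returning false for Vega12/Vega20 assumes the field starts out zeroed.

```c
#include <stdbool.h>

/* Stand-ins for the ASIC types mentioned in this thread only. */
enum asic_type { CHIP_VEGA10, CHIP_VEGA12, CHIP_VEGA20, CHIP_RAVEN };

/*
 * Hypothetical helper: would the SDMA page queue be enabled once the
 * patch is applied? is_sriov_vf stands in for amdgpu_sriov_vf(adev).
 */
static bool sdma_has_page_queue(enum asic_type type, bool is_sriov_vf)
{
	if (type == CHIP_RAVEN)		/* assumed: first branch, above the hunk */
		return false;
	if (type == CHIP_VEGA10 && is_sriov_vf)
		return false;		/* the case added by this patch */
	if (type != CHIP_VEGA20 && type != CHIP_VEGA12)
		return true;		/* unchanged behaviour */
	return false;			/* assumed: field keeps its zero default */
}
```

Under these assumptions, sdma_has_page_queue(CHIP_VEGA10, true) yields false, while Vega10 without SR-IOV still yields true.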