[PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

Zhang, Jerry(Junwei) Jerry.Zhang at amd.com
Wed Nov 7 08:38:47 UTC 2018


On 11/7/18 3:55 PM, Koenig, Christian wrote:
> On 07.11.18 at 08:41, Zhang, Jerry(Junwei) wrote:
>> On 11/7/18 3:29 PM, Koenig, Christian wrote:
>>> Hi guys,
>>>
>>> this is necessary for recoverable page fault handling.
>>>
>>> When the normal SDMA queue is blocked because of a page fault the SDMA
>>> firmware will switch to the paging queue so that we are able to handle
>>> the fault.
>> Thanks for your info.
>>
>> IIRC, the page queue has higher priority than the gfx queue (which we
>> were using previously),
>> so the PT update job on the page queue will always be scheduled first in HW.
> I think so, but that is not its primary purpose. The key feature is
> that it still works even when the GFX or RLC queues are blocked because
> of fault handling.

That sounds like good functionality.
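
To check my understanding of the point above, here is a minimal toy model
(not amdgpu code; every type and name in it is hypothetical). It only
illustrates why scheduler priority alone does not help once the ring is
stalled on a fault, while a separate page ring can still make progress:

```c
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical, not amdgpu code. */
struct toy_ring {
    bool blocked_on_fault; /* ring is stalled waiting for a page fault fix */
    int  pending_jobs;     /* jobs already submitted to this ring */
};

/* A stalled ring cannot make progress, whatever priority its jobs had
 * when the software scheduler submitted them. */
static bool toy_ring_can_run(const struct toy_ring *ring)
{
    return ring->pending_jobs > 0 && !ring->blocked_on_fault;
}

/* Can a page table update execute right now?
 * Without a page queue it has to go through the (possibly stalled) gfx
 * ring; with one, it goes through the separate page ring. */
static bool toy_pt_update_can_run(const struct toy_ring *gfx,
                                  const struct toy_ring *page,
                                  bool has_page_queue)
{
    return toy_ring_can_run(has_page_queue ? page : gfx);
}
```

With the gfx ring stalled on a fault, the update can only run in the
has_page_queue case, which is the situation fault handling needs.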

>
>> And (not 100% sure) the page queue is designed for page migration?
> Yes, well it is designed for page table updates, whether during
> migration, fault handling or for any other reason.
>
>> Anyway, we can disable it for SRIOV because of the existing issues there.
> It would be nice to have for normal PD/PT updates under SRIOV as well,
> but as a short term workaround we can probably disable it.

Agreed.
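
For reference, the selection logic that results from the patch can be
sketched as a standalone function. This is only a sketch: the enum and
function names are hypothetical stand-ins, and it assumes the first
branch in the quoted hunk (whose condition is not shown) is the Raven
case, in line with Christian's "all Vega (but not Raven)" remark.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's asic types; only the chips
 * named in this thread are listed. */
enum toy_asic { TOY_VEGA10, TOY_VEGA12, TOY_VEGA20, TOY_RAVEN };

/* Mirrors the branch order of the patched sdma_v4_0_early_init() hunk. */
static bool toy_has_page_queue(enum toy_asic asic, bool is_sriov_vf)
{
    if (asic == TOY_RAVEN)                 /* assumed: the first branch */
        return false;
    if (asic == TOY_VEGA10 && is_sriov_vf) /* new case added by the patch */
        return false;
    /* Unchanged rule: every remaining chip except Vega20 and Vega12. */
    return asic != TOY_VEGA20 && asic != TOY_VEGA12;
}
```

So bare-metal Vega10 keeps the page queue, and only the Vega10 VF case
changes behavior.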

Regards,
Jerry

>
> Regards,
> Christian.
>
>> Regards,
>> Jerry
>>
>>> In general it should work on all Vega (but not Raven) components and we
>>> are going to need it when we enable recoverable page faults.
>>>
>>> The only case I can see where we don't immediately need it is SRIOV,
>>> because the current plan is to not support recoverable page faults
>>> there.
>>>
>>> Christian.
>>>
>>> On 07.11.18 at 08:21, Liu, Monk wrote:
>>>> Hi team
>>>>
>>>> Why do we need this page_queue in amdgpu? Can anyone share some
>>>> background on its introduction to the kmd?
>>>> As I understand it, the gpu-scheduler already has several levels of
>>>> priority for contexts/entities, so the work the page_queue is
>>>> supposed to do (mapping/unmapping/moving) is already well taken
>>>> care of by the "KERNEL" priority entities, and all other
>>>> context/entity SDMA jobs will be handled after the "KERNEL" jobs ...
>>>>
>>>> So there is no real benefit in introducing page_queue (or
>>>> rlc_queue) to amdgpu given the existing priority-aware
>>>> gpu-scheduler ... unless we are going to remove the "KERNEL"
>>>> priority and always do the mapping/unmapping in page_queue ...
>>>>
>>>> /Monk
>>>>
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>>> Zhang, Jerry(Junwei)
>>>> Sent: Wednesday, November 7, 2018 1:26 PM
>>>> To: Huang, Trigger <Trigger.Huang at amd.com>;
>>>> amd-gfx at lists.freedesktop.org; Deucher, Alexander
>>>> <Alexander.Deucher at amd.com>; Koenig, Christian
>>>> <Christian.Koenig at amd.com>; Kuehling, Felix <Felix.Kuehling at amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF
>>>>
>>>> On 11/7/18 1:15 PM, Trigger Huang wrote:
>>>>> Currently, the SDMA page queue is not used under SR-IOV VF, and this
>>>>> queue causes a ring test failure when the amdgpu module is reloaded.
>>>>> So just disable it.
>>>>>
>>>>> Signed-off-by: Trigger Huang <Trigger.Huang at amd.com>
>>>> It looks like we have run into several issues with it on vega.
>>>> kfd also disabled vega10 for development (but I am not sure of the
>>>> detailed issue for them).
>>>>
>>>> Thus, we may disable it for vega10 as well?
>>>> Any comments? Alex, Christian, Felix.
>>>>
>>>> Regards,
>>>> Jerry
>>>>> ---
>>>>>      drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
>>>>>      1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>>> index e39a09eb0f..4edc848 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>>>>> @@ -1451,7 +1451,9 @@ static int sdma_v4_0_early_init(void *handle)
>>>>>              adev->sdma.has_page_queue = false;
>>>>>          } else {
>>>>>              adev->sdma.num_instances = 2;
>>>>> -        if (adev->asic_type != CHIP_VEGA20 &&
>>>>> +        if ((adev->asic_type == CHIP_VEGA10) && amdgpu_sriov_vf((adev)))
>>>>> +            adev->sdma.has_page_queue = false;
>>>>> +        else if (adev->asic_type != CHIP_VEGA20 &&
>>>>>                      adev->asic_type != CHIP_VEGA12)
>>>>>                  adev->sdma.has_page_queue = true;
>>>>>          }
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
