[PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
Christian König
ckoenig.leichtzumerken at gmail.com
Tue Oct 16 12:34:37 UTC 2018
Another gentle ping? Ray, Monk can anybody give me an rb for this patch?
It's the last one in this series and I want to get it done.
Christian.
Am 12.10.2018 um 16:27 schrieb Koenig, Christian:
> Great, can I get an rb or acked-by for the patch in this case?
>
> Thanks,
> Christian.
>
> Am 10.10.2018 um 09:52 schrieb Liu, Monk:
>> Thanks Sigil
>>
>> Hi Christian
>>
>> Looks we can enable/disable ctx-switch for SDMA at will, no dependency or conflict on SRIOV
>>
>> /Monk
>>
>> -----Original Message-----
>> From: Ma, Sigil
>> Sent: Wednesday, October 10, 2018 3:25 PM
>> To: Liu, Monk <Monk.Liu at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>
>> Cc: amd-gfx at lists.freedesktop.org
>> Subject: RE: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
>>
>> Hi Monk,
>>
>> AUTO_CTXSW_ENABLE is not relevant to worldswitch preemption. it only applies for ring buffer preemption. SDMA will do worldswitch whatever AUTO_CTXSW_ENABLE is 1 or 0.
>>
>> -----Original Message-----
>> From: Liu, Monk
>> Sent: Wednesday, October 10, 2018 2:54 PM
>> To: Koenig, Christian <Christian.Koenig at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>; Ma, Sigil <Sigil.Ma at amd.com>
>> Cc: amd-gfx at lists.freedesktop.org
>> Subject: RE: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
>>
>> Oh, that mean I remember it reversed way, according to code looks we need to enable ctx_switch to support WORLD SWITCH for SDMA engine
>>
>> But better let Sigil confirm it ...
>>
>> Hi @Ma, Sigil can you confirm it ? what's the relationship between ctx_swich and world swich for SDMA engines ?
>>
>> Ctx_switch_enable() will set "SDMA0/1_CNTL's field: AUTO_CTXSW_ENABLE" to 1, can you tell us what's it for and how it go with SRIOV world switch ?
>>
>> Thanks
>>
>> /Monk
>>
>> -----Original Message-----
>> From: Koenig, Christian
>> Sent: Tuesday, October 9, 2018 9:03 PM
>> To: Liu, Monk <Monk.Liu at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>; Ma, Sigil <Sigil.Ma at amd.com>
>> Cc: amd-gfx at lists.freedesktop.org
>> Subject: Re: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
>>
>> Hi Monk,
>>
>> well that doesn't make much sense to me what you say here cause context switching certainly is already enabled under SRIOV:
>>
>>> - if (amdgpu_sriov_vf(adev)) { /* bare-metal sequence
>>> doesn't need below to lines */
>>> - sdma_v4_0_ctx_switch_enable(adev, true);
>>> - sdma_v4_0_enable(adev, true);
>>> - }
>> The problem is that context switching as well as the gfx ring is enabled for both SDMA0 and SDMA1 without initializing SDMA1.
>>
>> That's most likely causing some unwanted consequences.
>>
>> Christian.
>>
>> Am 09.10.2018 um 13:45 schrieb Liu, Monk:
>>> Context switch is for preemption across different queues (gfx, rlc0/1,
>>> page) under bare-metal environment, For SRIOV we didn't need it and we didn't test it yet, so we just disable it to make life easier, besides since each VF share only 6 MS slice there is in fact no benefit to enable it for SRIOV ...
>>>
>>> + @Ma, Sigil to confirm
>>>
>>> Hi Sigil
>>>
>>> Do you think context switch could be enabled for SRIOV VF ?? I worry that the context switch have internal crush with preemption for world switch , thanks !
>>>
>>> /Monk
>>>
>>> -----Original Message-----
>>> From: Christian König <ckoenig.leichtzumerken at gmail.com>
>>> Sent: Tuesday, October 9, 2018 6:57 PM
>>> To: Huang, Ray <Ray.Huang at amd.com>; Liu, Monk <Monk.Liu at amd.com>; Min,
>>> Frank <Frank.Min at amd.com>
>>> Cc: amd-gfx at lists.freedesktop.org
>>> Subject: Re: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
>>>
>>> Am 09.10.2018 um 11:17 schrieb Huang Rui:
>>>> On Mon, Oct 08, 2018 at 03:35:15PM +0200, Christian König wrote:
>>>>> [SNIP]
>>>>> - if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>>>>> - r = sdma_v4_0_load_microcode(adev);
>>>>> + /* start the gfx rings and rlc compute queues */
>>>>> + for (i = 0; i < adev->sdma.num_instances; i++)
>>>>> + sdma_v4_0_gfx_resume(adev, i);
>>>>> +
>>>>> + if (amdgpu_sriov_vf(adev)) {
>>>>> + sdma_v4_0_ctx_switch_enable(adev, true);
>>>>> + sdma_v4_0_enable(adev, true);
>>>>> + } else {
>>>>> + r = sdma_v4_0_rlc_resume(adev);
>>>>> if (r)
>>>>> return r;
>>>>> }
>>>> + Monk, Frank,
>>>>
>>>> I probably cannot judge here, under SRIOV, I saw you disable ctx
>>>> switch before. Do you have any concern if we enabled it here.
>>> The problem was that those calls where mixed into sdma_v4_0_gfx_resume() for the first SDMA instance.
>>>
>>> What was happening is that SDMA0 was initialized and while doing so enabled both SDMA0 and SDMA1. So SDMA1 was starting up before the ring buffer was even set.
>>>
>>> That this doesn't crashed was pure coincident and is most likely also the reason why we ran into problems when ring buffers weren't initialized.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> Others, looks good for me. Christian, may we know which kind of jobs
>>>> will use sdma page queue(ring), you know, we just sdma gfx queue(ring) before?
>>>>
>>>> Thanks,
>>>> Ray
>>>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list