[PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
Huang Rui
ray.huang at amd.com
Tue Oct 16 14:42:17 UTC 2018
On Tue, Oct 16, 2018 at 08:34:37PM +0800, Christian König wrote:
> Another gentle ping? Ray, Monk can anybody give me an rb for this patch?
>
> It's the last one in this series and I want to get it done.
>
Feel free to add
Reviewed-by: Huang Rui <ray.huang at amd.com>
> Christian.
>
> Am 12.10.2018 um 16:27 schrieb Koenig, Christian:
> > Great, can I get an rb or acked-by for the patch in this case?
> >
> > Thanks,
> > Christian.
> >
> > Am 10.10.2018 um 09:52 schrieb Liu, Monk:
> >> Thanks Sigil
> >>
> >> Hi Christian
> >>
> >> Looks we can enable/disable ctx-switch for SDMA at will, no dependency or conflict on SRIOV
> >>
> >> /Monk
> >>
> >> -----Original Message-----
> >> From: Ma, Sigil
> >> Sent: Wednesday, October 10, 2018 3:25 PM
> >> To: Liu, Monk <Monk.Liu at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>
> >> Cc: amd-gfx at lists.freedesktop.org
> >> Subject: RE: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
> >>
> >> Hi Monk,
> >>
> >> AUTO_CTXSW_ENABLE is not relevant to worldswitch preemption. it only applies for ring buffer preemption. SDMA will do worldswitch whatever AUTO_CTXSW_ENABLE is 1 or 0.
> >>
> >> -----Original Message-----
> >> From: Liu, Monk
> >> Sent: Wednesday, October 10, 2018 2:54 PM
> >> To: Koenig, Christian <Christian.Koenig at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>; Ma, Sigil <Sigil.Ma at amd.com>
> >> Cc: amd-gfx at lists.freedesktop.org
> >> Subject: RE: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
> >>
> >> Oh, that mean I remember it reversed way, according to code looks we need to enable ctx_switch to support WORLD SWITCH for SDMA engine
> >>
> >> But better let Sigil confirm it ...
> >>
> >> Hi @Ma, Sigil can you confirm it ? what's the relationship between ctx_swich and world swich for SDMA engines ?
> >>
> >> Ctx_switch_enable() will set "SDMA0/1_CNTL's field: AUTO_CTXSW_ENABLE" to 1, can you tell us what's it for and how it go with SRIOV world switch ?
> >>
> >> Thanks
> >>
> >> /Monk
> >>
> >> -----Original Message-----
> >> From: Koenig, Christian
> >> Sent: Tuesday, October 9, 2018 9:03 PM
> >> To: Liu, Monk <Monk.Liu at amd.com>; Huang, Ray <Ray.Huang at amd.com>; Min, Frank <Frank.Min at amd.com>; Ma, Sigil <Sigil.Ma at amd.com>
> >> Cc: amd-gfx at lists.freedesktop.org
> >> Subject: Re: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
> >>
> >> Hi Monk,
> >>
> >> well that doesn't make much sense to me what you say here cause context switching certainly is already enabled under SRIOV:
> >>
> >>> - if (amdgpu_sriov_vf(adev)) { /* bare-metal sequence
> >>> doesn't need below to lines */
> >>> - sdma_v4_0_ctx_switch_enable(adev, true);
> >>> - sdma_v4_0_enable(adev, true);
> >>> - }
> >> The problem is that context switching as well as the gfx ring is enabled for both SDMA0 and SDMA1 without initializing SDMA1.
> >>
> >> That's most likely causing some unwanted consequences.
> >>
> >> Christian.
> >>
> >> Am 09.10.2018 um 13:45 schrieb Liu, Monk:
> >>> Context switch is for preemption across different queues (gfx, rlc0/1,
> >>> page) under bare-metal environment, For SRIOV we didn't need it and we didn't test it yet, so we just disable it to make life easier, besides since each VF share only 6 MS slice there is in fact no benefit to enable it for SRIOV ...
> >>>
> >>> + @Ma, Sigil to confirm
> >>>
> >>> Hi Sigil
> >>>
> >>> Do you think context switch could be enabled for SRIOV VF ?? I worry that the context switch have internal crush with preemption for world switch , thanks !
> >>>
> >>> /Monk
> >>>
> >>> -----Original Message-----
> >>> From: Christian König <ckoenig.leichtzumerken at gmail.com>
> >>> Sent: Tuesday, October 9, 2018 6:57 PM
> >>> To: Huang, Ray <Ray.Huang at amd.com>; Liu, Monk <Monk.Liu at amd.com>; Min,
> >>> Frank <Frank.Min at amd.com>
> >>> Cc: amd-gfx at lists.freedesktop.org
> >>> Subject: Re: [PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV
> >>>
> >>> Am 09.10.2018 um 11:17 schrieb Huang Rui:
> >>>> On Mon, Oct 08, 2018 at 03:35:15PM +0200, Christian König wrote:
> >>>>> [SNIP]
> >>>>> - if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
> >>>>> - r = sdma_v4_0_load_microcode(adev);
> >>>>> + /* start the gfx rings and rlc compute queues */
> >>>>> + for (i = 0; i < adev->sdma.num_instances; i++)
> >>>>> + sdma_v4_0_gfx_resume(adev, i);
> >>>>> +
> >>>>> + if (amdgpu_sriov_vf(adev)) {
> >>>>> + sdma_v4_0_ctx_switch_enable(adev, true);
> >>>>> + sdma_v4_0_enable(adev, true);
> >>>>> + } else {
> >>>>> + r = sdma_v4_0_rlc_resume(adev);
> >>>>> if (r)
> >>>>> return r;
> >>>>> }
> >>>> + Monk, Frank,
> >>>>
> >>>> I probably cannot judge here, under SRIOV, I saw you disable ctx
> >>>> switch before. Do you have any concern if we enabled it here.
> >>> The problem was that those calls where mixed into sdma_v4_0_gfx_resume() for the first SDMA instance.
> >>>
> >>> What was happening is that SDMA0 was initialized and while doing so enabled both SDMA0 and SDMA1. So SDMA1 was starting up before the ring buffer was even set.
> >>>
> >>> That this doesn't crashed was pure coincident and is most likely also the reason why we ran into problems when ring buffers weren't initialized.
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Others, looks good for me. Christian, may we know which kind of jobs
> >>>> will use sdma page queue(ring), you know, we just sdma gfx queue(ring) before?
> >>>>
> >>>> Thanks,
> >>>> Ray
> >>>>
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
More information about the amd-gfx
mailing list