RE: 回复: [PATCH 3/3] drm/amdgpu: fix colliding of preemption

Fernlund, Hans Hans.Fernlund at amd.com
Tue Feb 18 14:48:42 UTC 2020


[AMD Official Use Only - Internal Distribution Only]

At CP level, MCBP and SRIOV are not mutually exclusive. They are used together in some environments.

/Hans


-----Original Message-----
From: Liu, Monk <Monk.Liu at amd.com> 
Sent: Tuesday, February 18, 2020 7:53 AM
To: Koenig, Christian <Christian.Koenig at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; amd-gfx at lists.freedesktop.org; Fernlund, Hans <Hans.Fernlund at amd.com>
Subject: 回复: 回复: [PATCH 3/3] drm/amdgpu: fix colliding of preemption

If OS preemption can work with SRIOV world switch I won't provide such patch, I would already clean up those MCBP checking ... 

To your question: I don't know details, but my guess is that CP use the same interface with KMD (e.g. CE/DE meta) to handle two MCBP type: world switch and OS preempt, But some implementation is different inside CP firmware by the given interface, that need @Fernlund, Hans to provide the details if you want an expertise, he is the author of CP preemption 

Thanks 

-----邮件原件-----
发件人: Christian König <ckoenig.leichtzumerken at gmail.com>
发送时间: 2020年2月18日 20:40
收件人: Liu, Monk <Monk.Liu at amd.com>; Zhang, Hawking <Hawking.Zhang at amd.com>; amd-gfx at lists.freedesktop.org
主题: Re: 回复: [PATCH 3/3] drm/amdgpu: fix colliding of preemption

Hawking is right here. We could just check amdgpu_mcbp during device initialization and forcefully clear it under SRIOV.

But why is MCBP and SRIOV mutual exclusive? We are certainly getting the requirement for this sooner or later.

Regards,
Christian.

Am 18.02.20 um 13:29 schrieb Liu, Monk:
> Even not talking about CE/DE meta and SDMA CS, we still cannot share amdgpu_mcbp with SRIOV case,  e.g.:
> In some place we use "if (amdgpu_mcbp || amdgpu_sriov_vf()" to check, 
> and we do the same thing under that condition, But we cannot do that 
> thing by "if (amdgpu_mcbp)" and set "amdgpu_mcbp" to true under SRIOV 
> case. Because that will let OS preemption work, which will trigger 
> mismatch/crush in CPG side during the MCBP. (not sure if CWSR has such
> colliding)
>
> So we always need to disable OS preemption behavior for SRIOV (don't 
> to preempt an IB by touch vmid_preempt register, and no IB resume
> happen)
>
> Thanks
>
>
> -----邮件原件-----
> 发件人: Zhang, Hawking <Hawking.Zhang at amd.com>
> 发送时间: 2020年2月18日 19:48
> 收件人: Zhang, Hawking <Hawking.Zhang at amd.com>; Liu, Monk 
> <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
> 抄送: Liu, Monk <Monk.Liu at amd.com>
> 主题: RE: [PATCH 3/3] drm/amdgpu: fix colliding of preemption
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Ahhh.... Send it too quickly. Of course, we still need to apply vf check for ce/de-meta, but I think in such way, we can dramatically reduce the amdgpu_sirov_vf check in every mcbp code path.
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of 
> Zhang, Hawking
> Sent: Tuesday, February 18, 2020 19:32
> To: Liu, Monk <Monk.Liu at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Liu, Monk <Monk.Liu at amd.com>
> Subject: RE: [PATCH 3/3] drm/amdgpu: fix colliding of preemption
>
> [AMD Official Use Only - Internal Distribution Only]
>
> It's some kind of annoying to check vf in every place that is required for mcbp until amdgpu_mcbp is enabled by default. What's more, when amdgpu_mcbp is enabled by default, there will be many unnecessary vf check that can be removed as most of mcbp function actually can be shared between world switch preemption and os preemption.
>
> I'd prefer to enable amdgpu_mcbp for sriov in amdgpu_device_check_arguments to reduce the vf specific check everywhere.
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of 
> Monk Liu
> Sent: Tuesday, February 18, 2020 10:54
> To: amd-gfx at lists.freedesktop.org
> Cc: Liu, Monk <Monk.Liu at amd.com>
> Subject: [PATCH 3/3] drm/amdgpu: fix colliding of preemption
>
> what:
> some os preemption path is messed up with world switch preemption
>
> fix:
> cleanup those logics so os preemption not mixed with world switch
>
> this patch is a general fix for issues comes from SRIOV MCBP, but there is still UMD side issues not resovlved yet, so this patch cannot fix all world switch bug.
>
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 3 ++-
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   | 8 ++++----
>   2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index a2ee30b..7854c05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -70,7 +70,8 @@ uint64_t amdgpu_sdma_get_csa_mc_addr(struct amdgpu_ring *ring,
>   	uint32_t index = 0;
>   	int r;
>   
> -	if (vmid == 0 || !amdgpu_mcbp)
> +	/* don't enable OS preemption on SDMA under SRIOV */
> +	if (amdgpu_sriov_vf(adev) || vmid == 0 || !amdgpu_mcbp)
>   		return 0;
>   
>   	r = amdgpu_sdma_get_index_from_ring(ring, &index); diff --git 
> a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 5e9fb09..7b61583 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -4413,7 +4413,7 @@ static void gfx_v10_0_ring_emit_ib_gfx(struct 
> amdgpu_ring *ring,
>   
>   	control |= ib->length_dw | (vmid << 24);
>   
> -	if (amdgpu_mcbp && (ib->flags & AMDGPU_IB_FLAG_PREEMPT)) {
> +	if ((amdgpu_sriov_vf(ring->adev) || amdgpu_mcbp) && (ib->flags &
> +AMDGPU_IB_FLAG_PREEMPT)) {
>   		control |= INDIRECT_BUFFER_PRE_ENB(1);
>   
>   		if (flags & AMDGPU_IB_PREEMPTED)
> @@ -4421,7 +4421,7 @@ static void gfx_v10_0_ring_emit_ib_gfx(struct 
> amdgpu_ring *ring,
>   
>   		if (!(ib->flags & AMDGPU_IB_FLAG_CE))
>   			gfx_v10_0_ring_emit_de_meta(ring,
> -				    flags & AMDGPU_IB_PREEMPTED ? true : false);
> +				    (!amdgpu_sriov_vf(ring->adev) && flags & AMDGPU_IB_PREEMPTED) ?
> +true : false);
>   	}
>   
>   	amdgpu_ring_write(ring, header);
> @@ -4569,9 +4569,9 @@ static void gfx_v10_0_ring_emit_cntxcntl(struct amdgpu_ring *ring,  {
>   	uint32_t dw2 = 0;
>   
> -	if (amdgpu_mcbp)
> +	if (amdgpu_mcbp || amdgpu_sriov_vf(ring->adev))
>   		gfx_v10_0_ring_emit_ce_meta(ring,
> -				    flags & AMDGPU_IB_PREEMPTED ? true : false);
> +				    (!amdgpu_sriov_vf(ring->adev) && flags & AMDGPU_IB_PREEMPTED) ?
> +true : false);
>   
>   	dw2 |= 0x80000000; /* set load_enable otherwise this package is just NOPs */
>   	if (flags & AMDGPU_HAVE_CTX_SWITCH) {
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
> s.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CMo
> nk.Liu%40amd.com%7C86ae6b396abe40ba179608d7b46facf3%7C3dd8961fe4884e60
> 8e11a82d994e183d%7C0%7C0%7C637176264049018414&sdata=LYwxBmSyfmH3ep
> uYuOQ1pdp1IL6QlTA30i8By27iDhg%3D&reserved=0
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
> s.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CMo
> nk.Liu%40amd.com%7C86ae6b396abe40ba179608d7b46facf3%7C3dd8961fe4884e60
> 8e11a82d994e183d%7C0%7C0%7C637176264049018414&sdata=LYwxBmSyfmH3ep
> uYuOQ1pdp1IL6QlTA30i8By27iDhg%3D&reserved=0
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
> s.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CMo
> nk.Liu%40amd.com%7C86ae6b396abe40ba179608d7b46facf3%7C3dd8961fe4884e60
> 8e11a82d994e183d%7C0%7C0%7C637176264049018414&sdata=LYwxBmSyfmH3ep
> uYuOQ1pdp1IL6QlTA30i8By27iDhg%3D&reserved=0


More information about the amd-gfx mailing list