[PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

Min, Frank Frank.Min at amd.com
Thu Nov 23 04:07:33 UTC 2017


Hi Leo,
Would you please comments on Christian's questions?

Best Regards,
Frank

-----Original Message-----
From: Min, Frank 
Sent: Wednesday, November 22, 2017 4:04 PM
To: Koenig, Christian <Christian.Koenig at amd.com>; amd-gfx at lists.freedesktop.org; Liu, Leo <Leo.Liu at amd.com>
Subject: RE: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

Hi Christian,
Thanks again for your review.

And for the mask change my understanding is it is to be used for mark different part of fw (1<<24 is for stack and 2<<24 is for the data).
And more detailed background would need Leo give us.

Best Regards,
Frank

-----Original Message-----
From: Koenig, Christian
Sent: Wednesday, November 22, 2017 3:57 PM
To: Min, Frank <Frank.Min at amd.com>; amd-gfx at lists.freedesktop.org; Liu, Leo <Leo.Liu at amd.com>
Subject: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)

Hi Frank,

thanks, the patch looks much better now.

> The masks using is to programming the stack and data part for vce fw. And this part of code is borrowed from the non-sriov sequences.

In this case Leo can you explain this strange masks used for the
VCE_VCPU_CACHE_OFFSET* registers?

>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
> -					    offset & 0x7FFFFFFF);
> +					offset & ~0x0f000000);
...
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1),
> -					    offset & 0x7FFFFFFF);
> +					(offset & ~0x0f000000) | (1 << 24));

Using ~0x0f000000 looks really odd here and what should the "| (1 << 24)" part be about?

Thanks,
Christian.

Am 22.11.2017 um 06:11 schrieb Min, Frank:
> Hi Christian,
> Patch updated according to your suggestions.
> The masks using is to programming the stack and data part for vce fw. And this part of code is borrowed from the non-sriov sequences.
>
> Best Regards,
> Frank
>
> 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack 
> and date offset
>
> Change-Id: Ic1bc49c21d3a90c477d11162f9d6d9e2073fbbd3
> Signed-off-by: Frank Min <Frank.Min at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 38 +++++++++++++++++++++++------------
>   1 file changed, 25 insertions(+), 13 deletions(-)  mode change
> 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> old mode 100644
> new mode 100755
> index 7574554..024a1be
> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
> @@ -243,37 +243,49 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_VM_CTRL), 0);
>   
>   		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> -						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> -						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>   						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> +						(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 
> +0xff);
>   		} else {
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>   						adev->vce.gpu_addr >> 8);
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
> +						(adev->vce.gpu_addr >> 40) & 0xff);
> +		}
> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
>   						adev->vce.gpu_addr >> 8);
> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
> +						(adev->vce.gpu_addr >> 40) & 0xff);
> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
>   						adev->vce.gpu_addr >> 8);
> -		}
> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
> +						(adev->vce.gpu_addr >> 40) & 0xff);
>   
>   		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
>   		size = VCE_V4_0_FW_SIZE;
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
> -					    offset & 0x7FFFFFFF);
> +					offset & ~0x0f000000);
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE0), size);
>   
> -		offset += size;
> +		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? offset
> ++ size : 0;
>   		size = VCE_V4_0_STACK_SIZE;
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1),
> -					    offset & 0x7FFFFFFF);
> +					(offset & ~0x0f000000) | (1 << 24));
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE1), size);
>   
>   		offset += size;
>   		size = VCE_V4_0_DATA_SIZE;
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET2),
> -					    offset & 0x7FFFFFFF);
> +					(offset & ~0x0f000000) | (2 << 24));
>   		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_VCPU_CACHE_SIZE2), size);
>   
>   		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, 
> mmVCE_LMI_CTRL2), ~0x100, 0);
> --
> 1.9.1
>
> -----邮件原件-----
> 发件人: Christian König [mailto:ckoenig.leichtzumerken at gmail.com]
> 发送时间: 2017年11月21日 22:45
> 收件人: Min, Frank <Frank.Min at amd.com>; amd-gfx at lists.freedesktop.org
> 主题: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2)
>
> Am 21.11.2017 um 11:23 schrieb Frank Min:
>> 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack 
>> and date offset
>>
>> Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056
>> Signed-off-by: Frank Min <Frank.Min at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 97 ++++++++++++++++++++++-------------
>>    1 file changed, 62 insertions(+), 35 deletions(-)
>>    mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> old mode 100644
>> new mode 100755
>> index 7574554..dc7b615
>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
>> @@ -243,59 +243,86 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
>>    		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, 
>> mmVCE_LMI_VM_CTRL), 0);
>>    
>>    		if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>> -						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
>> -						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
>> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>>    						adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8);
>> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
>> +						(adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & 
>> +0xff);
>>    		} else {
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR0),
>>    						adev->vce.gpu_addr >> 8);
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
>> +			MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR0),
>> +						(adev->vce.gpu_addr >> 40) & 0xff);
>> +		}
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR1),
>>    						adev->vce.gpu_addr >> 8);
>> -		    MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR1),
>> +						(adev->vce.gpu_addr >> 40) & 0xff);
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_40BIT_BAR2),
>>    						adev->vce.gpu_addr >> 8);
>> -		}
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +						mmVCE_LMI_VCPU_CACHE_64BIT_BAR2),
>> +						(adev->vce.gpu_addr >> 40) & 0xff);
>>    
>>    		offset = AMDGPU_VCE_FIRMWARE_OFFSET;
>>    		size = VCE_V4_0_FW_SIZE;
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET0),
>> -					    offset & 0x7FFFFFFF);
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE0), size);
>> -
>> -		offset += size;
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_OFFSET0),
>> +					offset & ~0x0f000000);
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_SIZE0), size);
>> +
>> +		offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ?
>> +				offset + size : 0;
>>    		size = VCE_V4_0_STACK_SIZE;
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET1),
>> -					    offset & 0x7FFFFFFF);
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE1), size);
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_OFFSET1),
>> +					(offset & ~0x0f000000) | (1 << 24));
> That mask still looks incorrect to me.
>
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_SIZE1), size);
>>    
>>    		offset += size;
>>    		size = VCE_V4_0_DATA_SIZE;
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_OFFSET2),
>> -					    offset & 0x7FFFFFFF);
>> -		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CACHE_SIZE2), size);
>> -
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_CTRL2), ~0x100, 0);
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_SYS_INT_EN),
>> -						   VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK,
>> -						   VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_OFFSET2),
>> +					(offset & ~0x0f000000) | (2 << 24));
> Dito.
>
>> +		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CACHE_SIZE2), size);
>> +
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_LMI_CTRL2), ~0x100, 0);
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_SYS_INT_EN),
>> +					VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK,
>> +					VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
>>    
>>    		/* end of MC_RESUME */
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS),
>> -						   VCE_STATUS__JOB_BUSY_MASK, ~VCE_STATUS__JOB_BUSY_MASK);
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_VCPU_CNTL),
>> -						   ~0x200001, VCE_VCPU_CNTL__CLK_EN_MASK);
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_SOFT_RESET),
>> -						   ~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, 0);
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_STATUS),
>> +					VCE_STATUS__JOB_BUSY_MASK,
>> +					~VCE_STATUS__JOB_BUSY_MASK);
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_VCPU_CNTL),
>> +					~0x200001,
>> +					VCE_VCPU_CNTL__CLK_EN_MASK);
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_SOFT_RESET),
>> +					~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, 0);
> Unrelated coding style change, please concentrate on the functional change for this patch.
>
>>    
>>    		MMSCH_V1_0_INSERT_DIRECT_POLL(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS),
>> -					      VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK,
>> -					      VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK);
>> +					VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK,
>> +					VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK);
> Here the indentation is wrong. Looks like it was correct before the change.
>
> Regards,
> Christian.
>
>>    
>>    		/* clear BUSY flag */
>> -		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS),
>> -						   ~VCE_STATUS__JOB_BUSY_MASK, 0);
>> +		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0,
>> +					mmVCE_STATUS),
>> +					~VCE_STATUS__JOB_BUSY_MASK, 0);
>>    
>>    		/* add end packet */
>>    		memcpy((void *)init_table, &end, sizeof(struct 
>> mmsch_v1_0_cmd_end));
>



More information about the amd-gfx mailing list