<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p style="margin-top:0;margin-bottom:0">Hi Felix,</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Both the China team and the Markham team tested this change. The Embedded team also tested it on release 18.20 for Raven.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Please ask the <font size="2"><span style="font-size:11pt;">ROCm CQE</span></font> team to file a JIRA ticket with detailed reproduction steps.<br>
</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<div id="Signature">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Calibri, Arial, Helvetica, sans-serif, 'EmojiFont', 'Apple Color Emoji', 'Segoe UI Emoji', NotoColorEmoji, 'Segoe UI Symbol', 'Android Emoji', EmojiSymbols;">
<p>Thanks &amp; Best Regards!</p>
<p><br>
</p>
<p>James Zhu<br>
</p>
</div>
</div>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> amd-gfx &lt;amd-gfx-bounces@lists.freedesktop.org&gt; on behalf of Felix Kuehling &lt;felix.kuehling@amd.com&gt;<br>
<b>Sent:</b> Friday, August 17, 2018 7:25:53 PM<br>
<b>To:</b> James Zhu; amd-gfx@lists.freedesktop.org<br>
<b>Cc:</b> Deucher, Alexander; Gao, Likun; Zhu, James; Huang, Ray<br>
<b>Subject:</b> Re: [PATCH v2 5/5] drm/amdgpu:add VCN booting with firmware loaded by PSP</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">ROCm CQE is seeing what looks like hangs during amdgpu initialization on<br>
Raven and Vega20. Amdgpu basically stops printing messages while trying<br>
to load VCN firmware. It never completes initialization, but there is no<br>
obvious error message. These are the last messages from amdgpu in the log:<br>
<br>
[ 1.282661] [drm] Found VCN firmware Version: 1.24 Family ID: 18<br>
[ 1.282664] [drm] PSP loading VCN firmware<br>
[ 1.303164] [drm] reserve 0x400000 from 0xf400e00000 for PSP TMR SIZE<br>
<br>
Any applications trying to use /dev/dri/* hang with a backtrace like below.<br>
<br>
Was this change expected to affect Raven and Vega20? Has it been tested<br>
before submitting? Do we need updated VCN firmware for it to work?<br>
<br>
Thanks,<br>
Felix<br>
<br>
[ 363.352985] INFO: task gpu-manager:937 blocked for more than 120 seconds.<br>
[ 363.352995] Not tainted 4.18.0-rc1-kfd-compute-roc-master-8912 #1<br>
[ 363.352999] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>
[ 363.353004] gpu-manager D 0 937 1 0x00000000<br>
[ 363.353008] Call Trace:<br>
[ 363.353018] ? __schedule+0x3d9/0x8b0<br>
[ 363.353023] schedule+0x32/0x80<br>
[ 363.353026] schedule_preempt_disabled+0xa/0x10<br>
[ 363.353028] __mutex_lock.isra.4+0x2ae/0x4e0<br>
[ 363.353031] ? _cond_resched+0x16/0x40<br>
[ 363.353048] ? drm_stub_open+0x2e/0x100 [drm]<br>
[ 363.353063] drm_stub_open+0x2e/0x100 [drm]<br>
[ 363.353069] chrdev_open+0xbe/0x1a0<br>
[ 363.353072] ? cdev_put+0x20/0x20<br>
[ 363.353075] do_dentry_open+0x1e2/0x300<br>
[ 363.353078] path_openat+0x2b4/0x14b0<br>
[ 363.353082] ? vsnprintf+0x230/0x4c0<br>
[ 363.353086] ? __alloc_pages_nodemask+0x100/0x290<br>
[ 363.353088] do_filp_open+0x99/0x110<br>
[ 363.353092] ? generic_update_time+0x6a/0xc0<br>
[ 363.353094] ? touch_atime+0xc1/0xd0<br>
[ 363.353096] ? _cond_resched+0x16/0x40<br>
[ 363.353100] ? do_sys_open+0x126/0x210<br>
[ 363.353102] do_sys_open+0x126/0x210<br>
[ 363.353106] do_syscall_64+0x4f/0x100<br>
[ 363.353110] entry_SYSCALL_64_after_hwframe+0x44/0xa9<br>
[ 363.353113] RIP: 0033:0x7f988f340040<br>
[ 363.353113] Code: Bad RIP value.<br>
[ 363.353120] RSP: 002b:00007ffecdefe618 EFLAGS: 00000246 ORIG_RAX: 0000000000000002<br>
[ 363.353123] RAX: ffffffffffffffda RBX: 0000000002337cd0 RCX: 00007f988f340040<br>
[ 363.353124] RDX: 00007ffecdefe67e RSI: 0000000000000002 RDI: 00007ffecdefe670<br>
[ 363.353125] RBP: 00007ffecdefe6a0 R08: 0000000000000000 R09: 000000000000000e<br>
[ 363.353126] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000401b40<br>
[ 363.353127] R13: 00007ffecdefe910 R14: 0000000000000000 R15: 0000000000000000<br>
<br>
<br>
<br>
On 2018-08-09 12:31 PM, James Zhu wrote:<br>
> From: Likun Gao &lt;Likun.Gao@amd.com&gt;<br>
><br>
> Setup psp firmware loading for VCN, and make VCN block<br>
> booting from tmr mac address.<br>
><br>
> Signed-off-by: James Zhu &lt;James.Zhu@amd.com&gt;<br>
> Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;<br>
> ---<br>
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 17 +++++++++------<br>
> drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 38 ++++++++++++++++++++++++++-------<br>
> 2 files changed, 40 insertions(+), 15 deletions(-)<br>
><br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> index 878f62c..77c192a 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> @@ -111,9 +111,10 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)<br>
> version_major, version_minor, family_id);<br>
> }<br>
> <br>
> - bo_size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8)<br>
> - + AMDGPU_VCN_STACK_SIZE + AMDGPU_VCN_HEAP_SIZE<br>
> + bo_size = AMDGPU_VCN_STACK_SIZE + AMDGPU_VCN_HEAP_SIZE<br>
> + AMDGPU_VCN_SESSION_SIZE * 40;<br>
> + if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)<br>
> + bo_size += AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8);<br>
> r = amdgpu_bo_create_kernel(adev, bo_size, PAGE_SIZE,<br>
> AMDGPU_GEM_DOMAIN_VRAM, &adev->vcn.vcpu_bo,<br>
> &adev->vcn.gpu_addr, &adev->vcn.cpu_addr);<br>
> @@ -189,11 +190,13 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)<br>
> unsigned offset;<br>
> <br>
> hdr = (const struct common_firmware_header *)adev->vcn.fw->data;<br>
> - offset = le32_to_cpu(hdr->ucode_array_offset_bytes);<br>
> - memcpy_toio(adev->vcn.cpu_addr, adev->vcn.fw->data + offset,<br>
> - le32_to_cpu(hdr->ucode_size_bytes));<br>
> - size -= le32_to_cpu(hdr->ucode_size_bytes);<br>
> - ptr += le32_to_cpu(hdr->ucode_size_bytes);<br>
> + if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {<br>
> + offset = le32_to_cpu(hdr->ucode_array_offset_bytes);<br>
> + memcpy_toio(adev->vcn.cpu_addr, adev->vcn.fw->data + offset,<br>
> + le32_to_cpu(hdr->ucode_size_bytes));<br>
> + size -= le32_to_cpu(hdr->ucode_size_bytes);<br>
> + ptr += le32_to_cpu(hdr->ucode_size_bytes);<br>
> + }<br>
> memset_io(ptr, 0, size);<br>
> }<br>
> <br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> index 2ce91a7..74c4ef4 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> @@ -100,6 +100,16 @@ static int vcn_v1_0_sw_init(void *handle)<br>
> if (r)<br>
> return r;<br>
> <br>
> + if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {<br>
> + const struct common_firmware_header *hdr;<br>
> + hdr = (const struct common_firmware_header *)adev->vcn.fw->data;<br>
> + adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;<br>
> + adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;<br>
> + adev->firmware.fw_size +=<br>
> + ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);<br>
> + DRM_INFO("PSP loading VCN firmware\n");<br>
> + }<br>
> +<br>
> r = amdgpu_vcn_resume(adev);<br>
> if (r)<br>
> return r;<br>
> @@ -265,26 +275,38 @@ static int vcn_v1_0_resume(void *handle)<br>
> static void vcn_v1_0_mc_resume(struct amdgpu_device *adev)<br>
> {<br>
> uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw->size + 4);<br>
> -<br>
> - WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
> + uint32_t offset;<br>
> +<br>
> + if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {<br>
> + WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
> + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_lo));<br>
> + WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
> + (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_hi));<br>
> + WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0, 0);<br>
> + offset = 0;<br>
> + } else {<br>
> + WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
> lower_32_bits(adev->vcn.gpu_addr));<br>
> - WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
> + WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
> upper_32_bits(adev->vcn.gpu_addr));<br>
> - WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0,<br>
> + offset = size;<br>
> + WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0,<br>
> AMDGPU_UVD_FIRMWARE_OFFSET >> 3);<br>
> + }<br>
> +<br>
> WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE0, size);<br>
> <br>
> WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW,<br>
> - lower_32_bits(adev->vcn.gpu_addr + size));<br>
> + lower_32_bits(adev->vcn.gpu_addr + offset));<br>
> WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_HIGH,<br>
> - upper_32_bits(adev->vcn.gpu_addr + size));<br>
> + upper_32_bits(adev->vcn.gpu_addr + offset));<br>
> WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET1, 0);<br>
> WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE1, AMDGPU_VCN_HEAP_SIZE);<br>
> <br>
> WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE2_64BIT_BAR_LOW,<br>
> - lower_32_bits(adev->vcn.gpu_addr + size + AMDGPU_VCN_HEAP_SIZE));<br>
> + lower_32_bits(adev->vcn.gpu_addr + offset + AMDGPU_VCN_HEAP_SIZE));<br>
> WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE2_64BIT_BAR_HIGH,<br>
> - upper_32_bits(adev->vcn.gpu_addr + size + AMDGPU_VCN_HEAP_SIZE));<br>
> + upper_32_bits(adev->vcn.gpu_addr + offset + AMDGPU_VCN_HEAP_SIZE));<br>
> WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET2, 0);<br>
> WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE2,<br>
> AMDGPU_VCN_STACK_SIZE + (AMDGPU_VCN_SESSION_SIZE * 40));<br>
<br>
_______________________________________________<br>
amd-gfx mailing list<br>
amd-gfx@lists.freedesktop.org<br>
<a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
</div>
</span></font></div>
</body>
</html>