<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Calibri,Helvetica,sans-serif;" dir="ltr">
<p style="margin-top:0;margin-bottom:0">Hi Felix,</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Both the China team and the Markham team tested this change. The Embedded team also tested it on release 18.20 for Raven.</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<p style="margin-top:0;margin-bottom:0">Please have the <font size="2"><span style="font-size:11pt;">ROCm CQE</span></font> team file a JIRA ticket with detailed reproduction steps.<br>
</p>
<p style="margin-top:0;margin-bottom:0"><br>
</p>
<div id="Signature">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Calibri, Arial, Helvetica, sans-serif, "EmojiFont", "Apple Color Emoji", "Segoe UI Emoji", NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", EmojiSymbols;">
<p>Thanks & Best Regards!</p>
<p><br>
</p>
<p>James Zhu<br>
</p>
</div>
</div>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <felix.kuehling@amd.com><br>
<b>Sent:</b> Friday, August 17, 2018 7:25:53 PM<br>
<b>To:</b> James Zhu; amd-gfx@lists.freedesktop.org<br>
<b>Cc:</b> Deucher, Alexander; Gao, Likun; Zhu, James; Huang, Ray<br>
<b>Subject:</b> Re: [PATCH v2 5/5] drm/amdgpu:add VCN booting with firmware loaded by PSP</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">ROCm CQE is seeing what looks like hangs during amdgpu initialization on<br>
Raven and Vega20. Amdgpu basically stops printing messages while trying<br>
to load VCN firmware. It never completes initialization, but there is no<br>
obvious error message. These are the last messages from amdgpu in the log:<br>
<br>
[    1.282661] [drm] Found VCN firmware Version: 1.24 Family ID: 18<br>
[    1.282664] [drm] PSP loading VCN firmware<br>
[    1.303164] [drm] reserve 0x400000 from 0xf400e00000 for PSP TMR SIZE<br>
<br>
Any application trying to use /dev/dri/* hangs with a backtrace like the one below.<br>
<br>
Was this change expected to affect Raven and Vega20? Has it been tested<br>
before submitting? Do we need updated VCN firmware for it to work?<br>
<br>
Thanks,<br>
  Felix<br>
<br>
[  363.352985] INFO: task gpu-manager:937 blocked for more than 120 seconds.<br>
[  363.352995]       Not tainted 4.18.0-rc1-kfd-compute-roc-master-8912 #1<br>
[  363.352999] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.<br>
[  363.353004] gpu-manager     D    0   937      1 0x00000000<br>
[  363.353008] Call Trace:<br>
[  363.353018]  ? __schedule+0x3d9/0x8b0<br>
[  363.353023]  schedule+0x32/0x80<br>
[  363.353026]  schedule_preempt_disabled+0xa/0x10<br>
[  363.353028]  __mutex_lock.isra.4+0x2ae/0x4e0<br>
[  363.353031]  ? _cond_resched+0x16/0x40<br>
[  363.353048]  ? drm_stub_open+0x2e/0x100 [drm]<br>
[  363.353063]  drm_stub_open+0x2e/0x100 [drm]<br>
[  363.353069]  chrdev_open+0xbe/0x1a0<br>
[  363.353072]  ? cdev_put+0x20/0x20<br>
[  363.353075]  do_dentry_open+0x1e2/0x300<br>
[  363.353078]  path_openat+0x2b4/0x14b0<br>
[  363.353082]  ? vsnprintf+0x230/0x4c0<br>
[  363.353086]  ? __alloc_pages_nodemask+0x100/0x290<br>
[  363.353088]  do_filp_open+0x99/0x110<br>
[  363.353092]  ? generic_update_time+0x6a/0xc0<br>
[  363.353094]  ? touch_atime+0xc1/0xd0<br>
[  363.353096]  ? _cond_resched+0x16/0x40<br>
[  363.353100]  ? do_sys_open+0x126/0x210<br>
[  363.353102]  do_sys_open+0x126/0x210<br>
[  363.353106]  do_syscall_64+0x4f/0x100<br>
[  363.353110]  entry_SYSCALL_64_after_hwframe+0x44/0xa9<br>
[  363.353113] RIP: 0033:0x7f988f340040<br>
[  363.353113] Code: Bad RIP value.<br>
[  363.353120] RSP: 002b:00007ffecdefe618 EFLAGS: 00000246 ORIG_RAX: 0000000000000002<br>
[  363.353123] RAX: ffffffffffffffda RBX: 0000000002337cd0 RCX: 00007f988f340040<br>
[  363.353124] RDX: 00007ffecdefe67e RSI: 0000000000000002 RDI: 00007ffecdefe670<br>
[  363.353125] RBP: 00007ffecdefe6a0 R08: 0000000000000000 R09: 000000000000000e<br>
[  363.353126] R10: 000000000000069d R11: 0000000000000246 R12: 0000000000401b40<br>
[  363.353127] R13: 00007ffecdefe910 R14: 0000000000000000 R15: 0000000000000000<br>
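<br>
Reports like the one above can be picked out of a saved kernel log mechanically.<br>
Here is a minimal sketch in C, assuming the "INFO: task NAME:PID blocked for more<br>
than N seconds." format shown in this log. parse_hung_task and struct hung_task<br>
are hypothetical names used for illustration, not part of any existing tool.<br>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one hung-task report; not a kernel structure. */
struct hung_task {
    char name[64];
    long pid;
    long seconds;
};

/* Returns 1 and fills *out if line is a hung-task report, otherwise 0. */
static int parse_hung_task(const char *line, struct hung_task *out)
{
    static const char head[] = "INFO: task ";
    static const char mid[] = " blocked for more than ";
    const char *start, *end, *colon;
    size_t len;

    assert(line != NULL && out != NULL);

    start = strstr(line, head);
    if (!start)
        return 0;
    start += strlen(head);

    end = strstr(start, mid);
    if (!end)
        return 0;

    /* The pid follows the last ':' before " blocked"; a task name
     * may itself contain ':' characters. */
    colon = end;
    while (colon > start && *colon != ':')
        colon--;
    if (colon == start || *colon != ':')
        return 0;

    /* Copy the task name, truncating if it does not fit. */
    len = (size_t)(colon - start);
    if (len >= sizeof(out->name))
        len = sizeof(out->name) - 1;
    memcpy(out->name, start, len);
    out->name[len] = '\0';

    out->pid = strtol(colon + 1, NULL, 10);
    out->seconds = strtol(end + strlen(mid), NULL, 10);
    return 1;
}
```

Calling parse_hung_task on each line of the log and printing the matches gives<br>
a short list of blocked tasks to go with the full backtraces.<br>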
<br>
<br>
<br>
On 2018-08-09 12:31 PM, James Zhu wrote:<br>
> From: Likun Gao <Likun.Gao@amd.com><br>
><br>
> Set up PSP firmware loading for VCN, and make the VCN block<br>
> boot from the TMR MC address.<br>
><br>
> Signed-off-by: James Zhu <James.Zhu@amd.com><br>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com><br>
> ---<br>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 17 +++++++++------<br>
>  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c   | 38 ++++++++++++++++++++++++++-------<br>
>  2 files changed, 40 insertions(+), 15 deletions(-)<br>
><br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> index 878f62c..77c192a 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c<br>
> @@ -111,9 +111,10 @@ int amdgpu_vcn_sw_init(struct amdgpu_device *adev)<br>
>                        version_major, version_minor, family_id);<br>
>        }<br>
>  <br>
> -     bo_size = AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8)<br>
> -               +  AMDGPU_VCN_STACK_SIZE + AMDGPU_VCN_HEAP_SIZE<br>
> +     bo_size = AMDGPU_VCN_STACK_SIZE + AMDGPU_VCN_HEAP_SIZE<br>
>                  +  AMDGPU_VCN_SESSION_SIZE * 40;<br>
> +     if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP)<br>
> +             bo_size += AMDGPU_GPU_PAGE_ALIGN(le32_to_cpu(hdr->ucode_size_bytes) + 8);<br>
>        r = amdgpu_bo_create_kernel(adev, bo_size, PAGE_SIZE,<br>
>                                    AMDGPU_GEM_DOMAIN_VRAM, &adev->vcn.vcpu_bo,<br>
>                                    &adev->vcn.gpu_addr, &adev->vcn.cpu_addr);<br>
> @@ -189,11 +190,13 @@ int amdgpu_vcn_resume(struct amdgpu_device *adev)<br>
>                unsigned offset;<br>
>  <br>
>                hdr = (const struct common_firmware_header *)adev->vcn.fw->data;<br>
> -             offset = le32_to_cpu(hdr->ucode_array_offset_bytes);<br>
> -             memcpy_toio(adev->vcn.cpu_addr, adev->vcn.fw->data + offset,<br>
> -                         le32_to_cpu(hdr->ucode_size_bytes));<br>
> -             size -= le32_to_cpu(hdr->ucode_size_bytes);<br>
> -             ptr += le32_to_cpu(hdr->ucode_size_bytes);<br>
> +             if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {<br>
> +                     offset = le32_to_cpu(hdr->ucode_array_offset_bytes);<br>
> +                     memcpy_toio(adev->vcn.cpu_addr, adev->vcn.fw->data + offset,<br>
> +                                 le32_to_cpu(hdr->ucode_size_bytes));<br>
> +                     size -= le32_to_cpu(hdr->ucode_size_bytes);<br>
> +                     ptr += le32_to_cpu(hdr->ucode_size_bytes);<br>
> +             }<br>
>                memset_io(ptr, 0, size);<br>
>        }<br>
>  <br>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> index 2ce91a7..74c4ef4 100644<br>
> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c<br>
> @@ -100,6 +100,16 @@ static int vcn_v1_0_sw_init(void *handle)<br>
>        if (r)<br>
>                return r;<br>
>  <br>
> +     if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {<br>
> +             const struct common_firmware_header *hdr;<br>
> +             hdr = (const struct common_firmware_header *)adev->vcn.fw->data;<br>
> +             adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].ucode_id = AMDGPU_UCODE_ID_VCN;<br>
> +             adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].fw = adev->vcn.fw;<br>
> +             adev->firmware.fw_size +=<br>
> +                     ALIGN(le32_to_cpu(hdr->ucode_size_bytes), PAGE_SIZE);<br>
> +             DRM_INFO("PSP loading VCN firmware\n");<br>
> +     }<br>
> +<br>
>        r = amdgpu_vcn_resume(adev);<br>
>        if (r)<br>
>                return r;<br>
> @@ -265,26 +275,38 @@ static int vcn_v1_0_resume(void *handle)<br>
>  static void vcn_v1_0_mc_resume(struct amdgpu_device *adev)<br>
>  {<br>
>        uint32_t size = AMDGPU_GPU_PAGE_ALIGN(adev->vcn.fw->size + 4);<br>
> -<br>
> -     WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
> +     uint32_t offset;<br>
> +<br>
> +     if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) {<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
> +                     (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_lo));<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
> +                     (adev->firmware.ucode[AMDGPU_UCODE_ID_VCN].tmr_mc_addr_hi));<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0, 0);<br>
> +             offset = 0;<br>
> +     } else {<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_LOW,<br>
>                        lower_32_bits(adev->vcn.gpu_addr));<br>
> -     WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE_64BIT_BAR_HIGH,<br>
>                        upper_32_bits(adev->vcn.gpu_addr));<br>
> -     WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0,<br>
> +             offset = size;<br>
> +             WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET0,<br>
>                                AMDGPU_UVD_FIRMWARE_OFFSET >> 3);<br>
> +     }<br>
> +<br>
>        WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE0, size);<br>
>  <br>
>        WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_LOW,<br>
> -                     lower_32_bits(adev->vcn.gpu_addr + size));<br>
> +                     lower_32_bits(adev->vcn.gpu_addr + offset));<br>
>        WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE1_64BIT_BAR_HIGH,<br>
> -                     upper_32_bits(adev->vcn.gpu_addr + size));<br>
> +                     upper_32_bits(adev->vcn.gpu_addr + offset));<br>
>        WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET1, 0);<br>
>        WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE1, AMDGPU_VCN_HEAP_SIZE);<br>
>  <br>
>        WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE2_64BIT_BAR_LOW,<br>
> -                     lower_32_bits(adev->vcn.gpu_addr + size + AMDGPU_VCN_HEAP_SIZE));<br>
> +                     lower_32_bits(adev->vcn.gpu_addr + offset + AMDGPU_VCN_HEAP_SIZE));<br>
>        WREG32_SOC15(UVD, 0, mmUVD_LMI_VCPU_CACHE2_64BIT_BAR_HIGH,<br>
> -                     upper_32_bits(adev->vcn.gpu_addr + size + AMDGPU_VCN_HEAP_SIZE));<br>
> +                     upper_32_bits(adev->vcn.gpu_addr + offset + AMDGPU_VCN_HEAP_SIZE));<br>
>        WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_OFFSET2, 0);<br>
>        WREG32_SOC15(UVD, 0, mmUVD_VCPU_CACHE_SIZE2,<br>
>                        AMDGPU_VCN_STACK_SIZE + (AMDGPU_VCN_SESSION_SIZE * 40));<br>
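<br>
To make the layout change in the patch above easier to follow, here is a minimal<br>
standalone sketch in C of how the buffer size and the heap/stack window offsets<br>
differ between the two load types. It only mirrors the arithmetic in the diff.<br>
The SKETCH_* constant values are illustrative stand-ins rather than the driver's<br>
definitions, and sketch_vcn_layout, sketch_page_align and struct vcn_layout are<br>
hypothetical names.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the amdgpu headers. */
#define SKETCH_GPU_PAGE_SIZE 4096u
#define SKETCH_STACK_SIZE    (200u * 1024u)
#define SKETCH_HEAP_SIZE     (256u * 1024u)
#define SKETCH_SESSION_SIZE  (50u * 1024u)
#define SKETCH_MAX_SESSIONS  40u

struct vcn_layout {
    uint64_t bo_size;      /* bytes allocated for the VCPU buffer object */
    uint64_t heap_offset;  /* cache window 1, relative to the BO start */
    uint64_t stack_offset; /* cache window 2, relative to the BO start */
};

/* Round n up to the next multiple of the GPU page size. */
static uint64_t sketch_page_align(uint64_t n)
{
    return (n + SKETCH_GPU_PAGE_SIZE - 1) &
           ~(uint64_t)(SKETCH_GPU_PAGE_SIZE - 1);
}

static struct vcn_layout sketch_vcn_layout(uint32_t ucode_size_bytes,
                                           uint32_t fw_file_size,
                                           bool psp_load)
{
    struct vcn_layout l;

    /* The firmware file includes its header, so it cannot be smaller
     * than the ucode payload it carries. */
    assert(fw_file_size >= ucode_size_bytes);

    /* Stack, heap and session space are always part of the BO. */
    l.bo_size = SKETCH_STACK_SIZE + SKETCH_HEAP_SIZE
              + SKETCH_SESSION_SIZE * SKETCH_MAX_SESSIONS;

    if (psp_load) {
        /* PSP places the firmware in the TMR, so the BO starts at the heap. */
        l.heap_offset = 0;
    } else {
        /* Direct load: the firmware image sits at the front of the BO. */
        l.bo_size += sketch_page_align((uint64_t)ucode_size_bytes + 8);
        l.heap_offset = sketch_page_align((uint64_t)fw_file_size + 4);
    }
    l.stack_offset = l.heap_offset + SKETCH_HEAP_SIZE;
    return l;
}
```

With PSP loading, the heap window starts at offset 0 of the buffer object and<br>
the firmware window points at the TMR address instead of the BO.<br>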
<br>
_______________________________________________<br>
amd-gfx mailing list<br>
amd-gfx@lists.freedesktop.org<br>
<a href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a><br>
</div>
</span></font></div>
</body>
</html>