[PATCH] drm/amdkfd: Fix unaligned 64-bit doorbell warning
Felix Kuehling
felix.kuehling at amd.com
Wed Aug 30 15:40:26 UTC 2023
+Shashank, FYI. I believe this is a regression from your patch
"drm/amdgpu: use doorbell mgr for kfd kernel doorbells".
On 2023-08-29 12:16, Mukul Joshi wrote:
> This patch fixes the following unaligned 64-bit doorbell
> warning seen when submitting packets on HIQ on GFX v9.4.3
> by making the HIQ doorbell 64-bit aligned.
> The warning is seen when GPU is loaded in any mode other
> than SPX mode.
>
> [ +0.000301] ------------[ cut here ]------------
> [ +0.000003] Unaligned 64-bit doorbell
> [ +0.000030] WARNING: /amdkfd/kfd_doorbell.c:339 write_kernel_doorbell64+0x72/0x80 [amdgpu]
> [ +0.000003] RIP: 0010:write_kernel_doorbell64+0x72/0x80 [amdgpu]
> [ +0.000004] RSP: 0018:ffffc90004287730 EFLAGS: 00010246
> [ +0.000005] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ +0.000003] RDX: 0000000000000001 RSI: ffffffff82837c71 RDI: 00000000ffffffff
> [ +0.000003] RBP: ffffc90004287748 R08: 0000000000000003 R09: 0000000000000001
> [ +0.000002] R10: 000000000000001a R11: ffff88a034008198 R12: ffffc900013bd004
> [ +0.000003] R13: 0000000000000008 R14: ffffc900042877b0 R15: 000000000000007f
> [ +0.000003] FS: 00007fa8c7b62000(0000) GS:ffff889f88400000(0000) knlGS:0000000000000000
> [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ +0.000003] CR2: 000056111c45aaf0 CR3: 00000001414f2002 CR4: 0000000000770ee0
> [ +0.000003] PKRU: 55555554
> [ +0.000002] Call Trace:
> [ +0.000004] <TASK>
> [ +0.000006] kq_submit_packet+0x45/0x50 [amdgpu]
> [ +0.000524] pm_send_set_resources+0x7f/0xc0 [amdgpu]
> [ +0.000500] set_sched_resources+0xe4/0x160 [amdgpu]
> [ +0.000503] start_cpsch+0x1c5/0x2a0 [amdgpu]
> [ +0.000497] kgd2kfd_device_init.cold+0x816/0xb42 [amdgpu]
> [ +0.000743] amdgpu_amdkfd_device_init+0x15f/0x1f0 [amdgpu]
> [ +0.000602] amdgpu_device_init.cold+0x1813/0x2176 [amdgpu]
> [ +0.000684] ? pci_bus_read_config_word+0x4a/0x80
> [ +0.000012] ? do_pci_enable_device+0xdc/0x110
> [ +0.000008] amdgpu_driver_load_kms+0x1a/0x110 [amdgpu]
> [ +0.000545] amdgpu_pci_probe+0x197/0x400 [amdgpu]
>
> Signed-off-by: Mukul Joshi <mukul.joshi at amd.com>
This should have a Fixes tag:
Fixes: cfeaeb3c0ce7 ("drm/amdgpu: use doorbell mgr for kfd kernel doorbells")
The original code before that patch used "* sizeof(u32) /
kfd->device_info.doorbell_size" instead of "* 2". It may be safer to
restore the original calculation so that the doorbell size is correct on
both old and new GPUs.
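To illustrate the point, here is a minimal user-space sketch, not the driver code. It assumes the kernel doorbell pointer is a u32 pointer and that the per-doorbell stride in dwords is doorbell_size / sizeof(u32), which is the ratio that reduces to the patch's "* 2" when doorbell_size is 8 and to "* 1" when it is 4. All helper names (doorbell_stride_dw, doorbell_ptr, doorbell_index, is_aligned64, check_doorbell_layout) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: doorbell stride in 32-bit words (assumed ratio). */
static size_t doorbell_stride_dw(size_t doorbell_size)
{
	return doorbell_size / sizeof(uint32_t);
}

/* Map a bitmap index to a doorbell pointer; base is a u32 pointer. */
static uint32_t *doorbell_ptr(uint32_t *base, unsigned int inx,
			      size_t doorbell_size)
{
	return base + inx * doorbell_stride_dw(doorbell_size);
}

/* Inverse mapping, as needed when releasing a doorbell. */
static unsigned int doorbell_index(uint32_t *base, uint32_t *db,
				   size_t doorbell_size)
{
	return (unsigned int)((size_t)(db - base) /
			      doorbell_stride_dw(doorbell_size));
}

/* True if db sits at an 8-byte multiple from the start of the page. */
static int is_aligned64(const uint32_t *base, const uint32_t *db)
{
	size_t byte_off = (size_t)((const char *)db - (const char *)base);

	return byte_off % sizeof(uint64_t) == 0;
}

/* Check the get/release round trip for one doorbell size. */
static void check_doorbell_layout(size_t doorbell_size)
{
	uint32_t page[64] = {0};	/* stand-in for the doorbell page */
	unsigned int inx;

	for (inx = 0; inx < 8; inx++) {
		uint32_t *db = doorbell_ptr(page, inx, doorbell_size);

		assert(doorbell_index(page, db, doorbell_size) == inx);
		if (doorbell_size == sizeof(uint64_t))
			assert(is_aligned64(page, db));
	}
}
```

With a stride derived from the doorbell size, every index lands on an 8-byte boundary for 64-bit doorbells, while 32-bit doorbells keep the dense layout that "+ inx" gave them. A hard-coded "* 2" is only equivalent in the 64-bit case.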
Regards,
Felix
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index c2e0b79dcc6d..b1c2772c3a8d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -168,7 +168,7 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
> " doorbell index == 0x%x\n",
> *doorbell_off, inx);
>
> - return kfd->doorbell_kernel_ptr + inx;
> + return kfd->doorbell_kernel_ptr + inx * 2;
> }
>
> void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
> @@ -176,6 +176,7 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
> unsigned int inx;
>
> inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
> + inx /= 2;
>
> mutex_lock(&kfd->doorbell_mutex);
> __clear_bit(inx, kfd->doorbell_bitmap);