[PATCH] drm/amdkfd: queue kfd interrupt work to different CPU

Eric Huang jinhuieric.huang at amd.com
Thu Dec 12 15:07:35 UTC 2019


This fixes a CPU stuck issue in some extreme test cases.

Reviewed-by: Eric Huang <JinhuiEric.Huang at amd.com>

On 2019-12-12 9:51 a.m., Philip Yang wrote:
> Because queue_work() schedules the work on the same CPU that the interrupt
> handler is running on, if many interrupts are pending the work queue takes
> longer to start or, even worse, the system hangs.
>
> Signed-off-by: Philip Yang <Philip.Yang at amd.com>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 209bfc849352..ee2a9bb1cb07 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -844,7 +844,8 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
>   				   patched_ihre, &is_patched)
>   	    && enqueue_ih_ring_entry(kfd,
>   				     is_patched ? patched_ihre : ih_ring_entry))
> -		queue_work(kfd->ih_wq, &kfd->interrupt_work);
> +		queue_work_on((smp_processor_id() + 1) % num_online_cpus(),
> +			       kfd->ih_wq, &kfd->interrupt_work);
>   
>   	spin_unlock_irqrestore(&kfd->interrupt_lock, flags);
>   }
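
For readers following the arithmetic in the queue_work_on() call above, here
is a minimal user-space sketch of the CPU selection. It is not kernel code:
pick_next_cpu() is a hypothetical helper introduced only for illustration,
and its two parameters stand in for smp_processor_id() and num_online_cpus().
It assumes the online CPU IDs are contiguous, from 0 to nr_online - 1.

```c
/*
 * User-space model of the target-CPU expression used in the patch.
 * pick_next_cpu() is a hypothetical name, not a kernel API.
 *
 * this_cpu  : stands in for smp_processor_id()
 * nr_online : stands in for num_online_cpus(); must be > 0
 *
 * Assumption: online CPU IDs are contiguous (0 .. nr_online - 1).
 */
unsigned int pick_next_cpu(unsigned int this_cpu, unsigned int nr_online)
{
	/* Move to the neighbouring CPU, wrapping back to CPU 0 at the end. */
	return (this_cpu + 1) % nr_online;
}
```

With four CPUs online, an interrupt handled on CPU 0 queues the work to CPU 1,
and one handled on CPU 3 wraps around to CPU 0. With a single CPU online the
expression returns that same CPU, so the work still runs where the interrupt
was handled. If the online IDs are not contiguous (for example after CPU
hotplug), the count from num_online_cpus() no longer bounds the valid IDs,
so the contiguity assumption matters.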


