[PATCH 1/2] drm/amdkfd: Fix GWS queue count

Felix Kuehling felix.kuehling at amd.com
Thu Apr 14 14:52:01 UTC 2022


Am 2022-04-13 um 11:51 schrieb David Yat Sin:
> Queue can be inactive during process termination. This would cause
> dqm->gws_queue_count to not be decremented. There can only be 1 GWS
> queue per device process so moving the logic out of loop.
>
> Signed-off-by: David Yat Sin <david.yatsin at amd.com>
> ---
>   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c    | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index acf4f7975850..7c107b88d944 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1945,17 +1945,17 @@ static int process_termination_cpsch(struct device_queue_manager *dqm,
>   		else if (q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI)
>   			deallocate_sdma_queue(dqm, q);
>   
> -		if (q->properties.is_active) {
> +		if (q->properties.is_active)
>   			decrement_queue_count(dqm, q->properties.type);
> -			if (q->properties.is_gws) {
> -				dqm->gws_queue_count--;
> -				qpd->mapped_gws_queue = false;
> -			}
> -		}

This looks like the original intention was to decrement the counter for 
inactive GWS queues. This makes sense because this counter is used to 
determine whether the runlist is oversubscribed. Inactive queues are not 
in the runlist, so they should not be counted.

I see that the counter is updated when queues are activated and 
deactivated in update_queue. So IMO this patch is both incorrect and 
unnecessary. Did you see an actual problem with the GWS counter during 
process termination? If so, I'd like to know more to understand the root 
cause.

Regards,
   Felix


>   
>   		dqm->total_queue_count--;
>   	}
>   
> +	if (qpd->mapped_gws_queue) {
> +		dqm->gws_queue_count--;
> +		qpd->mapped_gws_queue = false;
> +	}
> +
>   	/* Unregister process */
>   	list_for_each_entry_safe(cur, next_dpn, &dqm->queues, list) {
>   		if (qpd == cur->qpd) {


More information about the amd-gfx mailing list