[PATCH] drm/amdkfd: potential crash in kfd_create_indirect_link_prop()

Felix Kuehling felix.kuehling at amd.com
Fri Aug 12 21:10:51 UTC 2022


On 2022-08-12 02:20, Dan Carpenter wrote:
> This code has two bugs.  If kfd_topology_device_by_proximity_domain()
> failed on the first iteration through the loop then "cpu_link" is
> uninitialized and should not be dereferenced.
>
> The second bug is that we cannot dereference a list iterator when it
> points to the list head.  In other words, if we exit the
> list_for_each_entry() loop exits without hitting a break then "cpu_link"
> is not a valid pointer and should not be dereferenced.
>
> Fix both of these problems by setting "cpu_link" to NULL when it is invalid
> and non-NULL when it is valid.  That makes it easier to test for
> valid vs invalid.
>
> Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links")
> Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
> ---
> I reported these in June but never heard back.

I thought Ramesh implemented a fix for this: 
https://lore.kernel.org/all/20220706183302.1719795-1-Ramesh.Errabolu@amd.com/

You commented on a version of his patch: 
https://lore.kernel.org/all/20220629161241.GM11460@kadam/

Did this get lost somehow? Anyway, your patch looks good to me and I'm 
going to apply it to amd-staging-drm-next now.

Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>

Thanks,
   Felix


>
>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 11 +++++++----
>   1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 25990bec600d..3f0a4a415907 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1392,8 +1392,8 @@ static int kfd_build_p2p_node_entry(struct kfd_topology_device *dev,
>   
>   static int kfd_create_indirect_link_prop(struct kfd_topology_device *kdev, int gpu_node)
>   {
> +	struct kfd_iolink_properties *gpu_link, *tmp_link, *cpu_link;
>   	struct kfd_iolink_properties *props = NULL, *props2 = NULL;
> -	struct kfd_iolink_properties *gpu_link, *cpu_link;
>   	struct kfd_topology_device *cpu_dev;
>   	int ret = 0;
>   	int i, num_cpu;
> @@ -1416,16 +1416,19 @@ static int kfd_create_indirect_link_prop(struct kfd_topology_device *kdev, int g
>   			continue;
>   
>   		/* find CPU <-->  CPU links */
> +		cpu_link = NULL;
>   		cpu_dev = kfd_topology_device_by_proximity_domain(i);
>   		if (cpu_dev) {
> -			list_for_each_entry(cpu_link,
> +			list_for_each_entry(tmp_link,
>   					&cpu_dev->io_link_props, list) {
> -				if (cpu_link->node_to == gpu_link->node_to)
> +				if (tmp_link->node_to == gpu_link->node_to) {
> +					cpu_link = tmp_link;
>   					break;
> +				}
>   			}
>   		}
>   
> -		if (cpu_link->node_to != gpu_link->node_to)
> +		if (!cpu_link)
>   			return -ENOMEM;
>   
>   		/* CPU <--> CPU <--> GPU, GPU node*/


More information about the amd-gfx mailing list