[PATCH] drm/amdkfd: potential crash in kfd_create_indirect_link_prop()
Felix Kuehling
felix.kuehling at amd.com
Fri Aug 12 21:10:51 UTC 2022
On 2022-08-12 02:20, Dan Carpenter wrote:
> This code has two bugs. If kfd_topology_device_by_proximity_domain()
> failed on the first iteration through the loop then "cpu_link" is
> uninitialized and should not be dereferenced.
>
> The second bug is that we cannot dereference a list iterator when it
> points to the list head. In other words, if we exit the
> list_for_each_entry() loop exits without hitting a break then "cpu_link"
> is not a valid pointer and should not be dereferenced.
>
> Fix both of these problems by setting "cpu_link" to NULL when it is invalid
> and non-NULL when it is valid. That makes it easier to test for
> valid vs invalid.
>
> Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links")
> Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
> ---
> I reported these in June but never heard back.
I thought Ramesh implemented a fix for this:
https://lore.kernel.org/all/20220706183302.1719795-1-Ramesh.Errabolu@amd.com/
You commented on a version of his patch:
https://lore.kernel.org/all/20220629161241.GM11460@kadam/
Did this get lost somehow? Anyway, your patch looks good to me and I'm
going to apply it to amd-staging-drm-next now.
Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
Thanks,
Felix
>
> drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 25990bec600d..3f0a4a415907 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1392,8 +1392,8 @@ static int kfd_build_p2p_node_entry(struct kfd_topology_device *dev,
>
> static int kfd_create_indirect_link_prop(struct kfd_topology_device *kdev, int gpu_node)
> {
> + struct kfd_iolink_properties *gpu_link, *tmp_link, *cpu_link;
> struct kfd_iolink_properties *props = NULL, *props2 = NULL;
> - struct kfd_iolink_properties *gpu_link, *cpu_link;
> struct kfd_topology_device *cpu_dev;
> int ret = 0;
> int i, num_cpu;
> @@ -1416,16 +1416,19 @@ static int kfd_create_indirect_link_prop(struct kfd_topology_device *kdev, int g
> continue;
>
> /* find CPU <--> CPU links */
> + cpu_link = NULL;
> cpu_dev = kfd_topology_device_by_proximity_domain(i);
> if (cpu_dev) {
> - list_for_each_entry(cpu_link,
> + list_for_each_entry(tmp_link,
> &cpu_dev->io_link_props, list) {
> - if (cpu_link->node_to == gpu_link->node_to)
> + if (tmp_link->node_to == gpu_link->node_to) {
> + cpu_link = tmp_link;
> break;
> + }
> }
> }
>
> - if (cpu_link->node_to != gpu_link->node_to)
> + if (!cpu_link)
> return -ENOMEM;
>
> /* CPU <--> CPU <--> GPU, GPU node*/
More information about the amd-gfx
mailing list