[PATCH] drm/amdkfd: potential crash in kfd_create_indirect_link_prop()
Dan Carpenter
dan.carpenter at oracle.com
Mon Aug 15 11:23:59 UTC 2022
On Fri, Aug 12, 2022 at 05:10:51PM -0400, Felix Kuehling wrote:
> On 2022-08-12 02:20, Dan Carpenter wrote:
> > This code has two bugs. If kfd_topology_device_by_proximity_domain()
> > failed on the first iteration through the loop then "cpu_link" is
> > uninitialized and should not be dereferenced.
> >
> > The second bug is that we cannot dereference a list iterator when it
> > points to the list head. In other words, if we exit the
> > list_for_each_entry() loop exits without hitting a break then "cpu_link"
> > is not a valid pointer and should not be dereferenced.
> >
> > Fix both of these problems by setting "cpu_link" to NULL when it is invalid
> > and non-NULL when it is valid. That makes it easier to test for
> > valid vs invalid.
> >
> > Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links")
> > Signed-off-by: Dan Carpenter <dan.carpenter at oracle.com>
> > ---
> > I reported these in June but never heard back.
>
> I thought Ramesh implemented a fix for this:
> https://lore.kernel.org/all/20220706183302.1719795-1-Ramesh.Errabolu@amd.com/
>
> You commented on a version of his patch:
> https://lore.kernel.org/all/20220629161241.GM11460@kadam/
Oh, Sorry! I appologize for forgetting that.
>
> Did this get lost somehow? Anyway, your patch looks good to me and I'm going
> to apply it to amd-staging-drm-next now.
>
> Reviewed-by: Felix Kuehling <Felix.Kuehling at amd.com>
Looking at Ramesh's patch now, he added a continue if
kfd_topology_device_by_proximity_domain() failed. That is a behavior
change, but it might also be a bug fix.
My patch does not change the behavior except for eliminating the crash
so I stand by my patch, but adding the continue might be a good thing to
do as a separate step. I don't know the code well enough to say one way
or the other.
regards,
dan carpenter
More information about the amd-gfx
mailing list