[PATCH] [RFC] drm/drm_buddy fails to initialize on 32-bit architectures

Christian König christian.koenig at amd.com
Tue Jan 3 08:44:17 UTC 2023


Am 25.12.22 um 20:39 schrieb Luís Mendes:
> Re-sending with the correct  linux-kernel mailing list email address.
> Sorry for the inconvenience.
>
> The proposed patch fixes the issue and allows amdgpu to work again on
> armhf with a AMD RX 550 card, however it may not be the best solution
> for the issue, as detailed below.
>
> include/log2.h defined macros rounddown_pow_of_two(...) and
> roundup_pow_of_two(...) do not handle 64-bit values on 32-bit
> architectures (tested on armv9 armhf machine) causing
> drm_buddy_init(...) to fail on BUG_ON with an underflow on the order
> value, thus impeding amdgpu to load properly (no GUI).
>
> One option is to modify rounddown_pow_of_two(...) to detect if the
> variable takes 32 bits or less and call __rounddown_pow_of_two_u32(u32
> n) or if the variable takes more space than 32 bits, then call
> __rounddown_pow_of_two_u64(u64 n). This would imply renaming
> __rounddown_pow_of_two(unsigne
> d long n) to
> __rounddown_pow_of_two_u32(u32 n) and add a new function
> __rounddown_pow_of_two_u64(u64 n). This would be the most transparent
> solution, however there a few complications, and they are:
> - that the mm subsystem will fail to link on armhf with an undefined
> reference on __aeabi_uldivmod
> - there a few drivers that directly call __rounddown_pow_of_two(...)
> - that other drivers and subsystems generate warnings
>
> So this alternate solution was devised which avoids touching existing
> code paths, and just updates drm_buddy which seems to be the only
> driver that is failing, however I am not sure if this is the proper
> way to go. So I would like to get a second opinion on this, by those
> who know.
>
> /include/linux/log2.h
> /drivers/gpu/drm/drm_buddy.c
>
> Signed-off-by: Luís Mendes <luis.p.mendes at gmail.com>
>> 8------------------------------------------------------8<
> diff -uprN linux-next/drivers/gpu/drm/drm_buddy.c
> linux-nextLM/drivers/gpu/drm/drm_buddy.c
> --- linux-next/drivers/gpu/drm/drm_buddy.c    2022-12-25
> 16:29:26.000000000 +0000
> +++ linux-nextLM/drivers/gpu/drm/drm_buddy.c    2022-12-25
> 17:04:32.136007116 +0000
> @@ -128,7 +128,7 @@ int drm_buddy_init(struct drm_buddy *mm,
>           unsigned int order;
>           u64 root_size;
>
> -        root_size = rounddown_pow_of_two(size);
> +        root_size = rounddown_pow_of_two_u64(size);
>           order = ilog2(root_size) - ilog2(chunk_size);

I think this can be handled much easier if keep around the root_order 
instead of the root_size in the first place.

Cause ilog2() does the right thing even for non power of two values and 
so we just need the order for the offset subtraction below.

Arun can you take a closer look at this?

Regards,
Christian.

>
>           root = drm_block_alloc(mm, NULL, order, offset);
> diff -uprN linux-next/include/linux/log2.h linux-nextLM/include/linux/log2.h
> --- linux-next/include/linux/log2.h    2022-12-25 16:29:29.000000000 +0000
> +++ linux-nextLM/include/linux/log2.h    2022-12-25 17:00:34.319901492 +0000
> @@ -58,6 +58,18 @@ unsigned long __roundup_pow_of_two(unsig
>   }
>
>   /**
> + * __roundup_pow_of_two_u64() - round up to nearest power of two
> + * (unsgined 64-bits precision version)
> + * @n: value to round up
> + */
> +static inline __attribute__((const))
> +u64 __roundup_pow_of_two_u64(u64 n)
> +{
> +    return 1ULL << fls64(n - 1);
> +}
> +
> +
> +/**
>    * __rounddown_pow_of_two() - round down to nearest power of two
>    * @n: value to round down
>    */
> @@ -68,6 +80,17 @@ unsigned long __rounddown_pow_of_two(uns
>   }
>
>   /**
> + * __rounddown_pow_of_two_u64() - round down to nearest power of two
> + * (unsgined 64-bits precision version)
> + * @n: value to round down
> + */
> +static inline __attribute__((const))
> +u64 __rounddown_pow_of_two_u64(u64 n)
> +{
> +    return 1ULL << (fls64(n) - 1);
> +}
> +
> +/**
>    * const_ilog2 - log base 2 of 32-bit or a 64-bit constant unsigned value
>    * @n: parameter
>    *
> @@ -163,6 +186,7 @@ unsigned long __rounddown_pow_of_two(uns
>       __ilog2_u64(n)            \
>    )
>
> +
>   /**
>    * roundup_pow_of_two - round the given value up to nearest power of two
>    * @n: parameter
> @@ -181,6 +205,25 @@ unsigned long __rounddown_pow_of_two(uns
>    )
>
>   /**
> + * roundup_pow_of_two_u64 - round the given value up to nearest power of two
> + * (unsgined 64-bits precision version)
> + * @n: parameter
> + *
> + * round the given value up to the nearest power of two
> + * - the result is undefined when n == 0
> + * - this can be used to initialise global variables from constant data
> + */
> +#define roundup_pow_of_two_u64(n)            \
> +(                        \
> +    __builtin_constant_p(n) ? (        \
> +        ((n) == 1) ? 1 :        \
> +        (1ULL << (ilog2((n) - 1) + 1))    \
> +                   ) :        \
> +    __roundup_pow_of_two_u64(n)            \
> + )
> +
> +
> +/**
>    * rounddown_pow_of_two - round the given value down to nearest power of two
>    * @n: parameter
>    *
> @@ -195,6 +238,22 @@ unsigned long __rounddown_pow_of_two(uns
>       __rounddown_pow_of_two(n)        \
>    )
>
> +/**
> + * rounddown_pow_of_two_u64 - round the given value down to nearest
> power of two
> + * (unsgined 64-bits precision version)
> + * @n: parameter
> + *
> + * round the given value down to the nearest power of two
> + * - the result is undefined when n == 0
> + * - this can be used to initialise global variables from constant data
> + */
> +#define rounddown_pow_of_two_u64(n)            \
> +(                        \
> +    __builtin_constant_p(n) ? (        \
> +        (1ULL << ilog2(n))) :        \
> +    __rounddown_pow_of_two_u64(n)        \
> + )
> +
>   static inline __attribute_const__
>   int __order_base_2(unsigned long n)
>   {



More information about the amd-gfx mailing list