[PATCH] [RFC] drm/drm_buddy fails to initialize on 32-bit architectures

Thu Mar 9 10:14:50 UTC 2023

[AMD Official Use Only - General]

Hi Luis,

Sorry, I missed this one. Give me some time. I will check on it.

Regards,
Arun
-----Original Message-----
From: Luís Mendes <luis.p.mendes at gmail.com> 
Sent: Thursday, March 9, 2023 3:43 PM
To: Koenig, Christian <Christian.Koenig at amd.com>
Cc: akpm at linux-foundation.org; amd-gfx list <amd-gfx at lists.freedesktop.org>; Linux Kernel Mailing List <linux-kernel at vger.kernel.org>; Paneer Selvam, Arunpravin <Arunpravin.PaneerSelvam at amd.com>
Subject: Re: [PATCH] [RFC] drm/drm_buddy fails to initialize on 32-bit architectures

Hi,

Ping? This is actually a regression.
If there is no one available to work this, maybe I can have a look in my spare time, in accordance with your suggestion.

Regards,
Luís

On Tue, Jan 3, 2023 at 8:44 AM Christian König <christian.koenig at amd.com> wrote:
>
> Am 25.12.22 um 20:39 schrieb Luís Mendes:
> > Re-sending with the correct  linux-kernel mailing list email address.
> > Sorry for the inconvenience.
> >
> > The proposed patch fixes the issue and allows amdgpu to work again 
> > on armhf with a AMD RX 550 card, however it may not be the best 
> > solution for the issue, as detailed below.
> >
> > include/log2.h defined macros rounddown_pow_of_two(...) and
> > roundup_pow_of_two(...) do not handle 64-bit values on 32-bit 
> > architectures (tested on armv9 armhf machine) causing
> > drm_buddy_init(...) to fail on BUG_ON with an underflow on the order 
> > value, thus impeding amdgpu to load properly (no GUI).
> >
> > One option is to modify rounddown_pow_of_two(...) to detect if the 
> > variable takes 32 bits or less and call 
> > __rounddown_pow_of_two_u32(u32
> > n) or if the variable takes more space than 32 bits, then call
> > __rounddown_pow_of_two_u64(u64 n). This would imply renaming 
> > __rounddown_pow_of_two(unsigne d long n) to
> > __rounddown_pow_of_two_u32(u32 n) and add a new function
> > __rounddown_pow_of_two_u64(u64 n). This would be the most 
> > transparent solution, however there a few complications, and they are:
> > - that the mm subsystem will fail to link on armhf with an undefined 
> > reference on __aeabi_uldivmod
> > - there a few drivers that directly call __rounddown_pow_of_two(...)
> > - that other drivers and subsystems generate warnings
> >
> > So this alternate solution was devised which avoids touching 
> > existing code paths, and just updates drm_buddy which seems to be 
> > the only driver that is failing, however I am not sure if this is 
> > the proper way to go. So I would like to get a second opinion on 
> > this, by those who know.
> >
> > /include/linux/log2.h
> > /drivers/gpu/drm/drm_buddy.c
> >
> > Signed-off-by: Luís Mendes <luis.p.mendes at gmail.com>
> >> 8------------------------------------------------------8<
> > diff -uprN linux-next/drivers/gpu/drm/drm_buddy.c
> > linux-nextLM/drivers/gpu/drm/drm_buddy.c
> > --- linux-next/drivers/gpu/drm/drm_buddy.c    2022-12-25
> > 16:29:26.000000000 +0000
> > +++ linux-nextLM/drivers/gpu/drm/drm_buddy.c    2022-12-25
> > 17:04:32.136007116 +0000
> > @@ -128,7 +128,7 @@ int drm_buddy_init(struct drm_buddy *mm,
> >           unsigned int order;
> >           u64 root_size;
> >
> > -        root_size = rounddown_pow_of_two(size);
> > +        root_size = rounddown_pow_of_two_u64(size);
> >           order = ilog2(root_size) - ilog2(chunk_size);
>
> I think this can be handled much easier if keep around the root_order 
> instead of the root_size in the first place.
>
> Cause ilog2() does the right thing even for non power of two values 
> and so we just need the order for the offset subtraction below.
>
> Arun can you take a closer look at this?
>
> Regards,
> Christian.
>
> >
> >           root = drm_block_alloc(mm, NULL, order, offset); diff 
> > -uprN linux-next/include/linux/log2.h linux-nextLM/include/linux/log2.h
> > --- linux-next/include/linux/log2.h    2022-12-25 16:29:29.000000000 +0000
> > +++ linux-nextLM/include/linux/log2.h    2022-12-25 17:00:34.319901492 +0000
> > @@ -58,6 +58,18 @@ unsigned long __roundup_pow_of_two(unsig
> >   }
> >
> >   /**
> > + * __roundup_pow_of_two_u64() - round up to nearest power of two
> > + * (unsgined 64-bits precision version)
> > + * @n: value to round up
> > + */
> > +static inline __attribute__((const))
> > +u64 __roundup_pow_of_two_u64(u64 n) {
> > +    return 1ULL << fls64(n - 1);
> > +}
> > +
> > +
> > +/**
> >    * __rounddown_pow_of_two() - round down to nearest power of two
> >    * @n: value to round down
> >    */
> > @@ -68,6 +80,17 @@ unsigned long __rounddown_pow_of_two(uns
> >   }
> >
> >   /**
> > + * __rounddown_pow_of_two_u64() - round down to nearest power of 
> > +two
> > + * (unsgined 64-bits precision version)
> > + * @n: value to round down
> > + */
> > +static inline __attribute__((const))
> > +u64 __rounddown_pow_of_two_u64(u64 n) {
> > +    return 1ULL << (fls64(n) - 1);
> > +}
> > +
> > +/**
> >    * const_ilog2 - log base 2 of 32-bit or a 64-bit constant unsigned value
> >    * @n: parameter
> >    *
> > @@ -163,6 +186,7 @@ unsigned long __rounddown_pow_of_two(uns
> >       __ilog2_u64(n)            \
> >    )
> >
> > +
> >   /**
> >    * roundup_pow_of_two - round the given value up to nearest power of two
> >    * @n: parameter
> > @@ -181,6 +205,25 @@ unsigned long __rounddown_pow_of_two(uns
> >    )
> >
> >   /**
> > + * roundup_pow_of_two_u64 - round the given value up to nearest 
> > +power of two
> > + * (unsgined 64-bits precision version)
> > + * @n: parameter
> > + *
> > + * round the given value up to the nearest power of two
> > + * - the result is undefined when n == 0
> > + * - this can be used to initialise global variables from constant 
> > +data  */
> > +#define roundup_pow_of_two_u64(n)            \
> > +(                        \
> > +    __builtin_constant_p(n) ? (        \
> > +        ((n) == 1) ? 1 :        \
> > +        (1ULL << (ilog2((n) - 1) + 1))    \
> > +                   ) :        \
> > +    __roundup_pow_of_two_u64(n)            \
> > + )
> > +
> > +
> > +/**
> >    * rounddown_pow_of_two - round the given value down to nearest power of two
> >    * @n: parameter
> >    *
> > @@ -195,6 +238,22 @@ unsigned long __rounddown_pow_of_two(uns
> >       __rounddown_pow_of_two(n)        \
> >    )
> >
> > +/**
> > + * rounddown_pow_of_two_u64 - round the given value down to nearest
> > power of two
> > + * (unsgined 64-bits precision version)
> > + * @n: parameter
> > + *
> > + * round the given value down to the nearest power of two
> > + * - the result is undefined when n == 0
> > + * - this can be used to initialise global variables from constant 
> > +data  */
> > +#define rounddown_pow_of_two_u64(n)            \
> > +(                        \
> > +    __builtin_constant_p(n) ? (        \
> > +        (1ULL << ilog2(n))) :        \
> > +    __rounddown_pow_of_two_u64(n)        \
> > + )
> > +
> >   static inline __attribute_const__
> >   int __order_base_2(unsigned long n)
> >   {
>