[PATCH] drm/panic: Add a u64 divide by 10 for arm32
Jocelyn Falempe
jfalempe at redhat.com
Fri Jun 27 11:47:09 UTC 2025
On 27/06/2025 13:36, Alice Ryhl wrote:
> On Fri, Jun 27, 2025 at 11:41 AM Jocelyn Falempe <jfalempe at redhat.com> wrote:
>>
>> On 32-bit ARM, the compiler does not optimize a u64 division by a
>> constant into a multiply by inverse [1].
>> So do the multiply by inverse explicitly for this architecture.
>>
>> Link: https://github.com/llvm/llvm-project/issues/37280 [1]
>> Reported-by: Andrei Lalaev <andrey.lalaev at gmail.com>
>> Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
>> Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
>> Signed-off-by: Jocelyn Falempe <jfalempe at redhat.com>
>
> Not to block this change, but I think this really ought to be fixed in
> the compiler. We should not have to do this kind of thing to divide by
> 10.
I agree, I didn't expect this to be a problem. But I'm not a compiler
expert, and it will probably take time to update the compiler, so we
have to do this at least temporarily.
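
For reference, the multiply by inverse identity can be checked in plain
userspace Rust. This is only a minimal sketch, not the kernel code: the
name div10_by_inverse is hypothetical, and it leans on u128 for the wide
product, which is an assumption that suits a host-side check rather than
arm32 kernel code. It also uses a different constant and shift than the
quoted patch: 0xCCCCCCCCCCCCCCCD, i.e. ceil(2^67 / 10), with a total
shift of 67, a pair known to be exact for every u64 input.

```rust
/// Hypothetical helper (not from the patch): floor(val / 10) computed as
/// a multiplication by a precomputed inverse followed by a shift.
/// MAGIC is ceil(2^67 / 10); the 128-bit product shifted right by 67
/// equals val / 10 for every u64 value.
fn div10_by_inverse(val: u64) -> u64 {
    const MAGIC: u128 = 0xCCCC_CCCC_CCCC_CCCD;
    // The product of two values below 2^64 always fits in 128 bits.
    ((val as u128 * MAGIC) >> 67) as u64
}

/// Compare the helper against the native division on a few inputs.
fn check_div10_by_inverse() -> bool {
    [0u64, 9, 10, 99, 1_000_000_007, u64::MAX]
        .iter()
        .all(|&v| div10_by_inverse(v) == v / 10)
}
```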
>
>> drivers/gpu/drm/drm_panic_qr.rs | 24 +++++++++++++++++++++++-
>> 1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
>> index dd55b1cb764d..82acecd505d3 100644
>> --- a/drivers/gpu/drm/drm_panic_qr.rs
>> +++ b/drivers/gpu/drm/drm_panic_qr.rs
>> @@ -381,6 +381,24 @@ struct DecFifo {
>> len: usize,
>> }
>>
>> +/// On arm32 architecture, dividing an u64 by a constant will generate a call
>> +/// to __aeabi_uldivmod which is not present in the kernel.
>> +/// So use the multiply by inverse method for this architecture.
>> +#[cfg(target_arch = "arm")]
>> +fn div10(val: u64) -> u64
>> +{
>
> Please run rustfmt on your patch.
Sorry, I will fix that.
>
>> +    let val_h = val >> 32;
>> +    let val_l = val & 0xFFFFFFFF;
>> +    let b_h: u64 = 0x66666666;
>> +    let b_l: u64 = 0x66666667;
>> +
>> +    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
>> +    let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
>> +    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
>> +
>> +    tmp3 >> 2
>> +}
>> +
>>  impl DecFifo {
>>      fn push(&mut self, data: u64, len: usize) {
>>          let mut chunk = data;
>> @@ -389,7 +407,11 @@ fn push(&mut self, data: u64, len: usize) {
>>          }
>>          for i in 0..len {
>>              self.decimals[i] = (chunk % 10) as u8;
>> -            chunk /= 10;
>> +            if cfg!(target_arch = "arm") {
>> +                chunk = div10(chunk);
>> +            } else {
>> +                chunk /= 10;
>> +            }
>
> I would get rid of this conditional and declare another div10 function
> that just does input/10 on other arches.
OK, I will send a v2 shortly with that change.
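
Roughly, the suggested shape would be two definitions of div10 selected
at compile time, so the caller needs no conditional. The following is
only a sketch of that idea, not the actual v2: the arm32 body is copied
from the patch quoted above, and decimal_digits is a hypothetical
stand-in for the loop in DecFifo::push so the snippet can be built and
tested on its own.

```rust
/// arm32 only: multiply by inverse, arithmetic copied from the quoted patch.
#[cfg(target_arch = "arm")]
fn div10(val: u64) -> u64 {
    let val_h = val >> 32;
    let val_l = val & 0xFFFFFFFF;
    let b_h: u64 = 0x66666666;
    let b_l: u64 = 0x66666667;

    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
    let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);

    tmp3 >> 2
}

/// Every other architecture: let the compiler handle the division.
#[cfg(not(target_arch = "arm"))]
fn div10(val: u64) -> u64 {
    val / 10
}

/// Hypothetical stand-in for the loop in DecFifo::push: extract `len`
/// decimal digits, least significant first, with no cfg!() at the call site.
fn decimal_digits(mut chunk: u64, len: usize) -> Vec<u8> {
    let mut digits = Vec::with_capacity(len);
    for _ in 0..len {
        digits.push((chunk % 10) as u8);
        chunk = div10(chunk);
    }
    digits
}
```

With this layout the loop body shrinks to a plain `chunk = div10(chunk);`
on all architectures.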
>
> Alice
>