[PATCH] drm/amd/display: avoid 64-bit division
Kazlauskas, Nicholas
Nicholas.Kazlauskas at amd.com
Mon Jul 8 14:16:47 UTC 2019
On 7/8/19 9:52 AM, Arnd Bergmann wrote:
> On 32-bit architectures, dividing a 64-bit integer in the kernel
> leads to a link error:
>
> ERROR: "__udivdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
> ERROR: "__divdi3" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
>
> Change the two recently introduced instances to a multiply+shift
> operation that is also much cheaper on 32-bit architectures.
> We can do that here, since both of them are really 32-bit numbers
> that change by a few percent.
>
> Fixes: bedbbe6af4be ("drm/amd/display: Move link functions from dc to dc_link")
> Fixes: f18bc4e53ad6 ("drm/amd/display: update calculated bounding box logic for NV")
> Signed-off-by: Arnd Bergmann <arnd at arndb.de>
> ---
> drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 ++--
> drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index c17db5c144aa..8dbf759eba45 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -3072,8 +3072,8 @@ uint32_t dc_link_bandwidth_kbps(
> * but the difference is minimal and is in a safe direction,
> * which all works well around potential ambiguity of DP 1.4a spec.
> */
> - long long fec_link_bw_kbps = link_bw_kbps * 970LL;
> - link_bw_kbps = (uint32_t)(fec_link_bw_kbps / 1000LL);
> + link_bw_kbps = mul_u64_u32_shr(BIT_ULL(32) * 970LL / 1000,
> + link_bw_kbps, 32);
> }
> #endif
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> index b35327bafbc5..70ac8a95d2db 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
> @@ -2657,7 +2657,7 @@ static void update_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding_box_
> calculated_states[i].dram_speed_mts = uclk_states[i] * 16 / 1000;
>
> // FCLK:UCLK ratio is 1.08
> - min_fclk_required_by_uclk = ((unsigned long long)uclk_states[i]) * 1080 / 1000000;
> + min_fclk_required_by_uclk = mul_u64_u32_shr(BIT_ULL(32) * 1080 / 1000000, uclk_states[i], 32);
Even though the mul + shift will be faster here, I would prefer that
this just be a div_u64 for clarity.
Nicholas Kazlauskas
>
> calculated_states[i].fabricclk_mhz = (min_fclk_required_by_uclk < min_dcfclk) ?
> min_dcfclk : min_fclk_required_by_uclk;
>
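
For reference, the two forms discussed above can be compared in userspace.
This is only a rough sketch, not kernel code: scale_mul_shift() and
scale_div() are hypothetical stand-ins for mul_u64_u32_shr(..., 32) and
div_u64() from <linux/math64.h>, and the stand-in for the shift variant
assumes the fixed-point factor is below 2^32 so that the 64-bit product
cannot overflow (true for both factors in this patch).

```c
#include <stdint.h>

/* Fixed-point factors as written in the patch: floor(2^32 * num / den). */
#define FEC_FACTOR	((1ULL << 32) * 970 / 1000)
#define FCLK_FACTOR	((1ULL << 32) * 1080 / 1000000)

/*
 * Hypothetical userspace stand-in for mul_u64_u32_shr(factor, x, 32).
 * Assumption: factor < 2^32, so factor * x fits in 64 bits.
 */
static uint32_t scale_mul_shift(uint64_t factor, uint32_t x)
{
	return (uint32_t)((factor * x) >> 32);
}

/*
 * Hypothetical userspace stand-in for div_u64((u64)x * num, den):
 * widen to 64 bits, multiply, then divide by a 32-bit divisor.
 */
static uint32_t scale_div(uint32_t x, uint32_t num, uint32_t den)
{
	return (uint32_t)(((uint64_t)x * num) / den);
}
```

With an input of 1000, scale_mul_shift(FEC_FACTOR, 1000) yields 969 while
scale_div(1000, 970, 1000) yields 970. The shift form can come out one
lower because the factor itself is truncated, which is in the "safe
direction" the code comment mentions. The division form gives the same
result as the original expression and is easier to read.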