<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<p style="font-family:Arial;font-size:10pt;color:#317100;margin:15pt;" align="Left">
[AMD Public Use]<br>
</p>
<br>
<div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
I've gone ahead and dropped the patch.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Alex<br>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Koenig, Christian <Christian.Koenig@amd.com><br>
<b>Sent:</b> Wednesday, April 15, 2020 3:57 AM<br>
<b>To:</b> Jani Nikula <jani.nikula@linux.intel.com>; Alex Deucher <alexdeucher@gmail.com>; Bernard Zhao <bernard@vivo.com><br>
<b>Cc:</b> Sierra Guiza, Alejandro (Alex) <Alex.Sierra@amd.com>; Zeng, Oak <Oak.Zeng@amd.com>; Maling list - DRI developers <dri-devel@lists.freedesktop.org>; David Airlie <airlied@linux.ie>; Kuehling, Felix <Felix.Kuehling@amd.com>; LKML <linux-kernel@vger.kernel.org>;
amd-gfx list <amd-gfx@lists.freedesktop.org>; kernel@vivo.com <kernel@vivo.com>; Huang, Ray <Ray.Huang@amd.com>; Russell, Kent <Kent.Russell@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Sam Ravnborg <sam@ravnborg.org>; Yuan, Xiaojie <Xiaojie.Yuan@amd.com><br>
<b>Subject:</b> Re: [PATCH] Optimized division operation to shift operation</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">Am 15.04.20 um 09:41 schrieb Jani Nikula:<br>
> On Tue, 14 Apr 2020, Alex Deucher <alexdeucher@gmail.com> wrote:<br>
>> On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao <bernard@vivo.com> wrote:<br>
>>> On some processors, the / operate will call the compiler`s div lib,<br>
>>> which is low efficient, We can replace the / operation with shift,<br>
>>> so that we can replace the call of the division library with one<br>
>>> shift assembly instruction.<br>
> This was applied already, and it's not in a driver I look after... but<br>
> to me this feels like something that really should be<br>
> justified. Using >> instead of / for multiples of 2 division mattered 20<br>
> years ago, I'd be surprised if it still did on modern compilers.<br>
<br>
I have similar worries, especially since we replace the "/ (4 * 2)" with <br>
">> 3" it's making the code just a bit less readable.<br>
<br>
And that the code runs exactly once while loading the driver and pushing <br>
the firmware into the hardware. So performance is completely irrelevant <br>
here.<br>
<br>
Regards,<br>
Christian.<br>
<br>
><br>
> BR,<br>
> Jani.<br>
><br>
><br>
>>> Signed-off-by: Bernard Zhao <bernard@vivo.com><br>
>> Applied. thanks.<br>
>><br>
>> Alex<br>
>><br>
>>> ---<br>
>>> drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 4 ++--<br>
>>> drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 4 ++--<br>
>>> drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 4 ++--<br>
>>> 3 files changed, 6 insertions(+), 6 deletions(-)<br>
>>><br>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c<br>
>>> index b205039..66cd078 100644<br>
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c<br>
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c<br>
>>> @@ -175,10 +175,10 @@ static int gmc_v6_0_mc_load_microcode(struct amdgpu_device *adev)<br>
>>> amdgpu_ucode_print_mc_hdr(&hdr->header);<br>
>>><br>
>>> adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);<br>
>>> - regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);<br>
>>> + regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;<br>
>>> new_io_mc_regs = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));<br>
>>> - ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;<br>
>>> + ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;<br>
>>> new_fw_data = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));<br>
>>><br>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c<br>
>>> index 9da9596..ca26d63 100644<br>
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c<br>
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c<br>
>>> @@ -193,10 +193,10 @@ static int gmc_v7_0_mc_load_microcode(struct amdgpu_device *adev)<br>
>>> amdgpu_ucode_print_mc_hdr(&hdr->header);<br>
>>><br>
>>> adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);<br>
>>> - regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);<br>
>>> + regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;<br>
>>> io_mc_regs = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));<br>
>>> - ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;<br>
>>> + ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;<br>
>>> fw_data = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));<br>
>>><br>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c<br>
>>> index 27d83204..295039c 100644<br>
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c<br>
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c<br>
>>> @@ -318,10 +318,10 @@ static int gmc_v8_0_tonga_mc_load_microcode(struct amdgpu_device *adev)<br>
>>> amdgpu_ucode_print_mc_hdr(&hdr->header);<br>
>>><br>
>>> adev->gmc.fw_version = le32_to_cpu(hdr->header.ucode_version);<br>
>>> - regs_size = le32_to_cpu(hdr->io_debug_size_bytes) / (4 * 2);<br>
>>> + regs_size = le32_to_cpu(hdr->io_debug_size_bytes) >> 3;<br>
>>> io_mc_regs = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->io_debug_array_offset_bytes));<br>
>>> - ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;<br>
>>> + ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes) >> 2;<br>
>>> fw_data = (const __le32 *)<br>
>>> (adev->gmc.fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes));<br>
>>><br>
>>> --<br>
>>> 2.7.4<br>
>>><br>
>>> _______________________________________________<br>
>>> amd-gfx mailing list<br>
>>> amd-gfx@lists.freedesktop.org<br>
>>> <a href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cchristian.koenig%40amd.com%7C1e91f7edcfe0473b0d7008d7e11074a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637225333103893889&sdata=VDJlEY2%2Bl1SSO8Fw1dYqqPFqQtyHpsxQ0Tm7iVOgJQY%3D&reserved=0">
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cchristian.koenig%40amd.com%7C1e91f7edcfe0473b0d7008d7e11074a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637225333103893889&sdata=VDJlEY2%2Bl1SSO8Fw1dYqqPFqQtyHpsxQ0Tm7iVOgJQY%3D&reserved=0</a><br>
>> _______________________________________________<br>
>> dri-devel mailing list<br>
>> dri-devel@lists.freedesktop.org<br>
>> <a href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cchristian.koenig%40amd.com%7C1e91f7edcfe0473b0d7008d7e11074a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637225333103893889&sdata=EpqRRbCiksur%2BjMlVQplExuJsmw6UPODhyBOutOVukw%3D&reserved=0">
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cchristian.koenig%40amd.com%7C1e91f7edcfe0473b0d7008d7e11074a8%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637225333103893889&sdata=EpqRRbCiksur%2BjMlVQplExuJsmw6UPODhyBOutOVukw%3D&reserved=0</a><br>
<br>
</div>
</span></font></div>
</div>
</body>
</html>