Re: Re: Re: 回复: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
Alex Deucher
alexdeucher at gmail.com
Fri Dec 17 21:19:01 UTC 2021
If you could get me a copy of the vbios image from a problematic board,
that would be helpful. In the meantime, I've applied the patch.
Alex
On Thu, Dec 16, 2021 at 9:38 PM 周宗敏 <zhouzongmin at kylinos.cn> wrote:
> Dear Alex:
>
>
> > Is the issue reproducible with the same board in bare metal on x86? Or
> > does it only happen with passthrough on ARM?
>
>
> Unfortunately, my current environment makes it inconvenient to test this GPU
> board on an x86 platform.
>
> However, I can tell you that the problem still occurs on ARM without
> passthrough to a virtual machine.
>
>
> In addition, at the end of 2020, my colleagues found similar problems on
> MIPS platforms with Radeon R7 340 graphics chips.
>
> So I think it can happen regardless of whether the platform is x86, ARM, or MIPS.
>
>
> I hope the above information is helpful to you. I also think it would be
> better for users if we can root cause this issue.
>
>
> Best regards.
>
>
>
>
> ----
>
>
>
>
>
>
> *Subject:* Re: Re: 回复: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
>
> *Date:* 2021-12-16 23:28
> *From:* Alex Deucher
> *To:* 周宗敏
>
>
> Is the issue reproducible with the same board in bare metal on x86? Or
> does it only happen with passthrough on ARM? Looking through the archives,
> the SI patch I made was for an x86 laptop. It would be nice to root
> cause this, but there weren't any gfx8 boards with more than 64G of vram,
> so I think it's safe. That said, if you see similar issues with newer gfx
> IPs then we have an issue since the upper bit will be meaningful, so it
> would be nice to root cause this.
>
> Alex
>
>
> On Thu, Dec 16, 2021 at 4:36 AM 周宗敏 <zhouzongmin at kylinos.cn> wrote:
>
>> Hi Christian,
>>
>>
>> I'm testing the GPU passthrough feature, so I pass this GPU through to a
>> virtual machine. It is based on an arm64 system.
>>
>> As far as I know, Alex dealt with a similar problem in
>> drm/radeon/si.c. Maybe they have the same cause?
>>
>> The commit in question is below:
>>
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ca223b029a261e82fb2f50c52eb85d510f4260e
>>
>> [image: image.png]
>>
>>
>> Thanks very much.
>>
>>
>>
>> ----
>>
>>
>>
>> *Subject:* Re: 回复: Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
>>
>> *Date:* 2021-12-16 16:15
>> *From:* Christian König
>> *To:* 周宗敏, Alex Deucher
>>
>>
>>
>>
>> Hi Zongmin,
>>
>> that strongly sounds like the ASIC is not correctly initialized when
>> trying to read the register.
>>
>> What board and environment are you using this GPU with? Is that a
>> normal x86 system?
>>
>> Regards,
>> Christian.
>>
>>
>>
>> Am 16.12.21 um 04:11 schrieb 周宗敏:
>>
>>
>>
>> 1. The problematic board that I tested is [AMD/ATI] Lexa
>>    PRO [Radeon RX 550/550X], and the vbios version is
>>    113-RXF9310-C09-BT.
>>
>> 2. When the exception occurs, I can see the following changes in
>>    the VRAM size value read from RREG32(mmCONFIG_MEMSIZE);
>>    it seems to have garbage in the upper 16 bits.
>>
>>    [image: image.png]
>>
>> 3. I can also see dmesg output like the following.
>>
>>    When the VRAM size register contains garbage, we may see an error
>>    message like this:
>>
>>    amdgpu 0000:09:00.0: VRAM: 4286582784M 0x000000F400000000 - 0x000FF8F4FFFFFFFF (4286582784M used)
>>
>>    The correct message should look like this:
>>
>>    amdgpu 0000:09:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
>>
>>
>>
>> If you have any questions, please send me an email.
>>
>> Thanks very much.
>>
>>
>>
>>
>> ----
>>
>> *Subject:* Re: [PATCH] drm/amdgpu: fixup bad vram size on gmc v8
>>
>> *Date:* 2021-12-16 04:23
>> *From:* Alex Deucher
>> *To:* Zongmin Zhou
>>
>>
>>
>>
>> On Wed, Dec 15, 2021 at 10:31 AM Zongmin Zhou wrote:
>> >
>> > Some boards (like the RX550) seem to have garbage in the upper
>> > 16 bits of the vram size register. Check for
>> > this and clamp the size properly. Fixes
>> > boards reporting bogus amounts of vram.
>> >
>> > With this patch, the maximum GPU VRAM size is 64GB;
>> > a board with more than that will only use 64GB of VRAM.
>>
>> Can you provide some examples of problematic boards and possibly a
>> vbios image from the problematic board? What values are you seeing?
>> It would be nice to see what the boards are reporting and whether the
>> lower 16 bits are actually correct or if it is some other issue. This
>> register is undefined until the asic has been initialized. The vbios
>> programs it as part of its asic init sequence (either via vesa/gop or
>> the OS driver).
>>
>> Alex
>>
>>
>> >
>> > Signed-off-by: Zongmin Zhou
>> > ---
>> >  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 13 ++++++++++---
>> >  1 file changed, 10 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > index 492ebed2915b..63b890f1e8af 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> > @@ -515,10 +515,10 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
>> >  static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>> >  {
>> >  	int r;
>> > +	u32 tmp;
>> >
>> >  	adev->gmc.vram_width = amdgpu_atombios_get_vram_width(adev);
>> >  	if (!adev->gmc.vram_width) {
>> > -		u32 tmp;
>> >  		int chansize, numchan;
>> >
>> >  		/* Get VRAM informations */
>> > @@ -562,8 +562,15 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
>> >  		adev->gmc.vram_width = numchan * chansize;
>> >  	}
>> >  	/* size in MB on si */
>> > -	adev->gmc.mc_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>> > -	adev->gmc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>> > +	tmp = RREG32(mmCONFIG_MEMSIZE);
>> > +	/* some boards may have garbage in the upper 16 bits */
>> > +	if (tmp & 0xffff0000) {
>> > +		DRM_INFO("Probable bad vram size: 0x%08x\n", tmp);
>> > +		if (tmp & 0xffff)
>> > +			tmp &= 0xffff;
>> > +	}
>> > +	adev->gmc.mc_vram_size = tmp * 1024ULL * 1024ULL;
>> > +	adev->gmc.real_vram_size = adev->gmc.mc_vram_size;
>> >
>> >  	if (!(adev->flags & AMD_IS_APU)) {
>> >  		r = amdgpu_device_resize_fb_bar(adev);
>> > --
>> > 2.25.1
>> >
>> >
>>
>>
>>
>>
>>
>
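
For readers following the thread, the clamp in the quoted patch and the
two dmesg lines reported above can be checked with a small standalone
sketch. It is not driver code: the helper names below are hypothetical,
and the raw register value 0xFF801000 is inferred from the
"4286582784M" in the quoted log (4286582784 == 0xFF801000) rather than
stated anywhere in the thread. The only assumption taken from the patch
is that CONFIG_MEMSIZE reports the VRAM size in MB.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the check in the quoted patch:
 * if any of the upper 16 bits are set and the lower 16 bits are
 * non-zero, keep only the lower 16 bits. Otherwise the value is
 * returned unchanged.
 */
static uint32_t clamp_config_memsize(uint32_t reg)
{
	if ((reg & 0xffff0000u) && (reg & 0xffffu))
		reg &= 0xffffu;
	return reg;
}

/* Convert a size in MB to bytes using 64-bit arithmetic, as the patch does. */
static uint64_t vram_bytes(uint32_t size_mb)
{
	return (uint64_t)size_mb * 1024ULL * 1024ULL;
}

/* Last byte address of a VRAM aperture that starts at 'start'. */
static uint64_t vram_end(uint64_t start, uint32_t size_mb)
{
	return start + vram_bytes(size_mb) - 1;
}
```

With a start address of 0x000000F400000000, vram_end() gives
0x000FF8F4FFFFFFFF for the unclamped value 0xFF801000 and
0x000000F4FFFFFFFF for the clamped value 4096, which matches the bad
and good log lines quoted above. Because the clamp keeps 16 bits of a
size in MB, the largest clamped size is 65535 MB, just under 64G, which
is the limit discussed earlier in the thread.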