slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")

Liang, Liang (Leo) Liang.Liang at amd.com
Tue Mar 16 11:02:50 UTC 2021


[AMD Public Use]

Hi David and Mike,

It was a BIOS bug, and it is now fixed by a new BIOS. Thank you so much! Cheers!

[    0.000034] MTRR variable ranges enabled:
[    0.000035]   0 base 000000000000 mask FFFF80000000 write-back
[    0.000037]   1 base 0000FFE00000 mask FFFFFFE00000 write-protect
[    0.000039]   2 base 0000FFDE0000 mask FFFFFFFE0000 write-protect
[    0.000040]   3 base 0000FF000000 mask FFFFFFF80000 write-protect
[    0.000041]   4 disabled
[    0.000042]   5 disabled
[    0.000043]   6 disabled
[    0.000044]   7 disabled
[    0.000045] TOM2: 0000000280000000 aka 10240M

root at scbu-Chachani:/home/scbu# cat /proc/mtrr
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
reg02: base=0x0ffde0000 ( 4093MB), size=  128KB, count=1: write-protect
reg03: base=0x0ff000000 ( 4080MB), size=  512KB, count=1: write-protect
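
For reference, the sizes in /proc/mtrr follow from the masks in the boot log. Below is a minimal sketch of that arithmetic. It assumes a 48-bit physical address width (inferred from the 12 hex digits the kernel prints, not confirmed for this machine) and a contiguous mask. The helper name mtrr_size is hypothetical.

```python
# Sketch: derive the size of a variable MTRR range from its mask.
# Assumption: 48 physical address bits (the masks above are printed
# with 12 hex digits); adjust PHYS_BITS for other CPUs.
PHYS_BITS = 48


def mtrr_size(mask, phys_bits=PHYS_BITS):
    """Return the range size in bytes for a contiguous MTRR mask."""
    full = (1 << phys_bits) - 1
    # The zero bits at the bottom of the mask are the "don't care"
    # address bits, so the range covers (~mask & full) + 1 bytes.
    return (~mask & full) + 1


# Base/mask pairs copied from the boot logs in this thread.
ranges = [
    (0x000000000000, 0xFFFF80000000),  # write-back
    (0x0000FFE00000, 0xFFFFFFE00000),  # write-protect
    (0x0000FFDE0000, 0xFFFFFFFE0000),  # write-protect
    (0x0000FF000000, 0xFFFFFFF80000),  # write-protect
    (0x000100000000, 0xFFFFFF000000),  # write-protect (old BIOS only)
]

for base, mask in ranges:
    size = mtrr_size(mask)
    print(f"base={base:#011x} size={size // 1024}KB")
```

With the old BIOS, the last pair covers 16MB starting exactly at 4096MB, which is the range David pointed at.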

BRs,
Leo
-----Original Message-----
From: Mike Rapoport <rppt at linux.ibm.com> 
Sent: Tuesday, March 16, 2021 6:30 PM
To: David Hildenbrand <david at redhat.com>
Cc: Liang, Liang (Leo) <Liang.Liang at amd.com>; Deucher, Alexander <Alexander.Deucher at amd.com>; linux-kernel at vger.kernel.org; amd-gfx list <amd-gfx at lists.freedesktop.org>; Andrew Morton <akpm at linux-foundation.org>; Huang, Ray <Ray.Huang at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>; Rafael J. Wysocki <rafael at kernel.org>; George Kennedy <george.kennedy at oracle.com>
Subject: Re: slow boot with 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()")

On Tue, Mar 16, 2021 at 10:08:10AM +0100, David Hildenbrand wrote:
> On 16.03.21 09:58, Liang, Liang (Leo) wrote:
> > [AMD Public Use]
> > 
> > Hi David,
> > 
> > root at scbu-Chachani:~# cat /proc/mtrr
> > reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
> > reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
> > reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
> 
> ^ there it is
> 
> https://wiki.osdev.org/MTRR
> 
> "Reads allocate cache lines on a cache miss. All writes update main memory.
> 
> Cache lines are not allocated on a write miss. Write hits invalidate 
> the cache line and update main memory. "
> 
> AFAIU, writes completely bypass caches and store directly to main
> memory. If there are cache lines from a previous read, they are
> invalidated. So I think read(addr), write(addr), read(addr), ... will be
> especially slow, which is what we have in the kstream benchmark.
> 
> 
> The question is:
> 
> who sets this up without owning the memory?
> Is the memory actually special/slow or is that setting wrong?

I really doubt that 16M at 0x100000000 in a system with 8G RAM would
*physically* differ from the neighbouring memory.

> Buggy firmware/BIOS?
> Buggy device driver?

[    0.000027] MTRR default type: uncachable
[    0.000028] MTRR fixed ranges enabled:
[    0.000030]   00000-9FFFF write-back
[    0.000031]   A0000-BFFFF uncachable
[    0.000032]   C0000-FFFFF write-through
[    0.000033] MTRR variable ranges enabled:
[    0.000034]   0 base 000000000000 mask FFFF80000000 write-back
[    0.000036]   1 base 0000FFE00000 mask FFFFFFE00000 write-protect
[    0.000037]   2 base 000100000000 mask FFFFFF000000 write-protect

As the write-protected range at 0x100000000 is already reported that early in boot, I'd say it's the BIOS.

The question is how to reliably detect that this is a bogus setting...

[    0.000038]   3 base 0000FFDE0000 mask FFFFFFFE0000 write-protect
[    0.000039]   4 base 0000FF000000 mask FFFFFFF80000 write-protect
[    0.000040]   5 disabled
[    0.000041]   6 disabled
[    0.000042]   7 disabled
[    0.000042] TOM2: 0000000280000000 aka 10240M
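
On the detection question, one possible userspace heuristic (an illustration only, not something proposed in this thread) is to flag any variable MTRR whose type is not write-back and which overlaps RAM above 4G. The sketch below assumes that RAM above 4G spans from 4096M up to TOM2 (10240M in the log above) and that /proc/mtrr lines have the format shown in this thread. The names parse_mtrr and suspicious are hypothetical.

```python
import re

# Matches lines such as:
# reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
MTRR_LINE = re.compile(
    r"reg(?P<reg>\d+): base=0x(?P<base>[0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(?P<size>\d+)(?P<unit>[KMG])B, count=\d+: (?P<type>[\w-]+)"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mtrr(text):
    """Parse /proc/mtrr text into (reg, base, size_bytes, type) tuples."""
    ranges = []
    for line in text.splitlines():
        m = MTRR_LINE.search(line)
        if m:
            base = int(m["base"], 16)
            size = int(m["size"]) * UNIT[m["unit"]]
            ranges.append((int(m["reg"]), base, size, m["type"]))
    return ranges


def suspicious(ranges, ram_lo, ram_hi):
    """Return registers that are not write-back yet overlap [ram_lo, ram_hi)."""
    flagged = []
    for reg, base, size, mtype in ranges:
        overlaps = base < ram_hi and base + size > ram_lo
        if overlaps and mtype != "write-back":
            flagged.append(reg)
    return flagged


# /proc/mtrr as reported with the old BIOS earlier in this thread.
OLD_BIOS = """\
reg00: base=0x000000000 (    0MB), size= 2048MB, count=1: write-back
reg01: base=0x0ffe00000 ( 4094MB), size=    2MB, count=1: write-protect
reg02: base=0x100000000 ( 4096MB), size=   16MB, count=1: write-protect
"""

FOUR_GB = 4 << 30
TOM2 = 0x280000000  # assumption: upper bound of RAM above 4G, from the log

print(suspicious(parse_mtrr(OLD_BIOS), FOUR_GB, TOM2))
```

This only inspects the reported ranges. It cannot tell whether the memory behind a flagged range is genuinely special, so it would be a hint for a warning rather than a reliable test.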


--
Sincerely yours,
Mike.

