On some servers with MGA G200e (rev 42), booting with Legacy BIOS, the hardware hangs when using kdump and kexec into the kdump kernel. This happens when the decompression code tries to write "Decompressing Linux" to the VGA console.
It can be reproduced by writing to the VGA console (0xB8000) after booting into graphics mode, which generates the following error:
kernel:NMI: PCI system error (SERR) for reason a0 on CPU 0.
kernel:Dazed and confused, but trying to continue
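For readers unfamiliar with the text console layout, here is a minimal sketch of what "writing to the VGA console" means. It assumes the standard 80x25 text mode, where each cell is two bytes (character, then attribute). The helper names and the attribute value 0x07 are illustrative and are not taken from the kernel; the sketch writes to an ordinary array so it is safe to run, whereas the early boot code targets physical address 0xb8000.

```c
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25

/* Byte offset of the cell at column x, row y: two bytes per cell. */
static size_t vga_text_offset(unsigned int x, unsigned int y)
{
	return ((size_t)y * VGA_COLS + x) * 2;
}

/*
 * Write a string into a text-mode buffer, stopping at the end of the row.
 * On real hardware vidmem would point at 0xb8000; here it is any buffer
 * of at least VGA_COLS * VGA_ROWS * 2 bytes.
 */
static void vga_text_puts(volatile uint8_t *vidmem, unsigned int x,
			  unsigned int y, const char *s)
{
	for (; *s && x < VGA_COLS; s++, x++) {
		size_t off = vga_text_offset(x, y);

		vidmem[off] = (uint8_t)*s;   /* character byte */
		vidmem[off + 1] = 0x07;      /* attribute: light grey on black (assumed) */
	}
}
```

If the adapter does not decode the 0xb8000 window, a store like the ones above is what ends up triggering the SERR on the affected hardware.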
The root cause is a bad configuration of the MGA GCTL6 register
According to the GCTL6 register documentation:
bit 0 is gcgrmode:
    0: Enables alpha mode, and the character generator addressing system is activated.
    1: Enables graphics mode, and the character addressing system is not used.
bit 1 is chainodd even:
    0: The A0 signal of the memory address bus is used during system memory addressing.
    1: Allows A0 to be replaced by either the A16 signal of the system address (if memmapsl is ‘00’), or by the hpgoddev (MISC<5>, odd/even page select) field, described on page 3-294.
bit 3-2 are memmapsl: Memory map select bits 1 and 0. VGA. These bits select where the video memory is mapped, as shown below:
    00 => A0000h - BFFFFh
    01 => A0000h - AFFFFh
    10 => B0000h - B7FFFh
    11 => B8000h - BFFFFh
bit 7-4 are reserved.
The current driver code sets it to 0x05, i.e. memmapsl = b01, which maps 0xA0000. But on x86 the VGA console is at 0xB8000: arch/x86/boot/compressed/misc.c defines vidmem as 0xb8000 in extract_kernel(). So it's better to configure memmapsl to b11, which means changing the value from 0x05 to 0x0d.
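To make the arithmetic concrete, here is a small sketch that decodes a GCTL6 value using the field layout and memmapsl table quoted above. The macro and function names are made up for illustration and do not exist in the mgag200 driver.

```c
#include <stdint.h>

/* Field layout as described in the GCTL6 documentation quoted above. */
#define GCTL6_GCGRMODE        0x01u /* bit 0 */
#define GCTL6_CHAINODD_EVEN   0x02u /* bit 1 */
#define GCTL6_MEMMAPSL_MASK   0x0cu /* bits 3-2 */
#define GCTL6_MEMMAPSL_SHIFT  2

struct vga_window {
	uint32_t start;
	uint32_t end; /* inclusive */
};

/* Return the address window selected by the memmapsl field. */
static struct vga_window gctl6_window(uint8_t gctl6)
{
	static const struct vga_window windows[4] = {
		{ 0xa0000, 0xbffff }, /* 00 */
		{ 0xa0000, 0xaffff }, /* 01 */
		{ 0xb0000, 0xb7fff }, /* 10 */
		{ 0xb8000, 0xbffff }, /* 11 */
	};

	return windows[(gctl6 & GCTL6_MEMMAPSL_MASK) >> GCTL6_MEMMAPSL_SHIFT];
}

/* Non-zero if addr falls inside the window selected by gctl6. */
static int gctl6_decodes(uint8_t gctl6, uint32_t addr)
{
	struct vga_window w = gctl6_window(gctl6);

	return addr >= w.start && addr <= w.end;
}
```

With these helpers, gctl6_decodes(0x05, 0xb8000) is 0 (the window is A0000h - AFFFFh) while gctl6_decodes(0x0d, 0xb8000) is 1. Both values keep gcgrmode set and chainodd even clear; only memmapsl differs.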
If some other architectures require the VGA memory to be at 0xA0000, I can write another patch that leaves the memmapsl bits unchanged (assuming the BIOS or UEFI has already set them to the right value). Another solution would be to set it to 0x0d only on x86.
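As a rough sketch of the first alternative (leaving memmapsl as firmware set it), the value written could be built by merging the current register contents with the desired non-memmapsl bits. The helper name is hypothetical, and reading the current GCTL6 value back from the hardware is assumed to be possible; this is not code from the driver.

```c
#include <stdint.h>

#define GCTL6_MEMMAPSL_MASK 0x0cu /* bits 3-2 */

/*
 * Combine the memmapsl field from the value currently in the register
 * with every other bit from the value the driver wants to program.
 */
static uint8_t gctl6_keep_memmapsl(uint8_t current, uint8_t wanted)
{
	return (uint8_t)((wanted & ~GCTL6_MEMMAPSL_MASK) |
			 (current & GCTL6_MEMMAPSL_MASK));
}
```

For example, if firmware left memmapsl at b11 (register value 0x0c) and the driver wants 0x05, the merged result is 0x0d; if firmware left b01, the result stays 0x05.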
Let me know what you think is the best way to fix it.
Signed-off-by: Jocelyn Falempe jfalempe@redhat.com
---
 drivers/gpu/drm/mgag200/mgag200_mode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index b983541a4c53..c7f63610b278 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -529,7 +529,7 @@ static void mgag200_set_format_regs(struct mga_device *mdev,
 	WREG_GFX(3, 0x00);
 	WREG_GFX(4, 0x00);
 	WREG_GFX(5, 0x40);
-	WREG_GFX(6, 0x05);
+	WREG_GFX(6, 0x0d);
 	WREG_GFX(7, 0x0f);
 	WREG_GFX(8, 0x0f);
Hello Jocelyn,
On 1/14/22 10:47, Jocelyn Falempe wrote:
On some server with MGA G200e (rev 42), booting with Legacy BIOS, The hardware hangs when using kdump and kexec into the kdump kernel. This happens when the uncompress code tries to write "Decompressing Linux" to the VGA Console.
It can be reproduced by writing to the VGA console (0xB8000) after booting to graphic mode, it generates the following error:
kernel:NMI: PCI system error (SERR) for reason a0 on CPU 0. kernel:Dazed and confused, but trying to continue
The root cause is a bad configuration of the MGA GCTL6 register
According to the GCTL6 register documentation:
bit 0 is gcgrmode: 0: Enables alpha mode, and the character generator addressing system is activated. 1: Enables graphics mode, and the character addressing system is not used.
bit 1 is chainodd even: 0: The A0 signal of the memory address bus is used during system memory addressing. 1: Allows A0 to be replaced by either the A16 signal of the system address (if memmapsl is ‘00’), or by the hpgoddev (MISC<5>, odd/even page select) field, described on page 3-294).
bit 3-2 are memmapsl: Memory map select bits 1 and 0. VGA. These bits select where the video memory is mapped, as shown below: 00 => A0000h - BFFFFh 01 => A0000h - AFFFFh 10 => B0000h - B7FFFh 11 => B8000h - BFFFFh
bit 7-4 are reserved.
Current driver code set it to 0x05 => memmapsl to b01 => 0xA0000 but on x86, the VGA console is at 0xB8000
I think this needs some rewording after imirkin's explanation that 0xA0000 is the address of the VGA video memory and 0xB8000 is the address of the VGA text buffer.
arch/x86/boot/compressed/misc.c define vidmem to 0xb8000 in extract_kernel() so it's better to configure it to b11 Thus changing the value 0x05 to 0x0d
Signed-off-by: Jocelyn Falempe jfalempe@redhat.com
 drivers/gpu/drm/mgag200/mgag200_mode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index b983541a4c53..c7f63610b278 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -529,7 +529,7 @@ static void mgag200_set_format_regs(struct mga_device *mdev,
 	WREG_GFX(3, 0x00);
 	WREG_GFX(4, 0x00);
 	WREG_GFX(5, 0x40);
-	WREG_GFX(6, 0x05);
+	WREG_GFX(6, 0x0d);
My worry is that this could cause other issues, so I would only make this change if (is_kdump_kernel()), to make it as non-intrusive as possible. Also, please add a verbose comment about why this is needed.
If you make those changes, feel free to add:
Reviewed-by: Javier Martinez Canillas javierm@redhat.com
Best regards,
On 18/01/2022 17:38, Javier Martinez Canillas wrote:
Hello Jocelyn,
On 1/14/22 10:47, Jocelyn Falempe wrote:
My worry is if this could cause other issues so I would only do this change if (is_kdump_kernel()), to make it as non intrusive as possible. And also add a verbose comment about why this is needed.
This change must be done in the "first" kernel, so that when kdump starts, it doesn't hang the machine by writing to the VGA interface, in the early boot code.
To make this change less intrusive, we can do it only on problematic hardware (G200_SE rev 42), but Thomas said it was probably not needed.
If you make those changes, feel free to add:
Reviewed-by: Javier Martinez Canillas javierm@redhat.com
Best regards,
On 1/18/22 17:52, Jocelyn Falempe wrote:
On 18/01/2022 17:38, Javier Martinez Canillas wrote:
Hello Jocelyn,
On 1/14/22 10:47, Jocelyn Falempe wrote:
My worry is if this could cause other issues so I would only do this change if (is_kdump_kernel()), to make it as non intrusive as possible. And also add a verbose comment about why this is needed.
This change must be done in the "first" kernel, so that when kdump starts, it doesn't hang the machine by writing to the VGA interface, in the early boot code.
Ah, got it. The patch then makes sense to me as is in that case.
My comment about documenting why this is needed still applies though.
Best regards,
On 18/01/2022 18:17, Javier Martinez Canillas wrote:
On 1/18/22 17:52, Jocelyn Falempe wrote:
On 18/01/2022 17:38, Javier Martinez Canillas wrote:
Hello Jocelyn,
On 1/14/22 10:47, Jocelyn Falempe wrote:
My worry is if this could cause other issues so I would only do this change if (is_kdump_kernel()), to make it as non intrusive as possible. And also add a verbose comment about why this is needed.
This change must be done in the "first" kernel, so that when kdump starts, it doesn't hang the machine by writing to the VGA interface, in the early boot code.
Ah, got it. The patch then makes sense to me as is in that case.
My comment about documenting why this is needed still applies though.
Yes, I will fix the commit message and add a comment in the code. I didn't know 0xA0000 was the graphics mode mapping, so I thought the configuration was a mistake. It turns out the current configuration is correct, but since the driver doesn't use this address, and kdump fails on some hardware if this address is not set up for VGA text mode, it's better to set it to 0xb8000.
Best regards,
Thanks,
We should probably Cc: stable@vger.kernel.org this as well, see:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for more info. As well, some useful tools for adding the appropriate Fixes: tags:
https://drm.pages.freedesktop.org/maintainer-tools/dim.html
At least on my end this is:
Acked-by: Lyude Paul lyude@redhat.com
I'd very much like Thomas Zimmermann to verify that this patch is OK with an R-b, though, before we push anything upstream.
Hi
On 18.01.22 at 20:06, Lyude Paul wrote:
We should probably Cc: stable@vger.kernel.org this as well, see:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for more info. As well, some useful tools for adding the appropriate Fixes: tags:
https://drm.pages.freedesktop.org/maintainer-tools/dim.html
At least on my end this is:
Acked-by: Lyude Paul lyude@redhat.com
I'd very much like Thomas Zimmerman to verify that this patch is OK though with an R-b before we push anything upstream.
Yep, I'll give it a try on my test system. I'll also add a TODO comment that summarizes the situation.
A real fix would detect that the kdump kernel is running and not use the display then.
Best regards Thomas
On some servers with MGA G200_SE_A (rev 42), booting with Legacy BIOS, the hardware hangs when using kdump and kexec into the kdump kernel. This happens when the uncompress code tries to write "Decompressing Linux" to the VGA Console.
It can be reproduced by writing to the VGA console (0xB8000) after booting into graphics mode, which generates the following error:
kernel:NMI: PCI system error (SERR) for reason a0 on CPU 0. kernel:Dazed and confused, but trying to continue
The root cause is the configuration of the MGA GCTL6 register
According to the GCTL6 register documentation:
bit 0 is gcgrmode:
    0: Enables alpha mode, and the character generator addressing system is activated.
    1: Enables graphics mode, and the character addressing system is not used.
bit 1 is chainodd even:
    0: The A0 signal of the memory address bus is used during system memory addressing.
    1: Allows A0 to be replaced by either the A16 signal of the system address (if memmapsl is ‘00’), or by the hpgoddev (MISC<5>, odd/even page select) field, described on page 3-294.
bit 3-2 are memmapsl: Memory map select bits 1 and 0. VGA. These bits select where the video memory is mapped, as shown below:
    00 => A0000h - BFFFFh
    01 => A0000h - AFFFFh
    10 => B0000h - B7FFFh
    11 => B8000h - BFFFFh
bit 7-4 are reserved.
Current code sets it to 0x05 => memmapsl to b01 => 0xa0000 (graphic mode).
But on x86, the VGA console is at 0xb8000 (text mode).
In arch/x86/boot/compressed/misc.c debug strings are written to 0xb8000.
As the driver doesn't use this mapping at 0xa0000, it is safe to set it to 0xb8000 instead, to avoid a kernel hang on G200_SE_A rev42 with kexec/kdump.

Thus changing the value 0x05 to 0x0d.
Signed-off-by: Jocelyn Falempe jfalempe@redhat.com
Reviewed-by: Javier Martinez Canillas javierm@redhat.com
Acked-by: Lyude Paul lyude@redhat.com
Cc: stable@vger.kernel.org
---
v2: Add clear statement that it's not the right configuration, but it prevents an annoying bug with kexec/kdump.

 drivers/gpu/drm/mgag200/mgag200_mode.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index b983541a4c53..cd9ba13ad5fc 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -529,7 +529,10 @@ static void mgag200_set_format_regs(struct mga_device *mdev,
 	WREG_GFX(3, 0x00);
 	WREG_GFX(4, 0x00);
 	WREG_GFX(5, 0x40);
-	WREG_GFX(6, 0x05);
+	/* GCTL6 should be 0x05, but we configure memmapsl to 0xb8000 (text mode),
+	 * so that it doesn't hang when running kexec/kdump on G200_SE rev42.
+	 */
+	WREG_GFX(6, 0x0d);
 	WREG_GFX(7, 0x0f);
 	WREG_GFX(8, 0x0f);
Hi,
On 19.01.22 at 11:29, Jocelyn Falempe wrote:
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index b983541a4c53..cd9ba13ad5fc 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -529,7 +529,10 @@ static void mgag200_set_format_regs(struct mga_device *mdev,
 	WREG_GFX(3, 0x00);
 	WREG_GFX(4, 0x00);
 	WREG_GFX(5, 0x40);
-	WREG_GFX(6, 0x05);
+	/* GCTL6 should be 0x05, but we configure memmapsl to 0xb8000 (text mode),
+	 * so that it doesn't hang when running kexec/kdump on G200_SE rev42.
+	 */
+	WREG_GFX(6, 0x0d);
Appears to be working on my test machine.
But please run scripts/checkpatch.pl on the patch before sending it. I get several errors:
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#98:
0: Enables alpha mode, and the character generator addressing system is
ERROR: trailing whitespace
#149: FILE: drivers/gpu/drm/mgag200/mgag200_mode.c:532:
+^I/* GCTL6 should be 0x05, but we configure memmapsl to 0xb8000 (text mode),^M$
ERROR: trailing whitespace
#150: FILE: drivers/gpu/drm/mgag200/mgag200_mode.c:533:
+^I * so that it doesn't hang when running kexec/kdump on G200_SE rev42.^M$
Best regards Thomas
WREG_GFX(7, 0x0f); WREG_GFX(8, 0x0f);
Hi
On 19.01.22 at 13:21, Thomas Zimmermann wrote:
Thanks a lot, the patch has been merged now. These problems might have been caused by my email client.
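In the checkpatch output, ^I stands for a tab and ^M$ for a carriage return right before the end of the line, which would fit a mail client rewriting line endings. A minimal sketch of that kind of check is below; the function is illustrative only and is not how checkpatch.pl itself is implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the line, ignoring one final '\n', ends in a space,
 * a tab or a carriage return; return 0 otherwise.
 */
static int has_trailing_whitespace(const char *line)
{
	size_t len = strlen(line);
	char last;

	if (len > 0 && line[len - 1] == '\n')
		len--;
	if (len == 0)
		return 0;

	last = line[len - 1];
	return last == ' ' || last == '\t' || last == '\r';
}
```

A comment line such as "\t */\r\n" is flagged, while the same line with a plain "\n" ending is not.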
Best regards Thomas
dri-devel@lists.freedesktop.org