[error] Drm -> amdgpu Unrecoverable Machine Check
Yusuf Altıparmak
yusufalti1997 at gmail.com
Mon Dec 2 13:32:19 UTC 2019
>
>
> I attached my dts file.
>
> System is working fine when GPU is not plugged in.
>
> *This is the last console log before freeze:*
> [drm] amdgpu kernel modesetting enabled.
> [drm] initializing kernel modesetting (POLARIS12 0x1002:0x6987 0x1787:0x2389 0x80).
> [drm] register mmio base: 0x20200000
> fsl-fman-port ffe488000.port fm1-gb0: renamed from eth0
> [drm] register mmio size: 262144
> [drm] add ip block number 0 <vi_common>
> [drm] add ip block number 1 <gmc_v8_0>
> [drm] add ip block number 2 <tonga_ih>
> [drm] add ip block number 3 <powerplay>
> [drm] add ip block number 4 <dm>
> [drm] add ip block number 5 <gfx_v8_0>
> [drm] add ip block number 6 <sdma_v3_0>
> [drm] add ip block number 7 <uvd_v6_0>
> [drm] add ip block number 8 <vce_v3_0>
> [drm] UVD is enabled in VM mode
> [drm] UVD ENC is enabled in VM mode
> [drm] VCE enabled in VM mode
> ATOM BIOS: 113-ER16BFC-001
> [drm] GPU posting now...
> Disabling lock debugging due to kernel taint
> Machine check in kernel mode.
> Caused by (from MCSR=a000): Load Error Report
> Guarded Load Error Report
> Kernel panic - not syncing: Unrecoverable Machine check
> CPU: 1 PID: 2023 Comm: udevd Tainted: G M 4.19.26+gc0c2141 #1
> Call Trace:
>
Christian König <ckoenig.leichtzumerken at gmail.com> wrote on Mon, 2 Dec 2019 at 15:28:
> Hi Yusuf,
>
> On 02.12.19 at 12:41, Yusuf Altıparmak wrote:
>
> My embedded board is freezing when I put E9171 on PCIe. What is the
> meaning of Unrecoverable Machine Check error about GPU?
>
>
> Well see the explanation on Wikipedia for example:
> https://en.wikipedia.org/wiki/Machine-check_exception
>
> In general it means you have messed up something in your hardware
> configuration.
>
> Could PCIe settings in .dts file cause this problem?
>
>
> Possible, but rather unlikely. My best guess is that it is some problem
> with the power supply.
>
> If it is, is there any sample PCIe configuration for E9171?
>
>
> The E9171 is just a PCIe device, so the dtsi is actually rather
> uninteresting. What we really need is a full dmesg and maybe lspci output
> would help as well.
>
> Regards,
> Christian.
>
Hi Christian,
First, I am using an NXP T1042D4RDB-64B, which is documented as having a
256 MB PCIe buffer. The PCIe memory range was set to 256 MB in the .dts file
and in the U-Boot configuration file. With that setup the driver failed with
error -12 (ENOMEM, out of memory), but I was able to reach the Linux console.
[ 5.512922] [drm] amdgpu kernel modesetting enabled.
[ 5.517065] [drm] initializing kernel modesetting (POLARIS12
0x1002:0x6987 0x1787:0x2389 0x80).
[ 5.524507] amdgpu 0001:01:00.0: Fatal error during GPU init
[ 5.529296] amdgpu: probe of 0001:01:00.0 failed with error -12
Then I changed 256 MB to 4 GB in the .dtsi and in the U-Boot configuration
file. I also changed the I/O size from 64 KB to 1 MB. With these settings I
was not able to reach the Linux console because the board froze, although the
driver was successful this time. The console log from that successful driver
load is the one quoted above.
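For reference, the MCSR=a000 value in the panic above can be split into its flag bits. The sketch below is only an illustration: the bit values are assumptions recalled from the Linux Book-E register definitions, chosen because they reproduce the two messages the kernel printed, and should be checked against the kernel tree and the core reference manual. `decode_mcsr` is a hypothetical helper, not a kernel function.

```python
# Assumed MCSR bit values (not taken from this thread); verify before relying on them.
MCSR_BITS = {
    0x00008000: "Load Error Report",
    0x00004000: "Store Error Report",
    0x00002000: "Guarded Load Error Report",
}


def decode_mcsr(value):
    """Return (names of known bits set in value, remaining unknown bits)."""
    names = [name for bit, name in MCSR_BITS.items() if value & bit]
    known_mask = 0
    for bit in MCSR_BITS:
        known_mask |= bit
    return names, value & ~known_mask


names, unknown = decode_mcsr(0xA000)
print(names, hex(unknown))
```

Under these assumptions, 0xa000 is exactly the two load-error bits with nothing left over. A guarded load is usually a read from guarded, cache-inhibited memory such as device MMIO, so one possible reading is that a register read from the GPU did not complete; the log itself does not confirm this.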
*This is the output of lspci -v when the GPU is plugged in and the memory size is 256 MB:*
root@t1042d4rdb-64b:~# lspci -v
0000:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 0824 (rev 11) (prog-if 00 [Normal decode])
	Device tree node: /sys/firmware/devicetree/base/pcie@ffe240000/pcie@0
	Flags: bus master, fast devsel, latency 0, IRQ 20
	Memory at <ignored> (32-bit, non-prefetchable)
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00000000-0000ffff [size=64K]
	Memory behind bridge: e0000000-efffffff [size=256M]
	Prefetchable memory behind bridge: None
	Capabilities: [44] Power Management version 3
	Capabilities: [4c] Express Root Port (Slot-), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport

0001:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 0824 (rev 11) (prog-if 00 [Normal decode])
	Device tree node: /sys/firmware/devicetree/base/pcie@ffe250000/pcie@0
	Flags: bus master, fast devsel, latency 0, IRQ 21
	Memory at <ignored> (32-bit, non-prefetchable)
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00000000-0000ffff [size=64K]
	Memory behind bridge: e0000000-efffffff [size=256M]
	Prefetchable memory behind bridge: None
	Capabilities: [44] Power Management version 3
	Capabilities: [4c] Express Root Port (Slot-), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport

0001:01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Lexa [Radeon E9171 MCM] (rev 80) (prog-if 00 [VGA controller])
	Subsystem: Hightech Information System Ltd. Device 2389
	Flags: fast devsel, IRQ 41
	Memory at c10000000 (64-bit, prefetchable) [size=256M]
	Memory at <ignored> (64-bit, prefetchable)
	I/O ports at 1100 [size=256]
	Memory at <ignored> (32-bit, non-prefetchable)
	Expansion ROM at <ignored> [disabled]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [200] Resizable BAR <?>
	Capabilities: [270] Secondary PCI Express <?>
	Capabilities: [2b0] Address Translation Service (ATS)
	Capabilities: [2c0] Page Request Interface (PRI)
	Capabilities: [2d0] Process Address Space ID (PASID)
	Capabilities: [320] Latency Tolerance Reporting
	Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [370] L1 PM Substates
	Kernel modules: amdgpu

0001:01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device aae0
	Subsystem: Hightech Information System Ltd. Device aae0
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at <ignored> (64-bit, non-prefetchable)
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [328] Alternative Routing-ID Interpretation (ARI)

0002:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 0824 (rev 11) (prog-if 00 [Normal decode])
	Device tree node: /sys/firmware/devicetree/base/pcie@ffe260000/pcie@0
	Flags: bus master, fast devsel, latency 0, IRQ 22
	Memory at <ignored> (32-bit, non-prefetchable)
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00000000-0000ffff [size=64K]
	Memory behind bridge: e0000000-efffffff [size=256M]
	Prefetchable memory behind bridge: None
	Capabilities: [44] Power Management version 3
	Capabilities: [4c] Express Root Port (Slot-), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport

0003:00:00.0 PCI bridge: Freescale Semiconductor Inc Device 0824 (rev 11) (prog-if 00 [Normal decode])
	Device tree node: /sys/firmware/devicetree/base/pcie@ffe270000/pcie@0
	Flags: bus master, fast devsel, latency 0, IRQ 23
	Memory at <ignored> (32-bit, non-prefetchable)
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	I/O behind bridge: 00000000-0000ffff [size=64K]
	Memory behind bridge: e0000000-efffffff [size=256M]
	Prefetchable memory behind bridge: None
	Capabilities: [44] Power Management version 3
	Capabilities: [4c] Express Root Port (Slot-), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport
*And this is the PCIe dmesg output when the memory range is 256 MB. It gives
the same messages when the memory range is set to 4 GB:*
PCI host bridge /pcie@ffe240000 ranges:
MEM 0x0000000c00000000..0x0000000c0fffffff -> 0x00000000e0000000
IO 0x0000000ff8000000..0x0000000ff800ffff -> 0x0000000000000000
/pcie@ffe240000: PCICSRBAR @ 0xff000000
setup_pci_atmu: end of DRAM 200000000
/pcie@ffe240000: Setup 64-bit PCI DMA window
/pcie@ffe240000: WARNING: Outbound window cfg leaves gaps in memory map. Adjusting the memory map could reduce unnecessary bounce buffering.
/pcie@ffe240000: DMA window size is 0xe0000000
Found FSL PCI host bridge at 0x0000000ffe250000. Firmware bus number: 0->1
PCI host bridge /pcie@ffe250000 ranges:
MEM 0x0000000c10000000..0x0000000c1fffffff -> 0x00000000e0000000
IO 0x0000000ff8010000..0x0000000ff801ffff -> 0x0000000000000000
/pcie@ffe250000: PCICSRBAR @ 0xff000000
setup_pci_atmu: end of DRAM 200000000
/pcie@ffe250000: Setup 64-bit PCI DMA window
/pcie@ffe250000: WARNING: Outbound window cfg leaves gaps in memory map. Adjusting the memory map could reduce unnecessary bounce buffering.
/pcie@ffe250000: DMA window size is 0xe0000000
Found FSL PCI host bridge at 0x0000000ffe260000. Firmware bus number: 0->0
PCI host bridge /pcie@ffe260000 ranges:
MEM 0x0000000c20000000..0x0000000c2fffffff -> 0x00000000e0000000
IO 0x0000000ff8020000..0x0000000ff802ffff -> 0x0000000000000000
/pcie@ffe260000: PCICSRBAR @ 0xff000000
setup_pci_atmu: end of DRAM 200000000
/pcie@ffe260000: Setup 64-bit PCI DMA window
/pcie@ffe260000: WARNING: Outbound window cfg leaves gaps in memory map. Adjusting the memory map could reduce unnecessary bounce buffering.
/pcie@ffe260000: DMA window size is 0xe0000000
Found FSL PCI host bridge at 0x0000000ffe270000. Firmware bus number: 0->0
PCI host bridge /pcie@ffe270000 ranges:
MEM 0x0000000c30000000..0x0000000c3fffffff -> 0x00000000e0000000
IO 0x0000000ff8030000..0x0000000ff803ffff -> 0x0000000000000000
/pcie@ffe270000: PCICSRBAR @ 0xff000000
setup_pci_atmu: end of DRAM 200000000
/pcie@ffe270000: Setup 64-bit PCI DMA window
/pcie@ffe270000: WARNING: Outbound window cfg leaves gaps in memory map. Adjusting the memory map could reduce unnecessary bounce buffering.
/pcie@ffe270000: DMA window size is 0xe0000000
iommu: Adding device ff6000000.qman-portal to group 0
iommu: Adding device ff6004000.qman-portal to group 1
iommu: Adding device ff6008000.qman-portal to group 2
iommu: Adding device ff600c000.qman-portal to group 3
iommu: Adding device ff6010000.qman-portal to group 4
iommu: Adding device ff6014000.qman-portal to group 5
iommu: Adding device ff6018000.qman-portal to group 6
iommu: Adding device ff601c000.qman-portal to group 7
iommu: Adding device ff6020000.qman-portal to group 8
iommu: Adding device ff6024000.qman-portal to group 9
iommu: Adding device ffe100300.dma to group 10
iommu: Adding device ffe101300.dma to group 11
iommu: Adding device ffe114000.sdhc to group 12
iommu: Adding device ffe210000.usb to group 13
iommu: Adding device ffe211000.usb to group 14
iommu: Adding device ffe220000.sata to group 15
iommu: Adding device ffe221000.sata to group 16
iommu: Adding device ffe318000.qman to group 17
iommu: Adding device ffe31a000.bman to group 18
iommu: Adding device ffe240000.pcie to group 19
iommu: Adding device ffe250000.pcie to group 20
iommu: Adding device ffe260000.pcie to group 21
iommu: Adding device ffe270000.pcie to group 22
iommu: Adding device ffe140000.qe to group 23
software IO TLB: mapped [mem 0xfbfff000-0xfffff000] (64MB)
PCI: Probing PCI hardware
fsl-pci ffe240000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [io 0x8000080000010000-0x800008000001ffff] (bus address [0x0000-0xffff])
pci_bus 0000:00: root bus resource [mem 0xc00000000-0xc0fffffff] (bus address [0xe0000000-0xefffffff])
pci_bus 0000:00: root bus resource [bus 00]
iommu: Removing device ffe240000.pcie from group 19
iommu: Adding device 0000:00:00.0 to group 24
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
fsl-pci ffe250000.pcie: PCI host bridge to bus 0001:00
pci_bus 0001:00: root bus resource [io 0x8000080000021000-0x8000080000030fff] (bus address [0x0000-0xffff])
pci_bus 0001:00: root bus resource [mem 0xc10000000-0xc1fffffff] (bus address [0xe0000000-0xefffffff])
pci_bus 0001:00: root bus resource [bus 00-01]
iommu: Removing device ffe250000.pcie from group 20
iommu: Adding device 0001:00:00.0 to group 19
pci 0001:01:00.0: enabling Extended Tags
pci 0001:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5 GT/s x1 link at 0001:00:00.0 (capable of 63.008 Gb/s with 8 GT/s x8 link)
iommu: Adding device 0001:01:00.0 to group 19
pci 0001:01:00.1: enabling Extended Tags
iommu: Adding device 0001:01:00.1 to group 19
pci 0001:00:00.0: PCI bridge to [bus 01-ff]
fsl-pci ffe260000.pcie: PCI host bridge to bus 0002:00
pci_bus 0002:00: root bus resource [io 0x8000080000032000-0x8000080000041fff] (bus address [0x0000-0xffff])
pci_bus 0002:00: root bus resource [mem 0xc20000000-0xc2fffffff] (bus address [0xe0000000-0xefffffff])
pci_bus 0002:00: root bus resource [bus 00]
iommu: Removing device ffe260000.pcie from group 21
iommu: Adding device 0002:00:00.0 to group 20
pci 0002:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0002:00:00.0: PCI bridge to [bus 01-ff]
fsl-pci ffe270000.pcie: PCI host bridge to bus 0003:00
pci_bus 0003:00: root bus resource [io 0x8000080000043000-0x8000080000052fff] (bus address [0x0000-0xffff])
pci_bus 0003:00: root bus resource [mem 0xc30000000-0xc3fffffff] (bus address [0xe0000000-0xefffffff])
pci_bus 0003:00: root bus resource [bus 00]
iommu: Removing device ffe270000.pcie from group 22
iommu: Adding device 0003:00:00.0 to group 21
pci 0003:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0003:00:00.0: PCI bridge to [bus 01-ff]
PCI: Cannot allocate resource region 0 of device 0000:00:00.0, will remap
PCI: Cannot allocate resource region 0 of device 0001:00:00.0, will remap
PCI: Cannot allocate resource region 2 of device 0001:01:00.0, will remap
PCI: Cannot allocate resource region 5 of device 0001:01:00.0, will remap
PCI: Cannot allocate resource region 6 of device 0001:01:00.0, will remap
PCI: Cannot allocate resource region 0 of device 0001:01:00.1, will remap
PCI: Cannot allocate resource region 0 of device 0002:00:00.0, will remap
PCI: Cannot allocate resource region 0 of device 0003:00:00.0, will remap
pci 0000:00:00.0: BAR 0: no space for [mem size 0x01000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x01000000]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0: bridge window [io 0x8000080000010000-0x800008000001ffff]
pci 0000:00:00.0: bridge window [mem 0xc00000000-0xc0fffffff]
pci_bus 0000:00: Some PCI device resources are unassigned, try booting with pci=realloc
pci 0001:00:00.0: BAR 0: no space for [mem size 0x01000000]
pci 0001:00:00.0: BAR 0: failed to assign [mem size 0x01000000]
pci 0001:00:00.0: BAR 9: no space for [mem size 0x00200000 64bit pref]
pci 0001:00:00.0: BAR 9: failed to assign [mem size 0x00200000 64bit pref]
pci 0001:01:00.0: BAR 2: no space for [mem size 0x00200000 64bit pref]
pci 0001:01:00.0: BAR 2: failed to assign [mem size 0x00200000 64bit pref]
pci 0001:01:00.0: BAR 5: no space for [mem size 0x00040000]
pci 0001:01:00.0: BAR 5: failed to assign [mem size 0x00040000]
pci 0001:01:00.0: BAR 6: no space for [mem size 0x00020000 pref]
pci 0001:01:00.0: BAR 6: failed to assign [mem size 0x00020000 pref]
pci 0001:01:00.1: BAR 0: no space for [mem size 0x00004000 64bit]
pci 0001:01:00.1: BAR 0: failed to assign [mem size 0x00004000 64bit]
pci 0001:00:00.0: PCI bridge to [bus 01]
pci 0001:00:00.0: bridge window [io 0x8000080000021000-0x8000080000030fff]
pci 0001:00:00.0: bridge window [mem 0xc10000000-0xc1fffffff]
pci_bus 0001:00: Some PCI device resources are unassigned, try booting with pci=realloc
pci 0002:00:00.0: BAR 0: no space for [mem size 0x01000000]
pci 0002:00:00.0: BAR 0: failed to assign [mem size 0x01000000]
pci 0002:00:00.0: PCI bridge to [bus 01]
pci 0002:00:00.0: bridge window [io 0x8000080000032000-0x8000080000041fff]
pci 0002:00:00.0: bridge window [mem 0xc20000000-0xc2fffffff]
pci_bus 0002:00: Some PCI device resources are unassigned, try booting with pci=realloc
pci 0003:00:00.0: BAR 0: no space for [mem size 0x01000000]
pci 0003:00:00.0: BAR 0: failed to assign [mem size 0x01000000]
pci 0003:00:00.0: PCI bridge to [bus 01]
pci 0003:00:00.0: bridge window [io 0x8000080000043000-0x8000080000052fff]
pci 0003:00:00.0: bridge window [mem 0xc30000000-0xc3fffffff]
pci_bus 0003:00: Some PCI device resources are unassigned, try booting with pci=realloc
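
The "no space" messages for domain 0001 can be checked with simple arithmetic. The sketch below is a rough illustration using only sizes copied from the lspci and dmesg output above; the helper names are made up for this example. It just sums the requested sizes and ignores alignment and bridge-window granularity, so the real shortfall may be larger.

```python
# Root bus memory window for 0001:00, from the dmesg output (end is inclusive).
WINDOW_START = 0xC10000000
WINDOW_END = 0xC1FFFFFFF

# Memory requests behind that window, sizes copied from the logs.
# Bridge "BAR 9" is the prefetchable bridge window that would hold GPU BAR 2,
# so it is not counted separately here.
REQUESTS = [
    ("GPU BAR 0 (64-bit pref, the only one assigned)", 0x10000000),
    ("bridge BAR 0", 0x01000000),
    ("GPU BAR 2 (64-bit pref)", 0x00200000),
    ("GPU BAR 5", 0x00040000),
    ("GPU BAR 6 (expansion ROM)", 0x00020000),
    ("audio BAR 0 (64-bit)", 0x00004000),
]


def window_size(start, end):
    """Size in bytes of an inclusive address range."""
    return end - start + 1


def shortfall(requests, size):
    """Bytes requested beyond the window size, ignoring alignment."""
    total = sum(request_size for _, request_size in requests)
    return max(0, total - size)


size = window_size(WINDOW_START, WINDOW_END)
missing = shortfall(REQUESTS, size)
print(f"window size: {size >> 20} MiB")
print(f"shortfall:   at least {missing / 2**20:.1f} MiB")
```

The 256 MB GPU BAR 0 alone fills the whole 256 MB window, which matches the log: every other region on that bus fails to get an address.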