[Advice Request] Trying to debug amdgpu fatal error

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Mon Apr 9 04:00:46 UTC 2018


From a quick look, it seems to fail in amdgpu_device_init->ioremap 
with ENOMEM (the -12 in the probe message). That would explain why you 
don't see any more prints: this failure is very early in the device 
init process.
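
The -12 in the probe message is a negated errno value. As a minimal sketch (the helper name `decode_probe_error` is hypothetical, and the regular expression only assumes the message format quoted further down), the code can be mapped back to its symbolic name like this:

```python
import errno
import os
import re

# Matches the kernel's "probe of <device> failed with error <code>" message.
PROBE_RE = re.compile(r"probe of (\S+) failed with error (-?\d+)")

def decode_probe_error(line):
    """Return (device, errno name, description) for a probe-failure line, or None."""
    match = PROBE_RE.search(line)
    if match is None:
        return None
    device = match.group(1)
    code = abs(int(match.group(2)))  # the kernel reports errno values negated
    name = errno.errorcode.get(code, "UNKNOWN")
    return device, name, os.strerror(code)

line = "[  429.006155] amdgpu: probe of 0000:16:00.0 failed with error -12"
print(decode_probe_error(line))
```

The same lookup works for any other probe failure in dmesg, which helps when deciding which error path in the driver to read.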

I have no idea why ioremap would fail in this case, and I'm not even sure 
which implementation of ioremap to look into for your case.
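
A possible next check, assuming the failing mapping is one of the card's PCI BARs, is whether those BARs were assigned addresses at all. The sketch below parses the text of the sysfs `resource` file for a PCI device (each line holds start, end and flags in hex). The function name `parse_pci_resources` and the sample values are made up for illustration and are not taken from the attached logs.

```python
IORESOURCE_MEM = 0x200  # memory-resource flag bit from the kernel's ioport.h

def parse_pci_resources(text):
    """Parse sysfs 'resource' text into a list of (index, start, size, is_mem)."""
    regions = []
    for index, raw in enumerate(text.splitlines()):
        fields = raw.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        # An all-zero line means the region is unused or was never assigned.
        size = 0 if (start == 0 and end == 0) else end - start + 1
        regions.append((index, start, size, bool(flags & IORESOURCE_MEM)))
    return regions

# Hypothetical sample; on a real system read
# /sys/bus/pci/devices/0000:16:00.0/resource instead.
sample = """\
0x00000000a0000000 0x00000000afffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000b0000000 0x00000000b01fffff 0x000000000014220c
"""

for index, start, size, is_mem in parse_pci_resources(sample):
    state = "unassigned" if size == 0 else f"{size // 1024} KiB at {start:#x}"
    print(f"region {index}: {state}")
```

If a region the driver needs shows up as unassigned, the problem is more likely in how the platform allocated address space for the device than in the driver itself.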

Adding Christian for this.

Andrey


On 04/07/2018 03:16 AM, Daniel Moran wrote:
> Also, to clarify... if I move it into a regular slot and turn off the 
> eGPU, it works as expected.
> Tested with Intel iGPU enabled and disabled, made sure i915 loaded 
> without error and can connect display to it.
>
>
>
> Again, thank you in advance for any time/support offered.
>
> Respectfully,
> Daniel S. Moran (garwynn)
> PC Hardware Editor - XDA-Developers
> Phone: 1-559-316-0760/+81-90-5484-4155
> Article Links: http://www.xda-developers.com/author/garwynn
> E-mail: xdagarwynn at gmail.com | Twitter: @xdagarwynn
>
> On Sat, Apr 7, 2018 at 3:58 PM, Daniel Moran <xdagarwynn at gmail.com> wrote:
>
>     Hello all,
>
>     I've got a Powercolor Red Devil Vega 56 here that I'm trying to
>     get working in eGPU mode.
>     I think on the BIOS/hardware side it's now all fleshed out.
>     Now I'm at a point where amdgpu tries to init and reaches a fatal
>     error.
>
>     Setting loglevel=8 doesn't produce any additional messages.
>     Here's what it does report (full dmesg attached):
>
>     [  429.005909] [drm] amdgpu kernel modesetting enabled.
>     [  429.006080] [drm] initializing kernel modesetting (VEGA10
>     0x1002:0x687F 0x148C:0x2388 0xC3).
>     [  429.006082] amdgpu 0000:16:00.0: Fatal error during GPU init
>     [  429.006155] amdgpu: probe of 0000:16:00.0 failed with error -12
>
>     Using the following commands to unload & reload for testing. Since
>     it's as an eGPU I'm using the i7-7700K iGPU (i915 module) as the
>     primary and these commands work in terminal without requiring a
>     reboot.
>
>     sudo rmmod amdgpu
>     sudo modprobe -v amdgpu
>
>     I pulled UMR and tried to build it, but it fails at the CMake step.
>     I'll attach the log as a text file.
>     Also will attach a full dmesg and lspci dump. uname -a below:
>     Linux testbox 4.15.15-041515-generic #201803311331 SMP Sat Mar 31
>     17:34:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
>     Any other ideas on how I can debug this further? Feel I'm so
>     close, don't want to let this go.
>     Thank you in advance for your time.
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
