[Advice Request] Trying to debug amdgpu fatal error
Christian König
christian.koenig at amd.com
Mon Apr 9 06:52:33 UTC 2018
Please provide the full dmesg of the system as well as the output of
"lspci -s 0000:16:00.0 -vvvv" as attachment.
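
One way to capture both as files is sketched below. The function name collect_gpu_logs and the output file names are made up for illustration. It is meant to be run as root, since dmesg may be restricted for normal users and lspci hides the capability list from them.

```shell
#!/bin/sh
# Sketch only: save the kernel log and the verbose PCI dump for one device.
# collect_gpu_logs, dmesg.txt and lspci.txt are hypothetical names.
collect_gpu_logs() {
    dev="${1:-0000:16:00.0}"   # PCI address taken from the probe error
    out="${2:-.}"              # directory to write the two files into
    mkdir -p "$out" || return 1
    # Full kernel ring buffer, not just the amdgpu lines.
    dmesg > "$out/dmesg.txt"
    # Verbose device dump, including BAR sizes and capabilities.
    lspci -s "$dev" -vvvv > "$out/lspci.txt"
}
```

For example, "collect_gpu_logs 0000:16:00.0 /tmp/amdgpu-report" would leave dmesg.txt and lspci.txt in /tmp/amdgpu-report, ready to attach.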
Thanks,
Christian.
On 09.04.2018 at 06:00, Andrey Grodzovsky wrote:
>
> Just from a quick look, it seems to fail in amdgpu_device_init->ioremap
> with ENOMEM, which would explain why you don't see any more prints:
> this failure happens very early in the device init process.
>
> No idea why ioremap would fail in this case and not even sure which
> implementation of ioremap to look into for your case.
>
> Adding Christian for this.
>
> Andrey
>
>
> On 04/07/2018 03:16 AM, Daniel Moran wrote:
>> Also, to clarify: if I move it into a regular slot and turn off the
>> eGPU, it works as expected.
>> Tested with the Intel iGPU both enabled and disabled, and made sure i915
>> loaded without error and that I can connect a display to it.
>>
>>
>>
>> Again, thank you in advance for any time/support offered.
>>
>> Respectfully,
>> Daniel S. Moran (garwynn)
>> PC Hardware Editor - XDA-Developers
>> Phone: 1-559-316-0760/+81-90-5484-4155
>> Article Links: http://www.xda-developers.com/author/garwynn
>> E-mail: xdagarwynn at gmail.com | Twitter: @xdagarwynn
>>
>> On Sat, Apr 7, 2018 at 3:58 PM, Daniel Moran <xdagarwynn at gmail.com> wrote:
>>
>> Hello all,
>>
>> I've got a Powercolor Red Devil Vega 56 here that I'm trying to
>> get working in eGPU mode.
>> I think on the BIOS/hardware side it's now all fleshed out.
>> Now I'm at a point where amdgpu tries to init and reaches a fatal
>> error.
>>
>> Setting loglevel=8 doesn't produce any additional messages.
>> Here's what it does report (full dmesg attached):
>>
>> [ 429.005909] [drm] amdgpu kernel modesetting enabled.
>> [ 429.006080] [drm] initializing kernel modesetting (VEGA10 0x1002:0x687F 0x148C:0x2388 0xC3).
>> [ 429.006082] amdgpu 0000:16:00.0: Fatal error during GPU init
>> [ 429.006155] amdgpu: probe of 0000:16:00.0 failed with error -12
>>
>> I'm using the following commands to unload and reload the module for
>> testing. Since the card is an eGPU, I'm using the i7-7700K iGPU (i915
>> module) as the primary, so these commands work in a terminal without
>> requiring a reboot.
>>
>> sudo rmmod amdgpu
>> sudo modprobe -v amdgpu
>>
>> Pulled UMR and tried to build it, but it fails at the CMake step. I'll
>> attach the log as a text file.
>> I'll also attach a full dmesg and lspci dump. uname -a below:
>> Linux testbox 4.15.15-041515-generic #201803311331 SMP Sat Mar
>> 31 17:34:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Any other ideas on how I can debug this further? I feel I'm so
>> close and don't want to let this go.
>> Thank you in advance for your time.
>>
>> Respectfully,
>> Daniel S. Moran (garwynn)
>> PC Hardware Editor - XDA-Developers
>> Phone: 1-559-316-0760/+81-90-5484-4155
>> Article Links: http://www.xda-developers.com/author/garwynn
>> E-mail: xdagarwynn at gmail.com | Twitter: @xdagarwynn
>>
>>
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2018-04-07 16-08-59.png
Type: image/png
Size: 60529 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180409/64034484/attachment-0001.png>