[Bug 101946] Rebinding AMDGPU causes initialization errors [R9 290 / 4.10 kernel]

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Jul 27 11:46:28 UTC 2017


https://bugs.freedesktop.org/show_bug.cgi?id=101946

            Bug ID: 101946
           Summary: Rebinding AMDGPU causes initialization errors [R9 290
                    / 4.10 kernel]
           Product: DRI
           Version: XOrg git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: beanow at oscp.info

Created attachment 133068
  --> https://bugs.freedesktop.org/attachment.cgi?id=133068&action=edit
The script used to reproduce the error.

As I attempted to hotplug my R9 290 for a VM gaming setup, I stumbled on this
issue.

The main error that shows up in kern.log is:

> [  160.013733] [drm:ci_dpm_enable [amdgpu]] *ERROR* ci_start_dpm failed
> [  160.014134] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <amdgpu_powerplay> failed -22
> [  160.014531] amdgpu 0000:01:00.0: amdgpu_init failed


My setup uses a Kaby Lake iGPU running i915, with the R9 290 bound to either
vfio-pci or amdgpu.
OS: Ubuntu 17.04 (4.10.0-28-generic).
Mesa: 17.1.4 from the padoka stable PPA.


I'm able to reproduce this as follows.

1. Boot with vfio-pci capturing the card and amdgpu blacklisted. Kernel flags:
> intel_iommu=on iommu=pt vfio-pci.ids=1002:67b1,1002:aac8

2. Since I run GNOME 3 on Ubuntu 17.04, this brings me to a Wayland greeter
which uses my iGPU. Drop to a free TTY without logging in. This prevents Xorg
from responding to the AMD card becoming available.

3. Run the attached script "rebind-amd.sh" as root to bind back and forth
between vfio-pci and amdgpu in an infinite loop.

This will:

A. modprobe both drivers to be sure they're loaded.
B. Print information about the driver and card usage.
C. Use the new_id > unbind > bind > remove_id sequence to switch drivers.
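
For readers without access to the attachment, the sequence in step C can be
sketched roughly as below. This is a minimal sketch and not the attached
rebind-amd.sh: the function name switch_driver and the SYSFS_ROOT variable are
hypothetical, the PCI address 0000:01:00.0 is taken from the kern.log lines
above, and 1002 67b1 is the first vendor:device pair from the vfio-pci.ids flag.

```shell
#!/bin/sh
# Sketch only: NOT the attached rebind-amd.sh.
# SYSFS_ROOT is a hypothetical override so the function can be dry-run
# against a scratch directory; it defaults to the real sysfs mount.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# switch_driver <pci-address> <vendor-id> <device-id> <target-driver>
# Follows the order described in the report:
#   new_id > unbind > bind > remove_id
switch_driver() {
    addr=$1
    vendor=$2
    device=$3
    target=$4
    drv_dir="$SYSFS_ROOT/bus/pci/drivers/$target"
    dev_dir="$SYSFS_ROOT/bus/pci/devices/$addr"

    # 1. Tell the target driver that it may claim this vendor:device pair.
    echo "$vendor $device" > "$drv_dir/new_id"

    # 2. Detach the device from whichever driver currently holds it, if any.
    if [ -e "$dev_dir/driver/unbind" ]; then
        echo "$addr" > "$dev_dir/driver/unbind"
    fi

    # 3. Attach the device to the target driver.
    echo "$addr" > "$drv_dir/bind"

    # 4. Remove the dynamic ID again so it does not linger on the driver.
    echo "$vendor $device" > "$drv_dir/remove_id"
}
```

Run as root, a call such as `switch_driver 0000:01:00.0 1002 67b1 amdgpu`
would correspond to one vfio-pci -> amdgpu transition. Error handling is
left out on purpose to keep the four sysfs writes visible.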

What happens is:

vfio-pci -> vfio-pci: No problems, of course.
vfio-pci -> amdgpu: Works. The amdgpu driver initializes the card and the
attached monitor(s) start searching for a signal.
amdgpu -> vfio-pci: Works without problems, since no Xorg is using the dGPU.
vfio-pci -> amdgpu: Fails to initialize the dGPU with the kernel error above.
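
To confirm which driver holds the card after each of these transitions, the
sysfs "driver" symlink of the device can be read. The helper below is only a
sketch: the name current_driver and the SYSFS_ROOT override are hypothetical,
and the address 0000:01:00.0 comes from the kern.log lines above.

```shell
#!/bin/sh
# Sketch only: a hypothetical helper, not part of the attached script.
# SYSFS_ROOT defaults to the real sysfs mount and can be overridden for
# a dry run against a scratch directory.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# current_driver <pci-address>
# Prints the name of the driver bound to the device, or "none".
current_driver() {
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last path
        # component is the driver name (e.g. vfio-pci or amdgpu).
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
```

For example, `current_driver 0000:01:00.0` can be called before and after
each rebind to see whether the switch took effect.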


I've attached the script, the output of the script and the full kern.log.
