GPU passthrough support for Stoney [Radeon R2/R3/R4/R5 Graphics]?

Thu May 23 16:55:31 UTC 2019

On Wed, May 22, 2019 at 6:11 PM Alex Deucher <alexdeucher at gmail.com> wrote:
>
> On Wed, May 22, 2019 at 7:00 PM Micah Morton <mortonm at chromium.org> wrote:
> >
> > On Wed, May 22, 2019 at 1:39 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >
> > > On Tue, May 21, 2019 at 1:46 PM Micah Morton <mortonm at chromium.org> wrote:
> > > >
> > > > On Fri, May 17, 2019 at 9:59 AM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > >
> > > > > On Fri, May 17, 2019 at 11:36 AM Micah Morton <mortonm at chromium.org> wrote:
> > > > > >
> > > > > > On Thu, May 16, 2019 at 1:39 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, May 16, 2019 at 4:07 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, May 15, 2019 at 7:19 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, May 15, 2019 at 2:26 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi folks,
> > > > > > > > > >
> > > > > > > > > > I'm interested in running a VM on a system with an integrated Stoney
> > > > > > > > > > [Radeon R2/R3/R4/R5 Graphics] card and passing through the graphics
> > > > > > > > > > card to the VM using the IOMMU. I'm wondering whether this is feasible
> > > > > > > > > > and supposed to be doable with the right setup (as opposed to passing
> > > > > > > > > > a discrete GPU to the VM, which I think is definitely doable?).
> > > > > > > > > >
> > > > > > > > > > So far, I can do all the qemu/kvm/vfio/iommu stuff to run the VM and
> > > > > > > > > > pass the integrated GPU to it, but the drm driver in the VM fails
> > > > > > > > > > during amdgpu_device_init(). Specifically, the logs show the SMU being
> > > > > > > > > > unresponsive, which leads to a 'SMU firmware load failed' error
> > > > > > > > > > message and kernel panic. I can share VM logs and the invocation of
> > > > > > > > > > qemu and such if helpful, but first wanted to know at a high level if
> > > > > > > > > > this should be feasible?
> > > > > > > > > >
> > > > > > > > > > P.S.: I'm not initializing the GPU in the host bios or host kernel at
> > > > > > > > > > all, so I should be passing a fresh GPU to the VM. Also, I'm pretty
> > > > > > > > > > sure I'm running the correct VGA bios for this GPU in the guest VM
> > > > > > > > > > bios before guest boot.
> > > > > > > > > >
> > > > > > > > > > Any comments/suggestions would be appreciated!
> > > > > > > > >
> > > > > > > > > > It should work at least once as long as your VM is properly set up.
> > > > > > > >
> > > > > > > > Is there any reason running coreboot vs UEFI at host boot would make a
> > > > > > > > difference? I was running a modified version of coreboot that avoids
> > > > > > > > doing any GPU initialization in firmware -- so the first POST happens
> > > > > > > > inside the guest.
> > > > > > >
> > > > > > > The GPU on APUs shares a bunch of resources with the CPU.  There are a
> > > > > > > bunch of blocks which are shared and need to be initialized on both
> > > > > > > for everything to work properly.
> > > > > >
> > > > > > Interesting. So skipping running the vbios in the host and waiting
> > > > > > until running it for the first time in the guest SeaBIOS is a bad
> > > > > > idea? Would it be better to let APU+CPU initialize normally in the
> > > > > > host and then skip trying to run the vbios in guest SeaBIOS and just
> > > > > > do some kind of reset before the drm driver starts accessing it from
> > > > > > the guest?
> > > > >
> > > > > If you let the sbios initialize things, it should work.  The driver
> > > > > > will do the right thing to init the card when it loads whether it's
> > > > > running on bare metal or in a VM.  We've never tested any scenarios
> > > > > where the GPU on APUs is not handled by the sbios.  Note that the GPU
> > > > > does not have to be posted per se, it just needs to have been properly
> > > > > taken into account when the sbios comes up so that shared components
> > > > > are initialized correctly.  I don't know what your patched system does
> > > > > or doesn't do with respect to the platform initialization.
> > > >
> > > > So it sounds like you are suggesting the following:
> > > >
> > > > a) Run the vbios as part of the host sbios to initialize stuff and
> > > > patch up the vbios
> > > >
> > > > b) Don't run the drm/amdgpu driver in the host, wait until the guest for that
> > > >
> > > > c) Avoid running the vbios (again) in the guest sbios since it has
> > > > already been run. Although since "the driver needs access to the vbios
> > > > image in the guest to get device specific configuration details" I
> > > > should still do '-device
> > > > vfio-pci,...,romfile=/path/to/vbios-that-was-fixed-up-by-host-sbios',
> > > > but should patch the guest sbios to avoid running the vbios again in
> > > > the guest?
> > > >
> > > > d) run the drm/amdgpu driver in the guest on the hardware that was
> > > > initialized in the host sbios
> > > >
> > > > Am I getting this right?
> > > >
> > >
> > > I would suggest starting with a standard sbios/vbios.  Boot the system
> > > as usual.  You can blacklist amdgpu if you don't want gpu driver
> > > loaded on the host.  Get a copy of the vbios image.  Depending on the
> > > platform, it may be available via the standard vbios shadow location
> > > or it may be at the start of carveout or it may be available via an
> > > ACPI interface.  The driver has the logic to fetch it.  You can dump a
> > > copy from the driver via /sys/kernel/debug/dri/X/amdgpu_vbios where X
> > > is the number of the dri device of the card in question.  Then start
> >
> > I've been able to retrieve the vbios from
> > /sys/kernel/debug/dri/X/amdgpu_vbios as well as
> > /sys/devices/pciXXXX:XX/XXXX:XX:XX.X/rom (although they do differ in
> > 46 different byte locations, not sure if this is expected). In both
> > cases, passing the vbios through to the VM results in SeaBIOS hanging
> > while executing the vbios:
> >
> > Found 2 cpu(s) max supported 2 cpu(s)
> > Copying PIR from 0x7ffbfc60 to 0x000f5e00
> > Copying MPTABLE from 0x00006e60/7ffa34c0 to 0x000f5cf0
> > Copying SMBIOS entry point from 0x00006e60 to 0x000f5b10
> > Scan for VGA option rom
> > Running option rom at c000:0003
> > HANGS HERE INDEFINITELY <--
>
> Sorry, misunderstood what you were asking.  There is no need to
> execute the vbios in the VM.  The driver just needs access to it.

Ah ok makes sense. Unfortunately if I patch SeaBIOS to not run the
vbios in the VM (and thus avoid hanging) then amdgpu fails with these
errors:

[    0.827814] [drm] BIOS signature incorrect 0 0
[    0.835529] amdgpu 0000:00:02.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
[    0.858874] [drm] BIOS signature incorrect 0 0
[    0.863678] [drm:amdgpu_get_bios] *ERROR* Unable to locate a BIOS ROM
[    0.873165] amdgpu 0000:00:02.0: Fatal error during GPU init
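
For reference, the "expecting 0xaa55" in the log above is the two-byte
signature at the start of a PCI option ROM image (bytes 0x55 0xAA in the
file). A minimal sketch in Python for checking that a dumped image at
least starts with that signature before passing it via romfile=. The
function names are hypothetical and nothing beyond the first two bytes
is checked:

```python
import struct

PCI_ROM_SIGNATURE = 0xAA55  # stored little-endian, i.e. bytes 0x55 0xAA


def has_pci_rom_signature(data: bytes) -> bool:
    """Return True if the buffer starts with the PCI option ROM signature."""
    if len(data) < 2:
        return False
    (signature,) = struct.unpack_from("<H", data, 0)
    return signature == PCI_ROM_SIGNATURE


def check_rom_file(path: str) -> bool:
    """Read the first two bytes of a dumped ROM file and check the signature."""
    with open(path, "rb") as f:
        return has_pci_rom_signature(f.read(2))
```

An all-zero read like the one in the log fails this check. If the file
itself passes on the host, that would suggest the problem is in how the
ROM reaches the guest rather than in the image, though that is only a
guess from the log.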

In order to avoid running the vbios in SeaBIOS I've been commenting
out this function:
https://github.com/coreboot/seabios/blob/642db1905ab133007d7427b6758a2103fb09a19a/src/post.c#L210.
Is this how you envisioned not running the vbios in the VM? Seems like
there's probably stuff in vgarom_setup() (here
https://github.com/coreboot/seabios/blob/642db1905ab133007d7427b6758a2103fb09a19a/src/optionroms.c#L405)
that needs to be run even if we don't want to actually execute the
vbios. Either way, I don't imagine hacking on SeaBIOS is what you were
suggesting, but not sure?

>
> >
> > Which is why I was asking about not running the vbios in the guest
> > SeaBIOS. Granted, the sbios I've been running in the host is Chrome OS
> > coreboot firmware -- so probably not what you mean by "standard
> > sbios"? You think there is a good chance running coreboot instead of
> > the "standard sbios" is what is causing things to fail?
> >
>
> It's not clear to me what exactly your environment is.  Are you
> running a custom coreboot you built yourself or is this a stoney
> chromebook?  My point was just to start with the sbios/vbios that
> shipped with the system rather than some customized thing.

It's a Stoney Chromebook, the "Acer Chromebook 315" (a.k.a. "aleena"), with
"7th Generation AMD A6-9220C APU with Radeon™ R5 Graphics". I'm
running the regular Chrome OS firmware with the exception that I've
enabled IOMMUs (like here:
https://chromium-review.googlesource.com/1261291).

>
> Alex
>
> > > VM with the device passed through.  Don't worry about running the
> > > vbios or not in the host vs. the guest.  The driver will do the right
> > > thing.
> > >
> > > Alex
> > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > Note that the driver needs access to the vbios image in the guest to
> > > > > > > > > get device specific configuration details (clocks, display connector
> > > > > > > > > configuration, etc.).
> > > > > > > >
> > > > > > > > Is there anything I need to do to ensure this besides passing '-device
> > > > > > > > vfio-pci,...,romfile=/path/to/vgarom' to qemu?
> > > > > > >
> > > > > > > You need the actual vbios rom image from your system.  The image is
> > > > > > > board specific.
> > > > > >
> > > > > > I should have the correct vbios rom image for my board. I'm extracting
> > > > > > it from the firmware image (that works for regular graphics init
> > > > > > without this VM stuff) for the board at build time (rather than
> > > > > > grabbing it from /sys/devices/pci... at runtime), so it shouldn't be
> > > > > > modified or corrupted in any way.
> > > > >
> > > > > The vbios image is patched at boot time by the sbios image for run
> > > > > time configuration stuff.  For example, some of the pcie lanes are
> > > > > shared with display lanes and can be used for either display or pcie
> > > > > add in cards.  The sbios determines this at boot and patches the vbios
> > > > > display tables so the driver knows that the displays are not
> > > > > available.  Also things like flat panels on laptops.  OEMs may have
> > > > > several different flat panel models they use with a particular
> > > > > platform and the sbios patches the vbios display tables with the
> > > > > proper parameters for the panel in use.  The sbios also patches tables
> > > > > related to bandwidth.  E.g., the type and speed and number of channels
> > > > > of the system ram so that the GPU driver can set proper limits on
> > > > > things like display modes.  So you need to use the vbios image that is
> > > > > provided by the sbios at boot.
> > > >
> > > > Ok yeah good point. I noticed that there are a few hundred byte
> > > > locations in the vbios that are different between the pre-boot version
> > > > from build time and the version dumped from sysfs at runtime, so
> > > > something must be getting patched up.
> > > >
> > > > >
> > > > > Alex
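
The byte-level differences mentioned in the thread (between the
build-time vbios and the runtime dumps, and between the two runtime
dumps) can be counted with a short script. This is only a sketch: the
function names are hypothetical, and it compares the images over their
common length without interpreting any of the vbios tables:

```python
from typing import List


def differing_offsets(a: bytes, b: bytes) -> List[int]:
    """Return the offsets at which two images differ, over their common length."""
    common = min(len(a), len(b))
    return [i for i in range(common) if a[i] != b[i]]


def compare_images(path_a: str, path_b: str) -> None:
    """Print the sizes of two image files and where their bytes differ."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    offsets = differing_offsets(a, b)
    print(f"sizes: {len(a)} vs {len(b)} bytes")
    print(f"differing byte locations: {len(offsets)}")
    # Show only the first few differences to keep the output short.
    for off in offsets[:16]:
        print(f"  0x{off:06x}: {a[off]:02x} -> {b[off]:02x}")
```

Calling compare_images() on two dumps prints the count and the first few
differing offsets, which makes it easier to see whether the differences
cluster in one region of the image.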