GPU passthrough support for Stoney [Radeon R2/R3/R4/R5 Graphics]?

Micah Morton mortonm at chromium.org
Thu May 23 19:18:21 UTC 2019


On Thu, May 23, 2019 at 9:55 AM Micah Morton <mortonm at chromium.org> wrote:
>
> On Wed, May 22, 2019 at 6:11 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> >
> > On Wed, May 22, 2019 at 7:00 PM Micah Morton <mortonm at chromium.org> wrote:
> > >
> > > On Wed, May 22, 2019 at 1:39 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > >
> > > > On Tue, May 21, 2019 at 1:46 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > >
> > > > > On Fri, May 17, 2019 at 9:59 AM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > >
> > > > > > On Fri, May 17, 2019 at 11:36 AM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > >
> > > > > > > On Thu, May 16, 2019 at 1:39 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, May 16, 2019 at 4:07 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, May 15, 2019 at 7:19 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, May 15, 2019 at 2:26 PM Micah Morton <mortonm at chromium.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi folks,
> > > > > > > > > > >
> > > > > > > > > > > I'm interested in running a VM on a system with an integrated Stoney
> > > > > > > > > > > [Radeon R2/R3/R4/R5 Graphics] card and passing through the graphics
> > > > > > > > > > > card to the VM using the IOMMU. I'm wondering whether this is feasible
> > > > > > > > > > > and supposed to be doable with the right setup (as opposed to passing
> > > > > > > > > > > a discrete GPU to the VM, which I think is definitely doable?).
> > > > > > > > > > >
> > > > > > > > > > > So far, I can do all the qemu/kvm/vfio/iommu stuff to run the VM and
> > > > > > > > > > > pass the integrated GPU to it, but the drm driver in the VM fails
> > > > > > > > > > > during amdgpu_device_init(). Specifically, the logs show the SMU being
> > > > > > > > > > > unresponsive, which leads to a 'SMU firmware load failed' error
> > > > > > > > > > > message and kernel panic. I can share VM logs and the invocation of
> > > > > > > > > > > qemu and such if helpful, but first wanted to know at a high level
> > > > > > > > > > > whether this should be feasible.
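(For concreteness, the kind of qemu invocation I mean is along these lines; the
host PCI address, machine type, memory size and paths are placeholders, not my
exact command line:

  $ qemu-system-x86_64 -enable-kvm -machine q35 -m 4G -smp 2 \
      -vga none \
      -device vfio-pci,host=0000:00:01.0,romfile=/path/to/vbios.rom \
      -drive file=/path/to/guest.img,format=raw

)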
> > > > > > > > > > >
> > > > > > > > > > > P.S.: I'm not initializing the GPU in the host bios or host kernel at
> > > > > > > > > > > all, so I should be passing a fresh GPU to the VM. Also, I'm pretty
> > > > > > > > > > > sure I'm running the correct VGA bios for this GPU in the guest VM
> > > > > > > > > > > bios before guest boot.
> > > > > > > > > > >
> > > > > > > > > > > Any comments/suggestions would be appreciated!
> > > > > > > > > >
> > > > > > > > > > It should work at least once as long as your vm is properly set up.
> > > > > > > > >
> > > > > > > > > Is there any reason running coreboot vs UEFI at host boot would make a
> > > > > > > > > difference? I was running a modified version of coreboot that avoids
> > > > > > > > > doing any GPU initialization in firmware -- so the first POST happens
> > > > > > > > > inside the guest.
> > > > > > > >
> > > > > > > > The GPU on APUs shares a bunch of resources with the CPU.  There are a
> > > > > > > > bunch of blocks which are shared and need to be initialized on both
> > > > > > > > for everything to work properly.
> > > > > > >
> > > > > > > Interesting. So skipping running the vbios in the host and waiting
> > > > > > > until running it for the first time in the guest SeaBIOS is a bad
> > > > > > > idea? Would it be better to let APU+CPU initialize normally in the
> > > > > > > host and then skip trying to run the vbios in guest SeaBIOS and just
> > > > > > > do some kind of reset before the drm driver starts accessing it from
> > > > > > > the guest?
> > > > > >
> > > > > > If you let the sbios initialize things, it should work.  The driver
> > > > > > will do the right thing to init the card when it loads whether it's
> > > > > > running on bare metal or in a VM.  We've never tested any scenarios
> > > > > > where the GPU on APUs is not handled by the sbios.  Note that the GPU
> > > > > > does not have to be posted per se, it just needs to have been properly
> > > > > > taken into account when the sbios comes up so that shared components
> > > > > > are initialized correctly.  I don't know what your patched system does
> > > > > > or doesn't do with respect to the platform initialization.
> > > > >
> > > > > So it sounds like you are suggesting the following:
> > > > >
> > > > > a) Run the vbios as part of the host sbios to initialize stuff and
> > > > > patch up the vbios
> > > > >
> > > > > b) Don't run the drm/amdgpu driver in the host, wait until the guest for that
> > > > >
> > > > > c) Avoid running the vbios (again) in the guest sbios since it has
> > > > > already been run. Although since "the driver needs access to the vbios
> > > > > image in the guest to get device specific configuration details" I
> > > > > should still do '-device
> > > > > vfio-pci,...,romfile=/path/to/vbios-that-was-fixed-up-by-host-sbios',
> > > > > but should patch the guest sbios to avoid running the vbios again in
> > > > > the guest?
> > > > >
> > > > > d) run the drm/amdgpu driver in the guest on the hardware that was
> > > > > initialized in the host sbios
> > > > >
> > > > > Am I getting this right?
> > > > >
> > > >
> > > > I would suggest starting with a standard sbios/vbios.  Boot the system
> > > > as usual.  You can blacklist amdgpu if you don't want gpu driver
> > > > loaded on the host.  Get a copy of the vbios image.  Depending on the
> > > > platform, it may be available via the standard vbios shadow location
> > > > or it may be at the start of carveout or it may be available via an
> > > > ACPI interface.  The driver has the logic to fetch it.  You can dump a
> > > > copy from the driver via /sys/kernel/debug/dri/X/amdgpu_vbios where X
> > > > is the number of the dri device of the card in question.  Then start
> > >
> > > I've been able to retrieve the vbios from
> > > /sys/kernel/debug/dri/X/amdgpu_vbios as well as
> > > /sys/devices/pciXXXX:XX/XXXX:XX:XX.X/rom (although they differ at 46
> > > byte locations; not sure if this is expected). In both
> > > cases, passing the vbios through to the VM results in SeaBIOS hanging
> > > while executing the vbios:
> > >
> > > Found 2 cpu(s) max supported 2 cpu(s)
> > > Copying PIR from 0x7ffbfc60 to 0x000f5e00
> > > Copying MPTABLE from 0x00006e60/7ffa34c0 to 0x000f5cf0
> > > Copying SMBIOS entry point from 0x00006e60 to 0x000f5b10
> > > Scan for VGA option rom
> > > Running option rom at c000:0003
> > > HANGS HERE INDEFINITELY <--
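(For reference, dumping the two copies looks roughly like this on the host; X
and the pciXXXX... path are placeholders for the dri index and the GPU's sysfs
node:

  # debugfs copy (only exists once amdgpu has fetched the image)
  $ cat /sys/kernel/debug/dri/X/amdgpu_vbios > vbios-debugfs.rom

  # PCI ROM BAR copy (the rom file must be enabled before it can be read)
  $ echo 1 > /sys/devices/pciXXXX:XX/XXXX:XX:XX.X/rom
  $ cat /sys/devices/pciXXXX:XX/XXXX:XX:XX.X/rom > vbios-pcirom.rom
  $ echo 0 > /sys/devices/pciXXXX:XX/XXXX:XX:XX.X/rom

)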
> >
> > Sorry, misunderstood what you were asking.  There is no need to
> > execute the vbios in the VM.  The driver just needs access to it.
>
> Ah ok makes sense. Unfortunately if I patch SeaBIOS to not run the
> vbios in the VM (and thus avoid hanging) then amdgpu fails with these
> errors:
>
> [    0.827814] [drm] BIOS signature incorrect 0 0
> [    0.835529] amdgpu 0000:00:02.0: Invalid PCI ROM header signature:
> expecting 0xaa55, got 0x0000
> [    0.858874] [drm] BIOS signature incorrect 0 0
> [    0.863678] [drm:amdgpu_get_bios] *ERROR* Unable to locate a BIOS ROM
> [    0.873165] amdgpu 0000:00:02.0: Fatal error during GPU init
>
> In order to avoid running the vbios in SeaBIOS I've been commenting
> out this function:
> https://github.com/coreboot/seabios/blob/642db1905ab133007d7427b6758a2103fb09a19a/src/post.c#L210.
> Is this how you envisioned not running the vbios in the VM? Seems like
> there's probably stuff in vgarom_setup() (here
> https://github.com/coreboot/seabios/blob/642db1905ab133007d7427b6758a2103fb09a19a/src/optionroms.c#L405)
> that needs to be run even if we don't want to actually execute the
> vbios. Either way, I don't imagine hacking on SeaBIOS is what you were
> suggesting, but not sure?

I can actually get the guest to boot with this workflow if I just
comment out this line
(https://github.com/coreboot/seabios/blob/642db1905ab133007d7427b6758a2103fb09a19a/src/optionroms.c#L141)
rather than all of vgarom_setup(). In that case booting the guest
leads to more progress in the amdgpu driver (looks like it makes it to
smu8_start_smu() before failing):

[    2.040452] smu version 33.09.00
[    2.065729] random: fast init done
[    2.593909] ------------[ cut here ]------------
[    2.602202] smu8_send_msg_to_smc_with_parameter(0x0254, 0x0) timed out after 548628 us

followed by

[    4.068870] SMU firmware load failed
[    4.648983] SMU check loaded firmware failed.

and a crash not too long after.
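For what it's worth, one way to tell whether the romfile is visible to the
guest at all (independent of whether SeaBIOS executes it), and so to separate
the earlier "Invalid PCI ROM header signature" failure from this SMU one, is
to look for the 55 aa option ROM signature from inside the guest. A minimal
sketch, assuming the passed-through device shows up at 00:02.0 as in the logs
above:

  $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/rom
  $ xxd -l 16 /sys/bus/pci/devices/0000:00:02.0/rom   # should start with 55 aa
  $ echo 0 > /sys/bus/pci/devices/0000:00:02.0/rom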

>
> >
> > >
> > > Which is why I was asking about not running the vbios in the guest
> > > SeaBIOS. Granted, the sbios I've been running in the host is Chrome OS
> > > coreboot firmware -- so probably not what you mean by "standard
> > > sbios"? You think there is a good chance running coreboot instead of
> > > the "standard sbios" is what is causing things to fail?
> > >
> >
> > It's not clear to me what exactly your environment is.  Are you
> > running a custom coreboot you built yourself or is this a stoney
> > chromebook?  My point was just to start with the sbios/vbios that
> > shipped with the system rather than some customized thing.
>
> It's a stoney chromebook "Acer Chromebook 315" (a.k.a. "aleena") with
> "7th Generation AMD A6-9220C APU with Radeon™ R5 Graphics". I'm
> running the regular Chrome OS firmware with the exception that I've
> enabled IOMMUs (like here:
> https://chromium-review.googlesource.com/1261291).
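(In case it helps, a quick way to confirm on the host that the IOMMU actually
came up, and to see which group the GPU landed in; 0000:00:01.0 is an
assumption about where the IGP sits, so adjust to whatever lspci reports:

  $ dmesg | grep -i -e AMD-Vi -e iommu
  $ readlink /sys/bus/pci/devices/0000:00:01.0/iommu_group
  $ ls /sys/kernel/iommu_groups/

)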
>
> >
> > Alex
> >
> > > > VM with the device passed through.  Don't worry about running the
> > > > vbios or not in the host vs. the guest.  The driver will do the right
> > > > thing.
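(For anyone following along, the host-side handoff of the IGP from amdgpu to
vfio-pci amounts to roughly the following; again, 0000:00:01.0 is an assumption
about where the Stoney GPU sits on the host, so check lspci first:

  $ modprobe vfio-pci
  $ echo vfio-pci > /sys/bus/pci/devices/0000:00:01.0/driver_override
  # skip the unbind if amdgpu was blacklisted and never bound
  $ echo 0000:00:01.0 > /sys/bus/pci/drivers/amdgpu/unbind
  $ echo 0000:00:01.0 > /sys/bus/pci/drivers_probe

)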
> > > >
> > > > Alex
> > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Note that the driver needs access to the vbios image in the guest to
> > > > > > > > > > get device specific configuration details (clocks, display connector
> > > > > > > > > > configuration, etc.).
> > > > > > > > >
> > > > > > > > > Is there anything I need to do to ensure this besides passing '-device
> > > > > > > > > vfio-pci,...,romfile=/path/to/vgarom' to qemu?
> > > > > > > >
> > > > > > > > You need the actual vbios rom image from your system.  The image is
> > > > > > > > board specific.
> > > > > > >
> > > > > > > I should have the correct vbios rom image for my board. I'm extracting
> > > > > > > it from the firmware image (that works for regular graphics init
> > > > > > > without this VM stuff) for the board at build time (rather than
> > > > > > > grabbing it from /sys/devices/pci... at runtime), so it shouldn't be
> > > > > > > modified or corrupted in any way.
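(Concretely, pulling the build-time copy out of a coreboot image can be done
with cbfstool along these lines; the image name and the CBFS entry name are
guesses for a Stoney board (vendor 1002, device 98e4), and 'cbfstool ... print'
shows the real names:

  $ cbfstool image-aleena.bin print
  $ cbfstool image-aleena.bin extract -n pci1002,98e4.rom -f vbios-buildtime.rom

)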
> > > > > >
> > > > > > The vbios image is patched at boot time by the sbios image for run
> > > > > > time configuration stuff.  For example, some of the pcie lanes are
> > > > > > shared with display lanes and can be used for either display or pcie
> > > > > > add in cards.  The sbios determines this at boot and patches the vbios
> > > > > > display tables so the driver knows that the displays are not
> > > > > > available.  Also things like flat panels on laptops.  OEMs may have
> > > > > > several different flat panel models they use with a particular
> > > > > > platform and the sbios patches the vbios display tables with the
> > > > > > proper parameters for the panel in use.  The sbios also patches tables
> > > > > > related to bandwidth.  E.g., the type and speed and number of channels
> > > > > > of the system ram so that the GPU driver can set proper limits on
> > > > > > things like display modes.  So you need to use the vbios image that is
> > > > > > provided by the sbios at boot.
> > > > >
> > > > > Ok yeah good point. I noticed that there are a few hundred byte
> > > > > locations in the vbios that are different between the pre-boot version
> > > > > from build time and the version dumped from sysfs at runtime, so
> > > > > something must be getting patched up.
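(Assuming the two images are the same length, something like

  $ cmp -l vbios-buildtime.rom vbios-runtime.rom | wc -l

is a quick way to count the byte offsets at which they differ.)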
> > > > >
> > > > > >
> > > > > > Alex

