<div dir="auto">I have an RTX 3070 and a 3090, and I am absolutely sure I am binding vfio-pci to the 3090 and not the 3070.<div dir="auto"><br></div><div dir="auto">I have bound the driver in two different ways: first by passing the IDs to the module, and alternatively by manipulating the sysfs interface and using the override (this is what I originally had to do when I used two 1080s, so I know it works).</div><div dir="auto"><br></div><div dir="auto">While the 3090 doesn't show a console, there's a remnant from rEFInd (and GRUB previously) there.<br><div dir="auto"><br></div><div dir="auto">The assessment Alex made previously, that aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB) instead of the device, seems correct, but it could also be a quirk of how EFIFB is implemented. I recall reading a long time ago that EFIFB is a special device and that once it detects changes it simply gives up. There was also no way to attach a device to it again, as it depends on being preloaded outside the kernel; once something takes over the buffer, reinitializing is "impossible". I never went deeper to try to understand it.</div></div><br><br><div class="gmail_quote" dir="auto"><div dir="ltr" class="gmail_attr">On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann &lt;<a href="mailto:tzimmermann@suse.de">tzimmermann@suse.de</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi<br>
<br>
Am 05.12.22 um 01:51 schrieb Alex Williamson:<br>
> On Sat, 3 Dec 2022 17:12:38 -0700<br>
> "mb@lab.how" &lt;mb@lab.how&gt; wrote:<br>
> <br>
>> Hi,<br>
>><br>
>> I hope it is ok to reply to this old thread.<br>
> <br>
> It is, but the only relic of the thread is the subject. For reference,<br>
> the latest version of this posted is here:<br>
> <br>
> <a href="https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@suse.de/" rel="noreferrer noreferrer" target="_blank">https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@suse.de/</a><br>
> <br>
> Which is committed as:<br>
> <br>
> d17378062079 ("vfio/pci: Remove console drivers")<br>
> <br>
>> Unfortunately, I found a<br>
>> problem only now after upgrading to 6.0.<br>
>><br>
>> My setup has multiple GPUs (2), and I depend on EFIFB to have a working console.<br>
<br>
Which GPUs do you have?<br>
<br>
>> pre-patch behavior, when I bind the vfio-pci to my secondary GPU both<br>
>> the passthrough and the EFIFB keep working fine.<br>
>> post-patch behavior, when I bind the vfio-pci to the secondary GPU,<br>
>> the EFIFB disappears from the system, binding the console to the<br>
>> "dummy console".<br>
<br>
The efifb would likely use the first GPU. And vfio-pci should only <br>
remove the generic driver from the second device. Are you sure that <br>
you're not somehow using the first GPU with vfio-pci?<br>
<br>
>> Whenever you try to access the terminal, you have the screen stuck in<br>
>> whatever was the last buffer content, which gives the impression of<br>
>> "freezing," but I can still type.<br>
>> Everything else works, including the passthrough.<br>
> <br>
> This sounds like the call to aperture_remove_conflicting_pci_devices()<br>
> is removing the conflicting driver itself rather than removing the<br>
> device from the driver. Is it not possible to unbind the GPU from<br>
> efifb before binding the GPU to vfio-pci to effectively nullify the<br>
> added call?<br>
> <br>
>> I can only think about a few options:<br>
>><br>
>> - Is there a way to have EFIFB show up again? After all it looks like<br>
>> the kernel has just abandoned it, but the buffer is still there. I<br>
>> can't find a single message about the secondary card and EFIFB in<br>
>> dmesg, but there's a message for the primary card and EFIFB.<br>
>> - Can we have a boolean controlling the behavior of vfio-pci<br>
>> altogether or at least controlling the behavior of vfio-pci for that<br>
>> specific ID? I know there's already some option for vfio-pci and VGA<br>
>> cards, would it be appropriate to attach this behavior to that option?<br>
> <br>
> I suppose we could have an opt-out module option on vfio-pci to skip<br>
> the above call, but clearly it would be better if things worked by<br>
> default. We cannot make full use of GPUs with vfio-pci if they're<br>
> still in use by host console drivers. The intention was certainly to<br>
> unbind the device from any low level drivers rather than disable use of<br>
> a console driver entirely. DRM/GPU folks, is that possibly an<br>
> interface we could implement? Thanks,<br>
<br>
When vfio-pci gives the GPU device to the guest, which driver is <br>
bound to it?<br>
<br>
Best regards<br>
Thomas<br>
<br>
> <br>
> Alex<br>
> <br>
<br>
-- <br>
Thomas Zimmermann<br>
Graphics Driver Developer<br>
SUSE Software Solutions Germany GmbH<br>
Maxfeldstr. 5, 90409 Nürnberg, Germany<br>
(HRB 36809, AG Nürnberg)<br>
Geschäftsführer: Ivo Totev<br><br>
</blockquote></div></div>
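<div dir="auto"><br></div><div dir="auto">For anyone following along, one way to answer the "which GPU is bound to what" questions above is to read the PCI sysfs attributes directly. The following is only a minimal sketch: the function name, the <code>sysfs_root</code> parameter, and the PCI address in the usage note are hypothetical and chosen for illustration. It assumes the usual Linux layout under /sys/bus/pci/devices, where <code>driver</code> is a symlink to the bound driver, <code>boot_vga</code> marks the firmware's primary VGA device, and <code>driver_override</code> holds any forced driver name.</div>

```python
import os
from pathlib import Path


def describe_pci_device(address, sysfs_root="/sys"):
    """Report the bound driver, boot_vga flag, and driver_override of a PCI device.

    `address` is a full PCI address such as "0000:0b:00.0" (hypothetical example).
    `sysfs_root` is an illustrative parameter so the function can be pointed
    at a fake tree; on a real system it stays at "/sys".
    """
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / address
    if not dev.is_dir():
        raise FileNotFoundError("no such PCI device in sysfs: " + address)

    def read_attr(name):
        # Attributes such as boot_vga only exist for some device classes.
        path = dev / name
        return path.read_text().strip() if path.is_file() else None

    # The "driver" entry is a symlink to the driver directory when bound.
    driver_link = dev / "driver"
    driver = None
    if driver_link.is_symlink():
        driver = os.path.basename(os.readlink(driver_link))

    # An unset override is assumed to read back as an empty string or "(null)".
    override = read_attr("driver_override")
    if override in ("", "(null)"):
        override = None

    return {
        "address": address,
        "driver": driver,
        "boot_vga": read_attr("boot_vga") == "1",
        "driver_override": override,
    }
```

<div dir="auto">Calling <code>describe_pci_device("0000:0b:00.0")</code> for each GPU (substituting the real addresses from lspci) should show whether the device handed to vfio-pci is also the one flagged as <code>boot_vga</code>, which is typically the device the firmware framebuffer sits on.</div>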