How to handle disconnection of eDP panels due to dynamic display mux switches

Wed Apr 1 19:21:34 UTC 2020

On 4/1/20 1:46 AM, Daniel Vetter wrote:
> On Wed, Apr 1, 2020 at 3:58 AM Daniel Dadap <ddadap at nvidia.com> wrote:
>> On 3/31/20 2:32 AM, Daniel Vetter wrote:
>>> Since I see no mention of this anywhere in your mail ... have you
>>> tried looking at drivers/gpu/vga/vga_switcheroo.c? This also supports
>>> switching of just outputs, not just the old optimus way of switching
>>> out the entire gpu and having to disable the other one.
>>
>> We did look into vga-switcheroo while developing our PoC, but found it
>> insufficient for supporting the use case we had in mind in its current
>> form. Among the limitations we observed were that it didn't seem to be
>> possible to switch with output-level granularity, only with gpu-level
>> granularity, with the whole switched-away-to GPU being powered down.
>> Your description suggests that this is indeed possible, but if that's
>> the case, then the mechanism for doing so isn't obvious in what I can
>> see of the API from the kernel source code, even in the drm-next tree.
>> Do you have pointers to documentation on how the per-output switching is
>> supposed to work?
> I think the per-output switch isn't supported (yet), but otherwise there
> should be two modes:
> - power switch mode, other gpu has to be offlined. This also means
> switching needs to wait until all the clients are shut down.
> - output switching only. That seems to be roughly what you want,
> except you want to switch more than just the integrated panel. I think
> all models thus far simply wired all external outputs to the dgpu
> always.

Is there some tree with an early version of the output switching 
implemented that you can point to for testing?

> I primarily asked about vgaswitcheroo since you didn't mention it at all.

I had actually anticipated that vga-switcheroo would likely be 
suggested, and my first draft of my initial message had a lengthy 
explanation about why vga-switcheroo (in its current state) isn't 
suitable, but figured it would be better to simplify the problem 
statement and then discuss switcheroo's limitations if and when it came 
up in discussion.

>
>> Other limitations of vga-switcheroo included:
>>
>> * The can_switch() callbacks for all current vga-switcheroo-capable GPU
>> drivers don't seem to allow for a switch to occur while there are active
>> KMS clients. This has the effect of making it so that non-deferred
>> switches can only be initiated under the same circumstances that
>> deferred switches wait for.
>> * It's only possible to address one mux device. Some systems have
>> separate muxes for internal and external displays. From what I could see
>> in existing vga-switcheroo mux handlers, it seems that multi-muxed
>> systems just switch all of the muxes simultaneously, which makes sense
>> for the all-or-nothing "power down the GPU not in use" approach, but
>> might not be desirable for fine-grained output-level switching.
>> * On some systems, it's possible to put the panel into a self-refresh
>> mode before switching the mux and exit self-refresh mode after
>> switching, to hide any glitches that might otherwise appear while
>> transitioning. Additional handler callbacks for pre-switch and
>> post-switch operations would be useful.
>>
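To make the third point concrete, here is a rough, self-contained sketch of what pre-switch and post-switch handler callbacks might look like. This is plain C rather than kernel code, and every name in it (mux_handler_ops, pre_switch, post_switch, mux_switch, the demo_* helpers) is hypothetical; none of them exist in vga-switcheroo today. The demo callbacks only record the order of operations, standing in for entering and exiting panel self-refresh.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout: none of this is existing vga-switcheroo API. */
enum mux_owner { MUX_IGPU, MUX_DGPU };

struct mux_handler_ops {
	int (*pre_switch)(void *ctx);	/* optional, e.g. enter panel self-refresh */
	int (*switch_to)(void *ctx, enum mux_owner target);	/* required */
	int (*post_switch)(void *ctx);	/* optional, e.g. exit panel self-refresh */
};

/* Demo context: logs the order of operations so it can be inspected. */
struct demo_ctx {
	char log[64];
	enum mux_owner owner;
};

static int demo_pre(void *p)
{
	struct demo_ctx *c = p;

	strcat(c->log, "psr-enter;");
	return 0;
}

static int demo_switch(void *p, enum mux_owner target)
{
	struct demo_ctx *c = p;

	c->owner = target;
	strcat(c->log, "mux;");
	return 0;
}

static int demo_post(void *p)
{
	struct demo_ctx *c = p;

	strcat(c->log, "psr-exit;");
	return 0;
}

static const struct mux_handler_ops demo_ops = {
	.pre_switch = demo_pre,
	.switch_to = demo_switch,
	.post_switch = demo_post,
};

/* Run the optional pre hook, the mux switch, then the optional post hook. */
static int mux_switch(const struct mux_handler_ops *ops, void *ctx,
		      enum mux_owner target)
{
	int ret;

	assert(ops && ops->switch_to);

	if (ops->pre_switch) {
		ret = ops->pre_switch(ctx);
		if (ret)
			return ret;	/* panel not prepared: do not switch */
	}

	ret = ops->switch_to(ctx, target);

	/* Always try to leave self-refresh, even if the mux switch failed. */
	if (ops->post_switch) {
		int post = ops->post_switch(ctx);

		if (!ret)
			ret = post;
	}
	return ret;
}
```

Calling mux_switch(&demo_ops, &ctx, MUX_DGPU) on a zero-initialized demo_ctx runs the three steps in order. Whether the post hook should still run after a failed switch is a design choice I made for this sketch, not something the hardware dictates.
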
>> If vga-switcheroo could be updated to address these limitations, then it
>> could make sense to handle the display disconnect/connect notifications
>> and display refreshing as part of a vga_switcheroo client driver's
>> set_gpu_state() callback (or a finer-grained per-output callback);
>> however, it also seems that it should be possible to implement APIs
>> within the DRM subsystem to accomplish the desired functionality
>> independently of current or future vga-switcheroo design.
> vgaswitcheroo was written by Dave Airlie, doesn't get much more
> "within the gpu subsystem". I think we should look into extending
> vgaswitcheroo instead of inventing a completely new uapi. If we go
> with a simplified use-case of
> - only integrated panel switches
> - external output on the dgpu
> then that should match existing systems, so you get support on at
> least some desktops for free. Plus no headaches about userspace we
> need for merging new uapi. Once we have that (for your mux, and I
> guess with i915+nouveau all working) then we can look into how to
> extend it, or give the same underlying stuff an interface that's not
> in debugfs :-)

If output-based switching is already in development, have you addressed 
the "permanently connected" assumption for eDP displays as part of this?
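
To illustrate what I mean by addressing that assumption, here is a minimal sketch of the bookkeeping I have in mind. It is a plain C model, not kernel code: the types and functions (edp_output, edp_mux_notify, edp_train_link, and so on) are hypothetical stand-ins and do not correspond to existing DRM or i915 interfaces. The idea is that the driver is told when the panel is muxed away, refuses to touch the link while it is, keeps reporting the panel as connected to userspace, and retrains once when the panel comes back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the real DRM structures or enums. */
enum conn_status { CONN_CONNECTED, CONN_DISCONNECTED };

struct edp_output {
	bool muxed_away;	/* panel is currently routed to the other GPU */
	bool mode_active;	/* a mode was being driven before the switch */
	int link_trainings;	/* number of (simulated) link trainings done */
};

/* What DRI clients see: the panel stays "connected" across mux switches. */
static enum conn_status edp_status_for_userspace(const struct edp_output *o)
{
	(void)o;
	return CONN_CONNECTED;
}

/* Internal check: may the driver access the panel right now? */
static bool edp_can_access_panel(const struct edp_output *o)
{
	return !o->muxed_away;
}

/* Stand-in for link training; fails immediately while muxed away. */
static int edp_train_link(struct edp_output *o)
{
	if (!edp_can_access_panel(o))
		return -1;
	o->link_trainings++;
	return 0;
}

/* Notification from whatever component owns the mux. */
static void edp_mux_notify(struct edp_output *o, bool muxed_away)
{
	bool was_away;

	assert(o);
	was_away = o->muxed_away;
	o->muxed_away = muxed_away;

	/* On reconnect, resume the previously driven mode by retraining once. */
	if (was_away && !muxed_away && o->mode_active)
		edp_train_link(o);
}
```

In this model, a modeset or DPMS change that arrives while the panel is muxed away fails fast in edp_train_link() instead of retrying, and the single retrain on reconnect takes the place of the DPMS off/on cycle I currently trigger from userspace.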

> Also for your seamless switching with self-refresh this could be
> implemented without new userspace, greatly simplifying everything.
>
> Cheers, Daniel
>
>>> There's some rough corners (like the uapi doesn't exist, it's in
>>> debugfs), and the locking has an inversion problem (I have ideas), but
>>> generally what you want exists already.
>>> -Daniel
>>>
>>> On Mon, Mar 30, 2020 at 9:12 AM Daniel Dadap <ddadap at nvidia.com> wrote:
>>>> A number of hybrid GPU notebook computer designs with dual (integrated
>>>> plus discrete) GPUs are equipped with multiplexers (muxes) that allow
>>>> display panels to be driven by either the integrated GPU or the discrete
>>>> GPU. Typically, this is a selection that can be made at boot time as a
>>>> menu option in the system firmware's setup screen, and the mux selection
>>>> stays fixed for as long as the system is running and persists across
>>>> reboots until it is explicitly changed. However, some muxed hybrid GPU
>>>> systems have dynamically switchable muxes which can be switched while
>>>> the system is running.
>>>>
>>>> NVIDIA is exploring the possibility of taking advantage of dynamically
>>>> switchable muxes to enhance the experience of using a hybrid GPU system.
>>>> For example, on a system configured for PRIME render offloading, it may
>>>> be possible to keep the discrete GPU powered down and use the integrated
>>>> GPU for rendering and displaying the desktop when no applications are
>>>> using the discrete GPU, and dynamically switch the panel to be driven
>>>> directly by the discrete GPU when render-offloading a fullscreen
>>>> application.
>>>>
>>>> We have been conducting some experiments on systems with dynamic muxes,
>>>> and have found some limitations that would need to be addressed in order
>>>> to support use cases like the one suggested above:
>>>>
>>>> * In at least the i915 DRM-KMS driver, and likely in other DRM-KMS
>>>> drivers as well, eDP panels are assumed to be always connected. This
>>>> assumption is broken when the panel is muxed away, which can cause
>>>> problems. A typical symptom is i915 repeatedly attempting to retrain the
>>>> link, severely impacting system performance and printing messages like
>>>> the following every five seconds or so:
>>>>
>>>> [drm:intel_dp_start_link_train [i915]] *ERROR* failed to enable link
>>>> training
>>>> [drm] Reducing the compressed framebuffer size. This may lead to less
>>>> power savings than a non-reduced-size. Try to increase stolen memory
>>>> size if available in BIOS.
>>>>
>>>> This symptom might occur if something causes the DRM-KMS driver to probe
>>>> the display while it's muxed away, for example a modeset or DPMS state
>>>> change.
>>>>
>>>> * When switching the mux back to a GPU that was previously driving a
>>>> mode, it is necessary to at the very least retrain DP links to restore
>>>> the previously displayed image. In a proof of concept I have been
>>>> experimenting with, I am able to accomplish this from userspace by
>>>> triggering DPMS off and then back on again; however, it would be good to
>>>> have an in-kernel API to request that an output owned by a DRM-KMS
>>>> driver be refreshed to resume driving a mode on a disconnected and
>>>> reconnected display. This API would need to be accessible from outside
>>>> of the DRM-KMS driver handling the output. One reason it would be good
>>>> to do this within the kernel, rather than rely on e.g. DPMS operations
>>>> in the xf86-video-modesetting driver, is that it would be useful for
>>>> restoring the console if X crashes or is forcefully killed while the mux
>>>> is switched to a GPU other than the one which drives the console.
>>>>
>>>> Basically, we'd like to be able to do the following:
>>>>
>>>> 1) Communicate to a DRM-KMS driver that an output is disconnected and
>>>> can't be used. Ideally, DRI clients such as X should still see the
>>>> output as being connected, so user applications don't need to keep track
>>>> of the change.
>>>> 2) Request that a mode that was previously driven on a disconnected
>>>> output be driven again upon reconnection.
>>>>
>>>> If APIs to do the above are already available, I wasn't able to find
>>>> information about them. These could be handled as separate APIs, e.g.,
>>>> one to set connected/disconnected state and another to restore an
>>>> output, or as a single API, e.g., signal a disconnect or reconnect,
>>>> leaving it up to the driver receiving the signal to set the appropriate
>>>> internal state and restore the reconnected output. Another possibility
>>>> would be an API to disable and enable individual outputs from outside of
>>>> the DRM-KMS driver that owns them. I'm curious to hear the thoughts of
>>>> the DRM subsystem maintainers and contributors on what the best approach
>>>> to this would be.
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch