[Intel-gfx] [PATCH v2 00/22] Enable gpu switching on the MacBook Pro

Wed Aug 12 23:50:47 PDT 2015

On Thu, Aug 13, 2015 at 1:37 AM, Lukas Wunner <lukas at wunner.de> wrote:
> Hi Daniel,
>
> On Wed, Aug 12, 2015 at 04:16:25PM +0200, Daniel Vetter wrote:
>> > * Reprobing if the inactive GPU initializes before the apple-gmux module:
>> >   v1 used Matthew Garrett's approach of adding a driver callback.
>> >   v2 simply generates a hotplug event instead. nouveau polls its outputs
>> >   every 10 seconds so we want it to poll immediately once apple-gmux
>> >   registers. That is achieved by the hotplug event. The i915 driver is
>> >   changed to behave identically to nouveau. (Right now it deletes LVDS
>> >   and eDP connectors from the mode configuration if they can't be probed,
>> >   deeming them to be ghosts.)
>>
>> I thought -EDEFERREDPROBE is what we should be using if drivers don't get
>> loaded in the right order? Hand-rolling depency avoidance stuff is imo a
>> horrible idea.
> [...]
>> I think just reading edid and the relevant dp aux data in apple-gmux or
>> somewhere like that and stalling driver load until that's ready is the
>> only clean option.
>
> I'm afraid we can't stall initialization of a driver like that because
> even though the GPU may not be switched to the panel, it may still have
> an external monitor attached.
>
> All MacBook Pros have external DP and/or HDMI ports and these are
> either soldered to the discrete GPU (model year 2011 and onwards)
> or switchable between the discrete and integrated GPU (until 2010;
> I think they are even switchable separately from the panel).
>
> So basically we'd have to initialize the driver normally, and if
> intel_lvds_init() or intel_edp_init_connector() fail we'd have to
> somehow pass that up the call chain so that i915_pci_probe() can
> return -EPROBE_DEFER. And whenever we're asked to reprobe we'd
> repeat initialization of the LVDS or eDP connector.
>
> I'm wondering what the benefit is compared to just keeping the
> connector in the mode configuration, but with status disconnected,
> and reprobing it when the ->output_poll_changed callback gets invoked?
> Because that's what nouveau already does, and what I've changed i915
> to do with patch 13.
>
> vga_switcheroo calls drm_kms_helper_hotplug_event() when the handler
> registers (patch 11), which will invoke ->output_poll_changed.
> So we're talking about the Official DRM Callback [tm] to probe
> outputs, not "hand-rolling depency avoidance". :-)

Oh I didn't spot that one. This kind of layering inversions generally
leads to deadlocks and fun stuff. Also reprobing lvds/edp is just a
side-effect when you have fbdev emulation enabled. If we go with this
re-probing approach then we definitely need a new hook in
vga-switcheroo, and even then there's still the locking problem.

>> > * Framebuffer recreation if the inactive GPU initializes before the
>> >   apple-gmux module (i.e. discarding the default 1024x768 fb and replacing
>> >   with a new one with the actual panel resolution): v1 only supported this
>> >   for i915, v2 has a generic solution which works with nouveau and radeon
>> >   as well. (Necessary if the discrete GPU is forced to be the inactive one
>> >   on boot via the EFI variable.)
>>
>> Would completely remove the need for this ;-)
>
> Unfortunately not: We'd still have to initialize the driver to be able
> to drive external displays. If there are initially no connectors with
> modes, we'll once again end up with the 1024x768 fb.

EDEFERREDPROBE isn't something that gets returned to userspace, it's
just internal handling so that the kernel knows there's a depency
issue and it needs to retry probing once other drivers have finished
loading. It is the generic means linux has to handle cross-driver
depencies which aren't reflected in the bus hierarchy.

I.e. it's just something to make sure that apple-gmux is fully loaded
before i915/nouveau. The driver _will_ be initialized eventually.

>> You can't share the dp aux like that. It's true that we need a bit more
>> data (there's a few eDP related feature blocsk we need), but sharing the
>> aux channel entirely is no-go. If you do you get drivers trying to link
>> train and at best this fails and at worst you'll upset the configuration
>> of the other driver and piss of the panel enough to need a hard reset
>> until it works again.
>
> Yes, so far proxying of the AUX channel is read-only. In those cases
> when writing is necessary for setting up the output, I'm adding code
> to check if the current content of the DPCD is identical to what's
> being written and if so, skip the write. We'll see if this stategy is
> sufficient for the drivers to set up their outputs.

You need to block anything that would write _much_ earlier. By the
time we're doing link-training it's way too late.

>> I think the real tricky bit here with vgaswitcheroo is locking, I need to
>> take a separate lock at the patches for that.
>
> Locking when switching only the DDC lines is facilitated by the ddc_lock
> attribute of struct vgasr_priv. This is all local to vga_switcheroo.c
> and contained in patches 5 and 6.
>
> Locking when proxying the AUX channel is facilitated by the hw_mutex
> attribute of struct drm_dp_aux. nouveau has its own locking mechanism
> contained in drivers/gpu/drm/nouveau/nvkm/subdev/i2c/pad*.c. Thus,
> when proxying via nouveau, there are two locking mechanisms at work
> (drm_dp_aux hw_mutex as outer lock + pad as inner lock). This is
> nothing introduced by this patch set, all existing code.
>
> Locking of access to the struct vgasr_priv is facilitated by the
> vgasr_mutex in vga_switcheroo.c. Also existing code.

Making sure everything is protected is the easy part of locking
review. Making sure you can't deadlock is the hard part, and the
mutex_trylock game looks like that's a problem not handled correctly.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch