Regression: DDC I2C Display Freezing for internal displays
Felix Richter
judge at felixrichter.tech
Tue Apr 22 19:44:13 UTC 2025
Hi,
it has been quite a while since I first started experiencing the
bug I am about to describe. Suffice it to say that during my
Easter holiday I finally had the time to dig into it. It all started
with an update of the Linux LTS kernel from 6.6 to 6.12.
I am a user of the sway tiling window manager and have written a small
utility to manage my display configuration across different setups, with
the added twist that I wrote some code to determine which monitor input
is currently in use via the monitor command interface. The
interesting detail here is that, starting with kernel 6.12, I began
running into the following problem: with my display management daemon
running, attaching my laptop to an external display would freeze my
internal display, with no way to bring it back apart from power
cycling the entire device. When my management daemon was not running
this did not happen, but I then had to configure my display setup
manually. Further investigation into what triggers the
display freeze led me to the part of the code where I enumerate
attached displays and try to match `i2c` devices to their
corresponding display.
To be more specific, the procedure is as follows: using udev, enumerate
all `i2c` buses and filter them based on some heuristics, such as the
device name and whether the parent device is a drm / graphics device.
Sadly this is not quite enough to match an `i2c` command interface to the
corresponding monitor. In many cases it is necessary to manually read the
EDID information via the i2c interface and compare it to the known
attached displays to get the match. And this is where the trigger for
the display freeze is to be found.
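
To illustrate the comparison step, here is a minimal Python sketch (my
utility itself is not written this way). It assumes the EDID bytes read
from an `i2c` bus and the EDID of each known connector are already in
memory. The names `edid_is_valid` and `match_connector` are hypothetical
and exist only for this example:

```python
from typing import Dict, Optional

# Every base EDID block starts with this fixed 8-byte header.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])
EDID_BLOCK_SIZE = 128


def edid_is_valid(block: bytes) -> bool:
    """Check the fixed header and the checksum of a base EDID block."""
    if len(block) < EDID_BLOCK_SIZE:
        return False
    base = block[:EDID_BLOCK_SIZE]
    # All 128 bytes of the base block must sum to 0 modulo 256.
    return base[:8] == EDID_HEADER and sum(base) % 256 == 0


def match_connector(bus_edid: bytes,
                    connector_edids: Dict[str, bytes]) -> Optional[str]:
    """Return the connector whose base EDID block equals the one read
    from the bus, or None if the read failed or nothing matches."""
    if not edid_is_valid(bus_edid):
        return None
    for name, edid in connector_edids.items():
        if edid[:EDID_BLOCK_SIZE] == bus_edid[:EDID_BLOCK_SIZE]:
            return name
    return None
```

Validating the block first means a bus that returns garbage or nothing
is simply reported as "no match" rather than matched by accident.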
Here is the output when scanning sysfs for my internal laptop display:
```
# ls -al /sys/devices/pci0000:00/0000:00:08.1/0000:04:00.0/drm/card1/card1-eDP-1
total 0
drwxr-xr-x 6 root root 0 22. Apr 18:07 .
drwxr-xr-x 11 root root 0 22. Apr 18:07 ..
drwxr-xr-x 3 root root 0 22. Apr 18:07 amdgpu_bl1
-r--r--r-- 1 root root 4096 22. Apr 18:07 connector_id
lrwxrwxrwx 1 root root 0 22. Apr 18:07 ddc -> ../../../i2c-3
lrwxrwxrwx 1 root root 0 22. Apr 18:07 device -> ../../card1
-r--r--r-- 1 root root 4096 22. Apr 18:07 dpms
drwxr-xr-x 3 root root 0 22. Apr 18:07 drm_dp_aux0
-r--r--r-- 1 root root 0 22. Apr 18:07 edid
-r--r--r-- 1 root root 4096 22. Apr 18:07 enabled
drwxr-xr-x 4 root root 0 22. Apr 18:07 i2c-11
-r--r--r-- 1 root root 4096 22. Apr 18:07 modes
drwxr-xr-x 2 root root 0 22. Apr 18:07 power
-rw-r--r-- 1 root root 4096 22. Apr 18:07 status
lrwxrwxrwx 1 root root 0 22. Apr 18:07 subsystem -> ../../../../../../../class/drm
-rw-r--r-- 1 root root 4096 22. Apr 18:07 uevent
```
As can be seen, there are two i2c devices present: i2c-3 (as the ddc
symlink) and i2c-11. From the perspective of udev, i2c-11 has its parent
set to card1-eDP-1, while i2c-3 has its parent set to the drm device
itself. More importantly, I cannot rule out i2c-3 as a valid command
interface, because in some cases valid command channels are never
assigned to the corresponding display output but only live directly on
the drm device. This is especially true when monitors are not attached
directly but via a docking station. So I do have to look at each i2c
device on its own. The freeze is triggered by trying to read the EDID
from i2c-3. This is the code snippet I used to trigger the bug:
https://github.com/ju6ge/libmonitor/blob/918b2543eafb96aca29f66debc70fd18fa21ee11/examples/via-i2c-dev.rs
(with the target i2c interface adjusted accordingly). To be absolutely
clear, this is not an i2c device that is expected to work. Every time I
try this, on any kernel from 6.6 to 6.12, I get the following error
message:
DdcError(CommunicationError(ReceiveError(EIO: I/O error))). That is
expected, since internal laptop displays do not support the command
interface in most cases anyway. But what I do not expect to happen is
that my laptop screen freezes! And since this did not happen with kernel
6.6 but started happening with 6.12, this seems to be a software issue
and therefore a regression!
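
For reference, the way both i2c devices show up for one connector can be
sketched as follows. This is a rough Python illustration based only on
the sysfs layout shown above, not the code my utility uses, and the
function name `i2c_candidates` is made up for this example:

```python
import os
from typing import List


def i2c_candidates(connector_dir: str) -> List[str]:
    """Return the names of the i2c buses referenced by a DRM connector
    directory such as .../drm/card1/card1-eDP-1."""
    names = []
    ddc = os.path.join(connector_dir, "ddc")
    if os.path.islink(ddc):
        # The ddc symlink points at a bus owned by the drm device,
        # e.g. ../../../i2c-3 in the listing above.
        names.append(os.path.basename(os.readlink(ddc)))
    for entry in sorted(os.listdir(connector_dir)):
        path = os.path.join(connector_dir, entry)
        # Buses such as i2c-11 are real directories below the connector.
        if (entry.startswith("i2c-") and os.path.isdir(path)
                and not os.path.islink(path)):
            names.append(entry)
    return names
```

For the directory listed above this would report both i2c-3 and i2c-11,
which is why each of them ends up being probed separately.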
Next I bisected the kernel from 6.6 to 6.12 to determine when this
regression was introduced. I attached the full bisect log to the email ;)
The offending commit seems to be:
[58a261bfc96763a851cb48b203ed57da37e157b8] drm/amd/display: use a more
lax vblank enable policy for older ASICs
Since this is quite a small commit I validated this by reverting the
changes on a newer kernel version (patch attached as well). Testing
actually shows that reverting the change resolves the screen freezing
behavior for me.
Now I am not deep enough into graphics drivers to claim that simply
reverting the commit should be considered a valid fix, only that the
change is definitely responsible for the screen freezing now, as opposed
to before.
So what should be done here? I can validate any other suggested fixes
against my setup or provide more information if need be.
Kind regards,
Felix Richter
#regzbot introduced: v6.6..v6.12
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bisect.log
Type: text/x-log
Size: 3284 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20250422/84243cd2/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: revert-regression.patch
Type: text/x-patch
Size: 1069 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20250422/84243cd2/attachment-0003.bin>