[GIT PULL] Networking for v6.16-rc6 (follow up)

Linus Torvalds torvalds at linux-foundation.org
Fri Jul 11 19:18:01 UTC 2025


On Fri, 11 Jul 2025 at 11:54, Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> Will do more testing.

Bah. What I thought was a "reliable hang" isn't actually that at all.
It ends up still being very random indeed.

That said, I do think it's related to this netlink issue, because the
symptoms end up being random delays.

I've seen it at boot before even logging in (I saw that twice in a row
after the latest networking pull, which is why I thought it was
reliable).

But the much more common situation is that some random gnome app ends
up hanging and then timing out.

Sometimes it's gnome-shell itself, so when I log in nothing happens,
and then after a 30s timeout gnome-shell times out and I get back the
login window.

That was what I *thought* was the common failure case, but it turns
out that I've now several times seen just random other applications
having that issue. This boot, for example, things "worked", except
starting gnome-terminal took a long time, and then I get a random
crash report for gsd-screensaver-proxy.

The backtrace for that was

  g_bus_get_sync ->
    initable_init ->
      g_data_input_stream_read_line ->
        g_buffered_input_stream_fill ->
          g_buffered_input_stream_real_fill ->
            g_input_stream_read ->
              g_socket_receive_with_timeout ->
                g_socket_condition_timed_wait ->
                  poll ->
                    __syscall_cancel

and I suspect these are all symptoms of the same thing.
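
A minimal sketch of what the bottom of that backtrace amounts to: a client sends a request, then sits in poll() with a timeout waiting for a reply that may never come. This is an illustration only, using a plain Python socketpair rather than GLib or D-Bus; the helper name wait_for_reply and the 200 ms timeout are assumptions made for the example.

```python
import select
import socket


def wait_for_reply(sock, timeout_ms):
    """Return True if sock becomes readable within timeout_ms, else False."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    # poll() returns an empty list when the timeout expires with no events.
    return bool(poller.poll(timeout_ms))


# Hypothetical client/server pair standing in for the two ends of the bus.
client, server = socket.socketpair()

# Case 1: the request is sent, but the peer never answers.
client.sendall(b"request\n")
timed_out = not wait_for_reply(client, 200)

# Case 2: the peer does answer, so the same wait succeeds.
server.sendall(b"reply\n")
answered = wait_for_reply(client, 200)

print("timed out without reply:", timed_out)
print("readable after reply:", answered)

client.close()
server.close()
```

From the client's side, a lost request and a slow peer look identical: the only visible symptom is the timeout.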

My *guess* is that all of these things use a netlink socket, and
presumably the *other* end of the socket has filled up its
receive queue and is dropping packets as a result, and never
answering, so then - entirely randomly - depending on how overworked
things got, and which requests got dropped, some poor gnome process
never gets a reply and times out and the thing fails.
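
The guess above can be sketched as a toy model, assuming a receiver with a fixed-size queue that silently drops whatever arrives while it is full. This is not netlink code: the class BoundedReceiveQueue, the function simulate, and the capacity of 8 are all hypothetical, chosen only to show how a burst turns into unanswered requests.

```python
from collections import deque


class BoundedReceiveQueue:
    """Toy receive queue: accepts up to `capacity` requests, drops the rest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = []

    def deliver(self, request_id):
        # A full queue drops the request; the sender is not told.
        if len(self.queue) >= self.capacity:
            self.dropped.append(request_id)
            return False
        self.queue.append(request_id)
        return True

    def drain(self):
        # The receiver answers everything that made it into the queue.
        answered = list(self.queue)
        self.queue.clear()
        return answered


def simulate(burst, capacity):
    """Return the request ids that never get a reply for a given burst size."""
    rx = BoundedReceiveQueue(capacity)
    for request_id in range(burst):
        rx.deliver(request_id)
    answered = set(rx.drain())
    return [rid for rid in range(burst) if rid not in answered]


print(simulate(10, 8))  # requests beyond the queue capacity go unanswered
print(simulate(5, 8))   # a smaller burst fits, so nothing is lost
```

In this model the victims are simply the last arrivals in the burst. On a real system the arrival order depends on scheduling and load, which would be consistent with a different process timing out on each boot.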

And sometimes the things that fail are not very critical (like some
gsd-screensaver-proxy) and I can log in happily. And sometimes they
are rather more critical and nothing works.

Anyway, because it's so damn random, it's neither bisectable nor easy
to know when something is "fixed".

I spent several hours yesterday chasing all the wrong things (because
I thought it was in drm), and often thought "Oh, that fixed it". Only
to then realize that nope, the problem still happens.

I will test the reverts. Several times.

             Linus

