[Intel-gfx] tg3 vs commit 2b3e88ea6528 ("net: phy: improve phy state checking")

Heiner Kallweit hkallweit1 at gmail.com
Wed Jan 9 19:38:10 UTC 2019


On 09.01.2019 14:11, Chris Wilson wrote:
> Hi, we've hit a snag with
> 
> commit 2b3e88ea65287ba738a798622405b15344871085
> Author: Heiner Kallweit <hkallweit1 at gmail.com>
> Date:   Sun Dec 16 18:30:14 2018 +0100
> 
>     net: phy: improve phy state checking
> 
> as it is detecting that the call from tg3 during suspend is from the
> HALTED state.
> 
> <4> [74.170194] ------------[ cut here ]------------
> <4> [74.170220] called from state HALTED
> <4> [74.170237] WARNING: CPU: 3 PID: 2473 at drivers/net/phy/phy.c:548 phy_start_aneg+0xf1/0x140
> <4> [74.170239] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic asix usbnet mii i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core broadcom bcm_phy_lib snd_pcm tg3 mei_me i2c_i801 mei prime_numbers lpc_ich
> <4> [74.170258] CPU: 3 PID: 2473 Comm: kworker/u16:22 Tainted: G     U            5.0.0-rc1-CI-CI_DRM_5366+ #1
> <4> [74.170260] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
> <4> [74.170264] Workqueue: events_unbound async_run_entry_fn
> <4> [74.170269] RIP: 0010:phy_start_aneg+0xf1/0x140
> <4> [74.170271] Code: 10 89 93 d0 04 00 00 0f b6 40 04 89 83 d4 04 00 00 e9 74 ff ff ff 48 8b 34 c5 20 a7 ed 81 48 c7 c7 a6 7c 11 82 e8 9f 94 9d ff <0f> 0b 41 bc f0 ff ff ff eb 91 80 a3 c0 04 00 00 7f eb a3 48 b8 ff
> <4> [74.170273] RSP: 0018:ffffc9000044bc80 EFLAGS: 00010282
> <4> [74.170276] RAX: 0000000000000000 RBX: ffff888216e38958 RCX: 0000000000000000
> <4> [74.170278] RDX: 0000000000000000 RSI: ffffffff8213044a RDI: 00000000ffffffff
> <4> [74.170280] RBP: ffff888216e38f00 R08: 00000000c14614ba R09: 0000000000000000
> <4> [74.170282] R10: ffffc9000044bc80 R11: 0000000000000000 R12: ffff888216e38958
> <4> [74.170284] R13: 0000000000000000 R14: ffff888223655f28 R15: ffff88821af6da58
> <4> [74.170287] FS:  0000000000000000(0000) GS:ffff888227ac0000(0000) knlGS:0000000000000000
> <4> [74.170289] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4> [74.170291] CR2: 000055f1dae80d80 CR3: 0000000002214006 CR4: 00000000000606e0
> <4> [74.170293] Call Trace:
> <4> [74.170301]  tg3_power_down_prepare+0x754/0xa30 [tg3]
> <4> [74.170308]  tg3_suspend+0x1e5/0x350 [tg3]
> <4> [74.170316]  pci_pm_suspend+0x6d/0x120
> <4> [74.170319]  ? pci_pm_freeze+0xc0/0xc0
> <4> [74.170324]  dpm_run_callback+0x64/0x280
> <4> [74.170330]  __device_suspend+0x12a/0x5b0
> <4> [74.170335]  ? dpm_watchdog_set+0x60/0x60
> <4> [74.170344]  async_suspend+0x15/0x90
> <4> [74.170347]  async_run_entry_fn+0x34/0x160
> <4> [74.170352]  process_one_work+0x245/0x610
> <4> [74.170360]  worker_thread+0x37/0x380
> <4> [74.170365]  ? process_one_work+0x610/0x610
> <4> [74.170368]  kthread+0x119/0x130
> <4> [74.170371]  ? kthread_park+0x80/0x80
> <4> [74.170377]  ret_from_fork+0x3a/0x50
> <4> [74.170388] irq event stamp: 648
> <4> [74.170392] hardirqs last  enabled at (647): [<ffffffff81123be2>] vprintk_emit+0x302/0x320
> <4> [74.170396] hardirqs last disabled at (648): [<ffffffff810019b0>] trace_hardirqs_off_thunk+0x1a/0x1c
> <4> [74.170399] softirqs last  enabled at (626): [<ffffffff81c0033a>] __do_softirq+0x33a/0x4b9
> <4> [74.170402] softirqs last disabled at (605): [<ffffffff81a00e1a>] do_softirq_own_stack+0x2a/0x40
> <4> [74.170405] WARNING: CPU: 3 PID: 2473 at drivers/net/phy/phy.c:548 phy_start_aneg+0xf1/0x140
> <4> [74.170407] ---[ end trace e382359f2ec4f600 ]---
> 
> Previously phy_start_aneg() would handle the HALTED state but now it is
> a warning. To maintain coverage in our (intel-gfx) CI, we've locally
> reverted 2b3e88ea6528.
> -Chris
> 
We added a few warnings to detect wrong usage of the phylib API.
This check seems to be a little bit too strict. I just submitted
a fix.

Still this helped to find an unneeded call to phy_start_aneg.
In tg3_phy_start() there's no need to call phy_start_aneg() after
phy_start(), this is done implicitly by phylib.

Heiner


More information about the Intel-gfx mailing list