[PATCH] drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()

Wentland, Harry Harry.Wentland at amd.com
Mon Nov 5 15:30:21 UTC 2018


On 2018-11-01 9:51 p.m., Lyude Paul wrote:
> [why]
> Removing connector reusage from DM to match the rest of the tree ended
> up revealing an issue that was surprisingly subtle. The original amdgpu
> code for DC that was submitted appears to have left a chunk in
> dm_dp_create_fake_mst_encoder() that tries to find a "master encoder",
> the likes of which isn't actually used or stored anywhere. It does so at
> the wrong time as well by trying to access parts of the drm_connector
> from the encoder init before it's actually been initialized. This
> results in a NULL pointer deref on MST hotplugs:
> 
> [  160.696613] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> [  160.697234] PGD 0 P4D 0
> [  160.697814] Oops: 0010 [#1] SMP PTI
> [  160.698430] CPU: 2 PID: 64 Comm: kworker/2:1 Kdump: loaded Tainted: G           O      4.19.0Lyude-Test+ #2
> [  160.699020] Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.22 05/17/2018
> [  160.699672] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
> [  160.700322] RIP: 0010:          (null)
> [  160.700920] Code: Bad RIP value.
> [  160.701541] RSP: 0018:ffffc9000029fc78 EFLAGS: 00010206
> [  160.702183] RAX: 0000000000000000 RBX: ffff8804440ed468 RCX: ffff8804440e9158
> [  160.702778] RDX: 0000000000000000 RSI: ffff8804556c5700 RDI: ffff8804440ed000
> [  160.703408] RBP: ffff880458e21800 R08: 0000000000000002 R09: 000000005fca0a25
> [  160.704002] R10: ffff88045a077a3d R11: ffff88045a077a3c R12: ffff8804440ed000
> [  160.704614] R13: ffff880458e21800 R14: ffff8804440e9000 R15: ffff8804440e9000
> [  160.705260] FS:  0000000000000000(0000) GS:ffff88045f280000(0000) knlGS:0000000000000000
> [  160.705854] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  160.706478] CR2: ffffffffffffffd6 CR3: 000000000200a001 CR4: 00000000003606e0
> [  160.707124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  160.707724] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  160.708372] Call Trace:
> [  160.708998]  ? dm_dp_add_mst_connector+0xed/0x1d0 [amdgpu]
> [  160.709625]  ? drm_dp_add_port+0x2fa/0x470 [drm_kms_helper]
> [  160.710284]  ? wake_up_q+0x54/0x70
> [  160.710877]  ? __mutex_unlock_slowpath.isra.18+0xb3/0x110
> [  160.711512]  ? drm_dp_dpcd_access+0xe7/0x110 [drm_kms_helper]
> [  160.712161]  ? drm_dp_send_link_address+0x155/0x1e0 [drm_kms_helper]
> [  160.712762]  ? drm_dp_check_and_send_link_address+0xa3/0xd0 [drm_kms_helper]
> [  160.713408]  ? drm_dp_mst_link_probe_work+0x4b/0x80 [drm_kms_helper]
> [  160.714013]  ? process_one_work+0x1a1/0x3a0
> [  160.714667]  ? worker_thread+0x30/0x380
> [  160.715326]  ? wq_update_unbound_numa+0x10/0x10
> [  160.715939]  ? kthread+0x112/0x130
> [  160.716591]  ? kthread_create_worker_on_cpu+0x70/0x70
> [  160.717262]  ? ret_from_fork+0x35/0x40
> [  160.717886] Modules linked in: amdgpu(O) vfat fat snd_hda_codec_generic joydev i915 chash gpu_sched ttm i2c_algo_bit drm_kms_helper snd_hda_codec_hdmi hp_wmi syscopyarea iTCO_wdt sysfillrect sparse_keymap sysimgblt fb_sys_fops snd_hda_intel usbhid wmi_bmof drm snd_hda_codec btusb snd_hda_core intel_rapl btrtl x86_pkg_temp_thermal btbcm btintel coretemp snd_pcm crc32_pclmul bluetooth psmouse snd_timer snd pcspkr i2c_i801 mei_me i2c_core soundcore mei tpm_tis wmi tpm_tis_core hp_accel ecdh_generic lis3lv02d tpm video rfkill acpi_pad input_polldev hp_wireless pcc_cpufreq crc32c_intel serio_raw tg3 xhci_pci xhci_hcd [last unloaded: amdgpu]
> [  160.720141] CR2: 0000000000000000
> 
> Somehow the connector reusage DM was using for MST connectors managed to
> paper over this issue entirely; hence why this was never caught until
> now.
> 
> [how]
> Since this code isn't used anywhere and seems useless anyway, we can
> just drop it entirely. This appears to fix the issue on my HP ZBook with
> an AMD WX4150.
> 
> Signed-off-by: Lyude Paul <lyude at redhat.com>

Good catch. Probably got broken with some of the best_encoder cleanup that happened in the last few months. I really should've caught it then.

Reviewed-by: Harry Wentland <harry.wentland at amd.com>

Harry

> ---
> Hey! This is the patch that I was talking about, feel free to review
> it-we should make sure this goes in with the rest of the patches you've
> got so far.
> 
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 6aa7359d61a3..5432a1862b41 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -285,12 +285,7 @@ dm_dp_create_fake_mst_encoder(struct amdgpu_dm_connector *connector)
>  	struct amdgpu_device *adev = dev->dev_private;
>  	struct amdgpu_encoder *amdgpu_encoder;
>  	struct drm_encoder *encoder;
> -	const struct drm_connector_helper_funcs *connector_funcs =
> -		connector->base.helper_private;
> -	struct drm_encoder *enc_master =
> -		connector_funcs->best_encoder(&connector->base);
>  
> -	DRM_DEBUG_KMS("enc master is %p\n", enc_master);
>  	amdgpu_encoder = kzalloc(sizeof(*amdgpu_encoder), GFP_KERNEL);
>  	if (!amdgpu_encoder)
>  		return NULL;
> 


More information about the amd-gfx mailing list