[PATCH] drm/amdgpu: fix null pointer deref in firmware header printing
Kevin Wang
kevwa at amd.com
Thu Sep 5 10:26:45 UTC 2019
On 9/5/19 5:41 PM, Yuan, Xiaojie wrote:
> When CE's ucode_id(8) is used to get sdma_hdr, we will be accessing an
> unallocated amdgpu_firmware_info instance.
>
> This issue appears on rhel7.7 with gcc 4.8.5. Newer compilers might have
> optimized out such 'defined but not referenced' variable.
>
> [ 1120.798564] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
> [ 1120.806703] IP: [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
> [ 1120.813693] PGD 80000002603ff067 PUD 271b8d067 PMD 0
> [ 1120.818931] Oops: 0000 [#1] SMP
> [ 1120.822245] Modules linked in: amdgpu(OE+) amdkcl(OE) amd_iommu_v2 amdttm(OE) amd_sched(OE) xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun bridge stp llc devlink ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat iptable_mangle iptable_security iptable_raw nf_conntrack libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dm_mirror dm_region_hash dm_log dm_mod intel_pmc_core intel_powerclamp coretemp intel_rapl joydev kvm_intel eeepc_wmi asus_wmi kvm sparse_keymap iTCO_wdt irqbypass rfkill crc32_pclmul snd_hda_codec_realtek mxm_wmi ghash_clmulni_intel intel_wmi_thunderbolt iTCO_vendor_support snd_hda_codec_generic snd_hda_codec_hdmi aesni_intel lrw gf128mul glue_helper ablk_helper sg cryptd pcspkr snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd pinctrl_sunrisepoint pinctrl_intel soundcore acpi_pad mei_me wmi mei i2c_i801 pcc_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit iosf_mbi drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ahci libahci drm ptp libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw pps_core drm_panel_orientation_quirks video i2c_hid
> [ 1120.954136] CPU: 4 PID: 2426 Comm: modprobe Tainted: G OE ------------ 3.10.0-1062.el7.x86_64 #1
> [ 1120.964390] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 1302 11/09/2015
> [ 1120.973321] task: ffff991ef1e3c1c0 ti: ffff991ee625c000 task.ti: ffff991ee625c000
> [ 1120.981020] RIP: 0010:[<ffffffffc0e3c9b3>] [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
> [ 1120.990483] RSP: 0018:ffff991ee625f950 EFLAGS: 00010202
> [ 1120.995935] RAX: 0000000000000002 RBX: ffff991edf6b2d38 RCX: ffff991edf6a0000
> [ 1121.003391] RDX: 0000000000000000 RSI: ffff991f01d13898 RDI: ffffffffc110afb3
> [ 1121.010706] RBP: ffff991ee625f9b0 R08: 0000000000000000 R09: 0000000000000000
> [ 1121.018029] R10: 00000000000004c4 R11: ffff991ee625f64e R12: ffff991edf6b3220
> [ 1121.025353] R13: ffff991edf6a0000 R14: 0000000000000008 R15: ffff991edf6b2d30
> [ 1121.032666] FS: 00007f97b0c0b740(0000) GS:ffff991f01d00000(0000) knlGS:0000000000000000
> [ 1121.041000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1121.046880] CR2: 000000000000000a CR3: 000000025e604000 CR4: 00000000003607e0
> [ 1121.054239] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1121.061631] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1121.068938] Call Trace:
> [ 1121.071494] [<ffffffffc0e3dba8>] psp_hw_init+0x218/0x270 [amdgpu]
> [ 1121.077886] [<ffffffffc0da3188>] amdgpu_device_fw_loading+0xe8/0x160 [amdgpu]
> [ 1121.085296] [<ffffffffc0e3b34c>] ? vega10_ih_irq_init+0x4bc/0x730 [amdgpu]
> [ 1121.092534] [<ffffffffc0da5c75>] amdgpu_device_init+0x1495/0x1c90 [amdgpu]
> [ 1121.099675] [<ffffffffc0da9cab>] amdgpu_driver_load_kms+0x8b/0x2f0 [amdgpu]
> [ 1121.106888] [<ffffffffc01b25cf>] drm_dev_register+0x12f/0x1d0 [drm]
> [ 1121.113419] [<ffffffffa4dcdfd8>] ? pci_enable_device_flags+0xe8/0x140
> [ 1121.120183] [<ffffffffc0da260a>] amdgpu_pci_probe+0xca/0x170 [amdgpu]
> [ 1121.126919] [<ffffffffa4dcf97a>] local_pci_probe+0x4a/0xb0
> [ 1121.132622] [<ffffffffa4dd10c9>] pci_device_probe+0x109/0x160
> [ 1121.138607] [<ffffffffa4eb4205>] driver_probe_device+0xc5/0x3e0
> [ 1121.144766] [<ffffffffa4eb4603>] __driver_attach+0x93/0xa0
> [ 1121.150507] [<ffffffffa4eb4570>] ? __device_attach+0x50/0x50
> [ 1121.156422] [<ffffffffa4eb1da5>] bus_for_each_dev+0x75/0xc0
> [ 1121.162213] [<ffffffffa4eb3b7e>] driver_attach+0x1e/0x20
> [ 1121.167771] [<ffffffffa4eb3620>] bus_add_driver+0x200/0x2d0
> [ 1121.173590] [<ffffffffa4eb4c94>] driver_register+0x64/0xf0
> [ 1121.179345] [<ffffffffa4dd0905>] __pci_register_driver+0xa5/0xc0
> [ 1121.185593] [<ffffffffc099f000>] ? 0xffffffffc099efff
> [ 1121.190914] [<ffffffffc099f0a4>] amdgpu_init+0xa4/0xb0 [amdgpu]
> [ 1121.197101] [<ffffffffa4a0210a>] do_one_initcall+0xba/0x240
> [ 1121.202901] [<ffffffffa4b1c90a>] load_module+0x271a/0x2bb0
> [ 1121.208598] [<ffffffffa4dad740>] ? ddebug_proc_write+0x100/0x100
> [ 1121.214894] [<ffffffffa4b1ce8f>] SyS_init_module+0xef/0x140
> [ 1121.220698] [<ffffffffa518bede>] system_call_fastpath+0x25/0x2a
> [ 1121.226870] Code: b4 01 60 a2 00 00 31 c0 e8 83 60 33 e4 41 8b 47 08 48 8b 4d d0 48 c7 c7 b3 af 10 c1 48 69 c0 68 07 00 00 48 8b 84 01 60 a2 00 00 <48> 8b 70 08 31 c0 48 89 75 c8 e8 56 60 33 e4 48 8b 4d d0 48 c7
> [ 1121.247422] RIP [<ffffffffc0e3c9b3>] psp_np_fw_load+0x1e3/0x390 [amdgpu]
> [ 1121.254432] RSP <ffff991ee625f950>
> [ 1121.258017] CR2: 000000000000000a
> [ 1121.261427] ---[ end trace e98b35387ede75bd ]---
>
> Signed-off-by: Xiaojie Yuan <xiaojie.yuan at amd.com>
> Fixes: 5206b0e79cfe ("drm/amdgpu: add firmware header printing for psp fw loading")
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 30 ++++++++++++-------------
> 1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index 81ea5b796d85..63d07cbd2a5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -944,21 +944,13 @@ static void psp_print_fw_hdr(struct psp_context *psp,
> struct amdgpu_firmware_info *ucode)
> {
> struct amdgpu_device *adev = psp->adev;
> - const struct sdma_firmware_header_v1_0 *sdma_hdr =
> - (const struct sdma_firmware_header_v1_0 *)
> - adev->sdma.instance[ucode->ucode_id - AMDGPU_UCODE_ID_SDMA0].fw->data;
> - const struct gfx_firmware_header_v1_0 *ce_hdr =
> - (const struct gfx_firmware_header_v1_0 *)adev->gfx.ce_fw->data;
> - const struct gfx_firmware_header_v1_0 *pfp_hdr =
> - (const struct gfx_firmware_header_v1_0 *)adev->gfx.pfp_fw->data;
> - const struct gfx_firmware_header_v1_0 *me_hdr =
> - (const struct gfx_firmware_header_v1_0 *)adev->gfx.me_fw->data;
> - const struct gfx_firmware_header_v1_0 *mec_hdr =
> - (const struct gfx_firmware_header_v1_0 *)adev->gfx.mec_fw->data;
> - const struct rlc_firmware_header_v2_0 *rlc_hdr =
> - (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
> - const struct smc_firmware_header_v1_0 *smc_hdr =
> - (const struct smc_firmware_header_v1_0 *)adev->pm.fw->data;
> + struct sdma_firmware_header_v1_0 *sdma_hdr;
> + struct gfx_firmware_header_v1_0 *ce_hdr;
> + struct gfx_firmware_header_v1_0 *pfp_hdr;
> + struct gfx_firmware_header_v1_0 *me_hdr;
> + struct gfx_firmware_header_v1_0 *mec_hdr;
> + struct rlc_firmware_header_v2_0 *rlc_hdr;
> + struct smc_firmware_header_v1_0 *smc_hdr;
> [kevin]:
> i think it's better use comon_firmware_header to replace above data value.
> because these data structure is kind of common_firmware_header,
> and these printer functions actual use is also this type.
>
> switch (ucode->ucode_id) {
> case AMDGPU_UCODE_ID_SDMA0:
> @@ -969,24 +961,32 @@ static void psp_print_fw_hdr(struct psp_context *psp,
> case AMDGPU_UCODE_ID_SDMA5:
> case AMDGPU_UCODE_ID_SDMA6:
> case AMDGPU_UCODE_ID_SDMA7:
> + sdma_hdr = (struct sdma_firmware_header_v1_0 *)
> + adev->sdma.instance[ucode->ucode_id - AMDGPU_UCODE_ID_SDMA0].fw->data;
> amdgpu_ucode_print_sdma_hdr(&sdma_hdr->header);
> break;
> case AMDGPU_UCODE_ID_CP_CE:
> + ce_hdr = (struct gfx_firmware_header_v1_0 *)adev->gfx.ce_fw->data;
> amdgpu_ucode_print_gfx_hdr(&ce_hdr->header);
> break;
> case AMDGPU_UCODE_ID_CP_PFP:
> + pfp_hdr = (struct gfx_firmware_header_v1_0 *)adev->gfx.pfp_fw->data;
> amdgpu_ucode_print_gfx_hdr(&pfp_hdr->header);
> break;
> case AMDGPU_UCODE_ID_CP_ME:
> + me_hdr = (struct gfx_firmware_header_v1_0 *)adev->gfx.me_fw->data;
> amdgpu_ucode_print_gfx_hdr(&me_hdr->header);
> break;
> case AMDGPU_UCODE_ID_CP_MEC1:
> + mec_hdr = (struct gfx_firmware_header_v1_0 *)adev->gfx.mec_fw->data;
> amdgpu_ucode_print_gfx_hdr(&mec_hdr->header);
> break;
> case AMDGPU_UCODE_ID_RLC_G:
> + rlc_hdr = (struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
> amdgpu_ucode_print_rlc_hdr(&rlc_hdr->header);
> break;
> case AMDGPU_UCODE_ID_SMC:
> + smc_hdr = (struct smc_firmware_header_v1_0 *)adev->pm.fw->data;
> amdgpu_ucode_print_smc_hdr(&smc_hdr->header);
> break;
> default:
More information about the amd-gfx
mailing list