[PATCH] drm/amdgpu: not remove sysfs if not create sysfs
Christian König
ckoenig.leichtzumerken at gmail.com
Fri Nov 29 09:30:27 UTC 2019
Am 29.11.19 um 10:25 schrieb Tao, Yintian:
> Hi Christian
>
> Do you mean we can remove sysfs_remove_group() for pm_sysfs and ucode_sysfs at amdgpu_device_fini()?
At least I think so, the question is where this group is added?
If that is for some directory which is removed during driver unload then
the group will be removed automatically as well.
> If so , I think the sysfs directories will not be removed automatically.
> When I remove sysfs_remove_group() at amdgpu_device_fini() and reload amdgpu, then it will report the error below.
Ok in this case forget what I said. The group is added directly to the
PCI director and that one is obviously not removed when the driver unloads.
Thanks,
Christian.
>
> [ 4192.025969] [drm] fb depth is 24
> [ 4192.025970] [drm] pitch is 7680
> [ 4192.026104] checking generic (f4000000 240000) vs hw (600000000 200000000)
> [ 4192.026182] amdgpu 0000:00:07.0: fb1: amdgpudrmfb frame buffer device
> [ 4192.043546] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:07.0/fw_version'
> [ 4192.043549] CPU: 2 PID: 5423 Comm: modprobe Tainted: G OE 5.2.0-rc1 #1
> [ 4192.043550] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [ 4192.043551] Call Trace:
> [ 4192.043781] dump_stack+0x63/0x85
> [ 4192.043862] sysfs_warn_dup+0x5b/0x70
> [ 4192.043864] internal_create_group+0x36f/0x3a0
> [ 4192.043878] sysfs_create_group+0x13/0x20
> [ 4192.043970] amdgpu_ucode_sysfs_init+0x18/0x20 [amdgpu]
> [ 4192.044030] amdgpu_device_init+0xe48/0x1b00 [amdgpu]
> [ 4192.044086] amdgpu_driver_load_kms+0x5d/0x250 [amdgpu]
> [ 4192.044099] drm_dev_register+0x12b/0x1c0 [drm]
>
>
> Best Regards
> Yintian Tao
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Christian K?nig
> Sent: 2019年11月29日 16:52
> To: Das, Nirmoy <Nirmoy.Das at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Tuikov, Luben <Luben.Tuikov at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: not remove sysfs if not create sysfs
>
> Well what we do here actually looks like complete overkill to me.
>
> IIRC when the device is removed all subsequent sysfs directories are removed automatically as well.
>
> So calling sysfs_remove_group() is superflous in the first place.
>
> Regards,
> Christian.
>
> Am 29.11.19 um 09:34 schrieb Nirmoy:
>> Luben, This should take care of the warnings that you get when a navi
>> fw file is missing from initrd.
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>> On 11/29/19 9:26 AM, Yintian Tao wrote:
>>> When load amdgpu failed before create pm_sysfs and ucode_sysfs, the
>>> pm_sysfs and ucode_sysfs should not be removed.
>>> Otherwise, there will be warning call trace just like below.
>>> [ 24.836386] [drm] VCE initialized successfully.
>>> [ 24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed [
>>> 25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init [
>>> 25.889575] [drm] amdgpu: finishing device.
>>> [ 26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper
>>> [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110) [ 26.070110]
>>> [drm:gfx_v9_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed [
>>> 26.200309] [TTM] Finalizing pool allocator [ 26.200314] [TTM]
>>> Finalizing DMA pool allocator [ 26.200349] [TTM] Zone kernel: Used
>>> memory at exit: 0 KiB [ 26.200351] [TTM] Zone dma32: Used memory
>>> at exit: 0 KiB [ 26.200353] [drm] amdgpu: ttm finalized [
>>> 26.205329] ------------[ cut here ]------------ [ 26.205330] sysfs
>>> group 'fw_version' not found for kobject '0000:00:07.0'
>>> [ 26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256
>>> sysfs_remove_group+0x80/0x90
>>> [ 26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE)
>>> drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea
>>> sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd
>>> grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio
>>> crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec
>>> ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer
>>> input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel
>>> aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables
>>> x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi
>>> floppy [ 26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G OE
>>> 5.2.0-rc1 #1 [ 26.205370] Hardware name: QEMU Standard PC (i440FX +
>>> PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 26.205372]
>>> RIP: 0010:sysfs_remove_group+0x80/0x90 [ 26.205374] Code: e8 35 b9
>>> ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8
>>> f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60
>>> 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
>>> 00 00 55
>>> [ 26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282 [
>>> 26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
>>> 0000000000000006
>>> [ 26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI:
>>> ffff97ad6f817380
>>> [ 26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09:
>>> 00000000000002b3
>>> [ 26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12:
>>> ffffffffc0e58240
>>> [ 26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15:
>>> ffff97ad4db7fff0
>>> [ 26.205380] FS: 00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000)
>>> knlGS:0000000000000000
>>> [ 26.205381] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [
>>> 26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4:
>>> 00000000003606f0
>>> [ 26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [ 26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
>>> 0000000000000400
>>> [ 26.205385] Call Trace:
>>> [ 26.205461] amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu] [
>>> 26.205518] amdgpu_device_fini+0x3b4/0x560 [amdgpu] [ 26.205573]
>>> amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu] [ 26.205623]
>>> amdgpu_driver_load_kms+0xcd/0x250 [amdgpu] [ 26.205637]
>>> drm_dev_register+0x12b/0x1c0 [drm] [ 26.205695]
>>> amdgpu_pci_probe+0x12a/0x1e0 [amdgpu] [ 26.205699]
>>> local_pci_probe+0x47/0xa0 [ 26.205701]
>>> pci_device_probe+0x106/0x1b0 [ 26.205704] really_probe+0x21a/0x3f0
>>> [ 26.205706] driver_probe_device+0x11c/0x140 [ 26.205707]
>>> device_driver_attach+0x58/0x60 [ 26.205709]
>>> __driver_attach+0xc3/0x140
>>>
>>> Signed-off-by: Yintian Tao <yttao at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +++
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 16 ++++++++++++----
>>> 2 files changed, 15 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index c3a394d841a8..958e8005a6cc 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1041,6 +1041,9 @@ struct amdgpu_device {
>>> uint64_t unique_id;
>>> uint64_t df_perfmon_config_assign_mask[AMDGPU_MAX_DF_PERFMONS];
>>> +
>>> + bool pm_sysfs_en;
>>> + bool ucode_sysfs_en;
>>> };
>>> static inline struct amdgpu_device *amdgpu_ttm_adev(struct
>>> ttm_bo_device *bdev) diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index e7a175a6a448..3da1f84db274 100755
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -3013,12 +3013,18 @@ int amdgpu_device_init(struct amdgpu_device
>>> *adev,
>>> amdgpu_pm_virt_sysfs_init(adev);
>>> r = amdgpu_pm_sysfs_init(adev);
>>> - if (r)
>>> + if (r) {
>>> + adev->pm_sysfs_en = false;
>>> DRM_ERROR("registering pm debugfs failed (%d).\n", r);
>>> + } else
>>> + adev->pm_sysfs_en = true;
>>> r = amdgpu_ucode_sysfs_init(adev);
>>> - if (r)
>>> + if (r) {
>>> + adev->ucode_sysfs_en = false;
>>> DRM_ERROR("Creating firmware sysfs failed (%d).\n", r);
>>> + } else
>>> + adev->ucode_sysfs_en = true;
>>> r = amdgpu_debugfs_gem_init(adev);
>>> if (r)
>>> @@ -3113,7 +3119,8 @@ void amdgpu_device_fini(struct amdgpu_device
>>> *adev)
>>> #endif
>>> }
>>> amdgpu_fence_driver_fini(adev);
>>> - amdgpu_pm_sysfs_fini(adev);
>>> + if (adev->pm_sysfs_en)
>>> + amdgpu_pm_sysfs_fini(adev);
>>> amdgpu_fbdev_fini(adev);
>>> r = amdgpu_device_ip_fini(adev);
>>> if (adev->firmware.gpu_info_fw) { @@ -3148,7 +3155,8 @@ void
>>> amdgpu_device_fini(struct amdgpu_device
>>> *adev)
>>> amdgpu_debugfs_regs_cleanup(adev);
>>> device_remove_file(adev->dev, &dev_attr_pcie_replay_count);
>>> - amdgpu_ucode_sysfs_fini(adev);
>>> + if (adev->ucode_sysfs_en)
>>> + amdgpu_ucode_sysfs_fini(adev);
>>> if (IS_ENABLED(CONFIG_PERF_EVENTS))
>>> amdgpu_pmu_fini(adev);
>>> amdgpu_debugfs_preempt_cleanup(adev);
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
>> s.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cyi
>> ntian.tao%40amd.com%7C39f2c6ab2bf3467816a408d774a96c06%7C3dd8961fe4884
>> e608e11a82d994e183d%7C0%7C0%7C637106143370085051&sdata=d47%2FJgUad
>> k2wcbpT8kWEVk8%2B2YoehHKvNWdUq8RIEEo%3D&reserved=0
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cyintian.tao%40amd.com%7C39f2c6ab2bf3467816a408d774a96c06%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637106143370085051&sdata=d47%2FJgUadk2wcbpT8kWEVk8%2B2YoehHKvNWdUq8RIEEo%3D&reserved=0
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list