[PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3
Leo Liu
leo.liu at amd.com
Mon Jun 25 19:13:58 UTC 2018
The problem is reproducible with enable UBSAN.
================================================================================
[ 3.866643] UBSAN: Undefined behaviour in
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379:29
[ 3.866656] index 2 is out of range for type 'amdgpu_uvd_inst [2]'
[ 3.866667] CPU: 0 PID: 59 Comm: kworker/0:1 Not tainted 4.16.0-rc7+ #3
[ 3.866677] Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD7/GA-990FXA-UD7, BIOS F9 06/08/2012
[ 3.866693] Workqueue: events work_for_cpu_fn
[ 3.866702] Call Trace:
[ 3.866710] dump_stack+0x85/0xc5
[ 3.866719] ubsan_epilogue+0x9/0x40
[ 3.866727] __ubsan_handle_out_of_bounds+0x89/0x90
[ 3.866737] ? rcu_read_lock_sched_held+0x58/0x60
[ 3.866746] ? __kmalloc+0x26c/0x2d0
[ 3.866846] amdgpu_fence_driver_start_ring+0x259/0x280 [amdgpu]
[ 3.866896] amdgpu_ring_init+0x12c/0x710 [amdgpu]
[ 3.866906] ? sprintf+0x42/0x50
[ 3.866956] amdgpu_gfx_kiq_init_ring+0x1bc/0x3a0 [amdgpu]
[ 3.867009] gfx_v8_0_sw_init+0x1ad3/0x2360 [amdgpu]
[ 3.867062] ? smu7_init+0xec/0x160 [amdgpu]
[ 3.867109] amdgpu_device_init+0x112c/0x1dc0 [amdgpu]
[ 3.867120] ? rcu_read_lock_sched_held+0x58/0x60
[ 3.867166] amdgpu_driver_load_kms+0x74/0x2e0 [amdgpu]
[ 3.867178] drm_dev_register+0x134/0x1c0
[ 3.867223] amdgpu_pci_probe+0x163/0x270 [amdgpu]
[ 3.867233] local_pci_probe+0x42/0xa0
[ 3.867242] work_for_cpu_fn+0x16/0x20
[ 3.867250] process_one_work+0x269/0x640
[ 3.867260] worker_thread+0x216/0x3d0
[ 3.867268] ? process_one_work+0x640/0x640
[ 3.867276] kthread+0x113/0x130
[ 3.867282] ? kthread_create_worker_on_cpu+0x50/0x50
[ 3.867293] ret_from_fork+0x27/0x50
[ 3.867304]
================================================================================
[ 3.869808] [drm] Found UVD firmware Version: 1.130 Family ID: 16
[ 3.871505] [drm] Found VCE firmware Version: 53.26 Binary ID: 3
The fix will follow.
Regards,
Leo
On 06/25/2018 03:02 PM, Alex Deucher wrote:
> On Mon, Jun 25, 2018 at 2:59 PM, James Zhu <jamesz at amd.com> wrote:
>>
>> On 2018-06-25 02:53 PM, Alex Deucher wrote:
>>
>> On Mon, Jun 25, 2018 at 2:37 PM, James Zhu <jamesz at amd.com> wrote:
>>
>> For one UVD instance case,:
>>
>>
>> In function amdgpu_driver_load_kms, all ring->me should be set to zero.
>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>
>>
>> For two UVD instances cases:
>>
>> static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev)
>> ..
>> for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>> adev->uvd.inst[i].ring.me = i;
>>
>> static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev)
>>
>> for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
>> adev->uvd.inst[j].ring_enc[i].me = j;
>>
>> uvd_v4_2_early_init in uvd_v4_2.c adev->uvd.num_uvd_inst = 1;
>> uvd_v5_0_early_init in uvd_v5_0.c adev->uvd.num_uvd_inst = 1;
>> uvd_v6_0_early_init in uvd_v6_0.c adev->uvd.num_uvd_inst = 1;
>> uvd_v7_0_early_init in uvd_v7_0.c
>> if (adev->asic_type == CHIP_VEGA20)
>> adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;/*2*/
>> else
>> adev->uvd.num_uvd_inst = 1;
>>
>>
>> I didn't know when ring->me is set to 2. Maybe there is some leakage
>> somewhere.
>>
>> What about older uvd (4.2, 5.0, 6.0) blocks?
>>
>> I think the below code will reset
>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring->me and
>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]->me
>> to zero.
>> for older uvd IP UVD block.
>>
>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>
>> Do I understand correctly?
> Yes, it should. That's why it doesn't make sense that it would be
> getting another value.
>
> Alex
>
>> James
>>
>> Alex
>>
>> Best regards!
>>
>> James zhu
>>
>>
>> On 2018-06-25 01:29 PM, Deucher, Alexander wrote:
>>
>> Odd. The structure should be 0 initialized. Does this patch help?
>>
>>
>> Alex
>>
>> ________________________________
>> From: Timothy Pearson <tpearson at raptorengineering.com>
>> Sent: Monday, June 25, 2018 11:53:12 AM
>> To: Zhu, James
>> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander; Zhou,
>> David(ChunMing); Koenig, Christian
>> Subject: Re: [PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3
>>
>> n 06/25/2018 09:46 AM, James Zhu wrote:
>>
>> On 2018-06-23 08:02 PM, Timothy Pearson wrote:
>>
>> amdgpu_fence_driver_start_ring() attempts to access
>> UVD instance 2 during setup, while the existing UVD
>> instance count only allows instances 0 and 1.
>>
>> Increase AMDGPU_MAX_UVD_INSTANCES by one to avoid the
>> invalid array access.
>>
>> Caught by UBSAN.
>>
>> Hi Timothy,
>>
>> From design of view, it is not right to just change
>> AMDGPU_MAX_UVD_INSTANCES to 3.
>>
>> Could you tell me some detail of UBSAN test and attach the dmesg also?
>>
>> Definitely, was looking for some feedback from anyone knowing more about
>> the internals of the UVD system.
>>
>> What's happening is that "ring->me" in amdgpu_fence_driver_start_ring()
>> (drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379) is set to a value of
>> "2". The overall dmesg is otherwise uninteresting, but I can try to
>> grab the UBSAN output if needed.
>>
>> --
>> Timothy Pearson
>> Raptor Engineering
>> +1 (415) 727-8645 (direct line)
>> +1 (512) 690-0200 (switchboard)
>> https://www.raptorengineering.com
>>
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
More information about the amd-gfx
mailing list