[PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3

James Zhu jamesz at amd.com
Mon Jun 25 19:15:12 UTC 2018



On 2018-06-25 03:02 PM, Alex Deucher wrote:
> On Mon, Jun 25, 2018 at 2:59 PM, James Zhu <jamesz at amd.com> wrote:
>>
>> On 2018-06-25 02:53 PM, Alex Deucher wrote:
>>
>> On Mon, Jun 25, 2018 at 2:37 PM, James Zhu <jamesz at amd.com> wrote:
>>
>> For one UVD instance case,:
>>
>>
>> In function amdgpu_driver_load_kms, all ring->me should be set to zero.
>>      adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>
>>
>> For two UVD instances cases:
>>
>> static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev)
>> ..
>>      for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>          adev->uvd.inst[i].ring.me = i;
>>
>> static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev)
>>
>>      for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
>>              adev->uvd.inst[j].ring_enc[i].me = j;
>>
>> uvd_v4_2_early_init in uvd_v4_2.c  adev->uvd.num_uvd_inst = 1;
>> uvd_v5_0_early_init in uvd_v5_0.c  adev->uvd.num_uvd_inst = 1;
>> uvd_v6_0_early_init in uvd_v6_0.c  adev->uvd.num_uvd_inst = 1;
>> uvd_v7_0_early_init in uvd_v7_0.c
>>      if (adev->asic_type == CHIP_VEGA20)
>>          adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;/*2*/
>>      else
>>          adev->uvd.num_uvd_inst = 1;
>>
>>
>> I didn't know when ring->me is set to 2. Maybe there is some leakage
>> somewhere.
>>
>> What about older uvd (4.2, 5.0, 6.0) blocks?
>>
>> I think the below code will reset
>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring->me and
>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]->me
>> to zero.
>> for older uvd IP UVD block.
>>
>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>
>> Do I understand correctly?
> Yes, it should.  That's why it doesn't make sense that it would be
> getting another value.
>
> Alex
     ring point is refer in struct amdgpu_device. Miss using it may 
cause simple leakage.
    struct amdgpu_device {
    ..............
      struct amdgpu_ring        **/rings/*[AMDGPU_MAX_RINGS];

The simple possible leakage could happen at
./gfx_v8_0.c:1995:    ring->me = mec + 1;
./amdgpu_gfx.c:190:        ring->me = mec + 1;
./gfx_v9_0.c:1406:    ring->me = mec + 1;
./gfx_v7_0.c:4471:    ring->me = mec + 1;

Timothy,

It is not easy to find root cause based on current information.
What asic are you using on this test. IS this UBSAN test open source?
Is it easy for you guide me to reproduce it on my bench?

James

>> James
>>
>> Alex
>>
>> Best regards!
>>
>> James zhu
>>
>>
>> On 2018-06-25 01:29 PM, Deucher, Alexander wrote:
>>
>> Odd. The structure should be 0 initialized.  Does this patch help?
>>
>>
>> Alex
>>
>> ________________________________
>> From: Timothy Pearson <tpearson at raptorengineering.com>
>> Sent: Monday, June 25, 2018 11:53:12 AM
>> To: Zhu, James
>> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander; Zhou,
>> David(ChunMing); Koenig, Christian
>> Subject: Re: [PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3
>>
>> n 06/25/2018 09:46 AM, James Zhu wrote:
>>
>> On 2018-06-23 08:02 PM, Timothy Pearson wrote:
>>
>> amdgpu_fence_driver_start_ring() attempts to access
>> UVD instance 2 during setup, while the existing UVD
>> instance count only allows instances 0 and 1.
>>
>> Increase AMDGPU_MAX_UVD_INSTANCES by one to avoid the
>> invalid array access.
>>
>> Caught by UBSAN.
>>
>> Hi Timothy,
>>
>>  From design of view, it is not right to just change
>> AMDGPU_MAX_UVD_INSTANCES to 3.
>>
>> Could you tell me some detail of UBSAN test and attach the dmesg also?
>>
>> Definitely, was looking for some feedback from anyone knowing more about
>> the internals of the UVD system.
>>
>> What's happening is that "ring->me" in amdgpu_fence_driver_start_ring()
>> (drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379) is set to a value of
>> "2".  The overall dmesg is otherwise uninteresting, but I can try to
>> grab the UBSAN output if needed.
>>
>> --
>> Timothy Pearson
>> Raptor Engineering
>> +1 (415) 727-8645 (direct line)
>> +1 (512) 690-0200 (switchboard)
>> https://www.raptorengineering.com
>>
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180625/3942b3d5/attachment-0001.html>


More information about the amd-gfx mailing list