[PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3

Leo Liu leo.liu at amd.com
Mon Jun 25 19:35:59 UTC 2018



On 06/25/2018 03:15 PM, James Zhu wrote:
>
>
>
> On 2018-06-25 03:02 PM, Alex Deucher wrote:
>> On Mon, Jun 25, 2018 at 2:59 PM, James Zhu<jamesz at amd.com>  wrote:
>>> On 2018-06-25 02:53 PM, Alex Deucher wrote:
>>>
>>> On Mon, Jun 25, 2018 at 2:37 PM, James Zhu<jamesz at amd.com>  wrote:
>>>
>>> For one UVD instance case,:
>>>
>>>
>>> In function amdgpu_driver_load_kms, all ring->me should be set to zero.
>>>      adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>>
>>>
>>> For two UVD instances cases:
>>>
>>> static void uvd_v7_0_set_ring_funcs(struct amdgpu_device *adev)
>>> ..
>>>      for (i = 0; i < adev->uvd.num_uvd_inst; i++) {
>>>          adev->uvd.inst[i].ring.me = i;
>>>
>>> static void uvd_v7_0_set_enc_ring_funcs(struct amdgpu_device *adev)
>>>
>>>      for (j = 0; j < adev->uvd.num_uvd_inst; j++) {
>>>              adev->uvd.inst[j].ring_enc[i].me = j;
>>>
>>> uvd_v4_2_early_init in uvd_v4_2.c  adev->uvd.num_uvd_inst = 1;
>>> uvd_v5_0_early_init in uvd_v5_0.c  adev->uvd.num_uvd_inst = 1;
>>> uvd_v6_0_early_init in uvd_v6_0.c  adev->uvd.num_uvd_inst = 1;
>>> uvd_v7_0_early_init in uvd_v7_0.c
>>>      if (adev->asic_type == CHIP_VEGA20)
>>>          adev->uvd.num_uvd_inst = UVD7_MAX_HW_INSTANCES_VEGA20;/*2*/
>>>      else
>>>          adev->uvd.num_uvd_inst = 1;
>>>
>>>
>>> I didn't know when ring->me is set to 2. Maybe there is some leakage
>>> somewhere.
>>>
>>> What about older uvd (4.2, 5.0, 6.0) blocks?
>>>
>>> I think the below code will reset
>>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring->me and
>>> adev->uvd.inst[AMDGPU_MAX_UVD_INSTANCES].ring_enc[AMDGPU_MAX_UVD_ENC_RINGS]->me
>>> to zero.
>>> for older uvd IP UVD block.
>>>
>>> adev = kzalloc(sizeof(struct amdgpu_device), GFP_KERNEL);
>>>
>>> Do I understand correctly?
>> Yes, it should.  That's why it doesn't make sense that it would be
>> getting another value.
>>
>> Alex
>     ring point is refer in struct amdgpu_device. Miss using it may 
> cause simple leakage.
>    struct amdgpu_device {
>    ..............
>      struct amdgpu_ring        **/rings/*[AMDGPU_MAX_RINGS];
>
> The simple possible leakage could happen at
> ./gfx_v8_0.c:1995:    ring->me = mec + 1;
> ./amdgpu_gfx.c:190:        ring->me = mec + 1;
> ./gfx_v9_0.c:1406:    ring->me = mec + 1;
> ./gfx_v7_0.c:4471:    ring->me = mec + 1;
>
> Timothy,
>
> It is not easy to find root cause based on current information.
> What asic are you using on this test. IS this UBSAN test open source?
> Is it easy for you guide me to reproduce it on my bench?
https://people.freedesktop.org/~narmstrong/meson_drm_doc/dev-tools/ubsan.html

Leo


>
> James
>>> James
>>>
>>> Alex
>>>
>>> Best regards!
>>>
>>> James zhu
>>>
>>>
>>> On 2018-06-25 01:29 PM, Deucher, Alexander wrote:
>>>
>>> Odd. The structure should be 0 initialized.  Does this patch help?
>>>
>>>
>>> Alex
>>>
>>> ________________________________
>>> From: Timothy Pearson<tpearson at raptorengineering.com>
>>> Sent: Monday, June 25, 2018 11:53:12 AM
>>> To: Zhu, James
>>> Cc:amd-gfx at lists.freedesktop.org; Deucher, Alexander; Zhou,
>>> David(ChunMing); Koenig, Christian
>>> Subject: Re: [PATCH] Increase AMDGPU_MAX_UVD_INSTANCES to 3
>>>
>>> n 06/25/2018 09:46 AM, James Zhu wrote:
>>>
>>> On 2018-06-23 08:02 PM, Timothy Pearson wrote:
>>>
>>> amdgpu_fence_driver_start_ring() attempts to access
>>> UVD instance 2 during setup, while the existing UVD
>>> instance count only allows instances 0 and 1.
>>>
>>> Increase AMDGPU_MAX_UVD_INSTANCES by one to avoid the
>>> invalid array access.
>>>
>>> Caught by UBSAN.
>>>
>>> Hi Timothy,
>>>
>>>  From design of view, it is not right to just change
>>> AMDGPU_MAX_UVD_INSTANCES to 3.
>>>
>>> Could you tell me some detail of UBSAN test and attach the dmesg also?
>>>
>>> Definitely, was looking for some feedback from anyone knowing more about
>>> the internals of the UVD system.
>>>
>>> What's happening is that "ring->me" in amdgpu_fence_driver_start_ring()
>>> (drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:379) is set to a value of
>>> "2".  The overall dmesg is otherwise uninteresting, but I can try to
>>> grab the UBSAN output if needed.
>>>
>>> --
>>> Timothy Pearson
>>> Raptor Engineering
>>> +1 (415) 727-8645 (direct line)
>>> +1 (512) 690-0200 (switchboard)
>>> https://www.raptorengineering.com
>>>
>>>
>>>
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>
>>>
>
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20180625/44e97042/attachment-0001.html>


More information about the amd-gfx mailing list