[PATCH] drm/amdgpu: Add GDS clearing workaround in later init for gfx9

James Zhu jamesz at amd.com
Mon Jun 10 14:21:08 UTC 2019


On 2019-06-08 8:50 a.m., Christian König wrote:
> Yeah, that would be a good idea as well.
>
> Additional to that I suggest to update GDS_VMID0_BASE/_SIZE with 
> direct register writes instead of PM4 packets.

I can do so.

James

>
> Christian.
>
> Am 07.06.19 um 22:16 schrieb Shamis, Leonid:
>> James,
>>
>> Do you set GDS_VMID0_BASE to 0?.. I don't see it in your patch.
>>
>> Regards,
>> Leonid
>>
>> On 2019-06-07, 15:42, "Zhu, James" <James.Zhu at amd.com> wrote:
>>
>>           On 2019-06-07 3:12 p.m., Zhu, James wrote:
>>      > On 2019-06-07 2:16 p.m., Alex Deucher wrote:
>>      >> On Fri, Jun 7, 2019 at 12:38 PM Zhu, James 
>> <James.Zhu at amd.com> wrote:
>>      >>> Since Hardware bug, GDS exist ECC error after cold boot up,
>>      >>> adding GDS clearing workaround in later init for gfx9.
>>      >>>
>>      >>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>>      >>> ---
>>      >>>    drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 48 
>> +++++++++++++++++++++++++++++++++++
>>      >>>    1 file changed, 48 insertions(+)
>>      >>>
>>      >>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>      >>> index 76722fc..81f6ba8 100644
>>      >>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>      >>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>      >>> @@ -3634,6 +3634,50 @@ static const struct soc15_reg_entry 
>> sec_ded_counter_registers[] = {
>>      >>>       { SOC15_REG_ENTRY(GC, 0, mmSQC_EDC_CNT3), 0, 4, 6},
>>      >>>    };
>>      >>>
>>      >>> +
>>      >>> +static int gfx_v9_0_do_edc_gds_workarounds(struct 
>> amdgpu_device *adev)
>>      >>> +{
>>      >>> +       struct amdgpu_ring *ring = &adev->gfx.compute_ring[0];
>>      >>> +       int r;
>>      >>> +
>>      >>> +       r = amdgpu_ring_alloc(ring, 17);
>>      >>> +       if (r) {
>>      >>> +               DRM_ERROR("amdgpu: GDS workarounds failed to 
>> lock ring %s (%d).\n",
>>      >>> +                       ring->name, r);
>>      >>> +               return r;
>>      >>> +       }
>>      >>> +
>>      >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 
>> 3));
>>      >>> +       amdgpu_ring_write(ring, WRITE_DATA_ENGINE_SEL(0) |
>>      >>> + WRITE_DATA_DST_SEL(0));
>>      >>> +       amdgpu_ring_write(ring, SOC15_REG_OFFSET(GC, 0, 
>> mmGDS_VMID0_SIZE));
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 0x10000);
>>      >> hardcoded size, please use the size from the driver.
>>      >>
>>      >>> +
>>      >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_DMA_DATA, 5));
>>      >>> +       amdgpu_ring_write(ring, (PACKET3_DMA_DATA_CP_SYNC |
>>      >>> + PACKET3_DMA_DATA_DST_SEL(1) |
>>      >>> + PACKET3_DMA_DATA_SRC_SEL(2) |
>>      >>> + PACKET3_DMA_DATA_ENGINE(0)));
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 
>> PACKET3_DMA_DATA_CMD_RAW_WAIT | 0x10000);
>>      >> Instead of hardcoding the size, can you use the gds size from 
>> the
>>      >> driver (adev->gds.gds_size).
>>      > Hi Alex,
>>      >
>>      > Do you mean adev->gds.mem.total_size?
>>      >
>>      > But I see below operation in gfx_v9_0_ngg_init.
>>      >
>>      > adev->gds.mem.total_size -= ....
>>      >
>>      > Or you want me to add gds_size in struct amdgpu_gds?
>>      >
>>      > James
>>           Yeah, The amd-staging-drm-next branch has adev->gds.gds_size.
>>           James
>>           >
>>      >> With that fixed:
>>      >> Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
>>      >>
>>      >>> +
>>      >>> +       amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 
>> 3));
>>      >>> +       amdgpu_ring_write(ring, WRITE_DATA_ENGINE_SEL(0) |
>>      >>> + WRITE_DATA_DST_SEL(0));
>>      >>> +       amdgpu_ring_write(ring, SOC15_REG_OFFSET(GC, 0, 
>> mmGDS_VMID0_SIZE));
>>      >>> +       amdgpu_ring_write(ring, 0);
>>      >>> +       amdgpu_ring_write(ring, 0x0);
>>      >>> +
>>      >>> +       amdgpu_ring_commit(ring);
>>      >>> +
>>      >>> +       return 0;
>>      >>> +}
>>      >>> +
>>      >>> +
>>      >>>    static int gfx_v9_0_do_edc_gpr_workarounds(struct 
>> amdgpu_device *adev)
>>      >>>    {
>>      >>>           struct amdgpu_ring *ring = 
>> &adev->gfx.compute_ring[0];
>>      >>> @@ -3810,6 +3854,10 @@ static int 
>> gfx_v9_0_ecc_late_init(void *handle)
>>      >>>                   return 0;
>>      >>>           }
>>      >>>
>>      >>> +       r = gfx_v9_0_do_edc_gds_workarounds(adev);
>>      >>> +       if (r)
>>      >>> +               return r;
>>      >>> +
>>      >>>           /* requires IBs so do in late init after IB pool 
>> is initialized */
>>      >>>           r = gfx_v9_0_do_edc_gpr_workarounds(adev);
>>      >>>           if (r)
>>      >>> --
>>      >>> 2.7.4
>>      >>>
>>      >>> _______________________________________________
>>      >>> amd-gfx mailing list
>>      >>> amd-gfx at lists.freedesktop.org
>>      >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>


More information about the amd-gfx mailing list