[PATCH v2] drm/amdgpu: add workarounds for Raven VCN TMZ issue

Tue Mar 8 16:30:16 UTC 2022

On 2022-03-08 11:18, Leo Liu wrote:
>
> On 2022-03-08 04:16, Christian König wrote:
>> Am 08.03.22 um 09:06 schrieb Lang Yu:
>>> On 03/08/ , Christian König wrote:
>>>> Am 08.03.22 um 08:33 schrieb Lang Yu:
>>>>> On 03/08/ , Christian König wrote:
>>>>>> Am 08.03.22 um 04:39 schrieb Lang Yu:
>>>>>>> It is a hardware issue that VCN can't handle a GTT
>>>>>>> backing stored TMZ buffer on Raven.
>>>>>>>
>>>>>>> Move such a TMZ buffer to VRAM domain before command
>>>>>>> submission.
>>>>>>>
>>>>>>> v2:
>>>>>>>     - Use patch_cs_in_place callback.
>>>>>>>
>>>>>>> Suggested-by: Christian König <christian.koenig at amd.com>
>>>>>>> Signed-off-by: Lang Yu <Lang.Yu at amd.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 68 
>>>>>>> +++++++++++++++++++++++++++
>>>>>>>     1 file changed, 68 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c 
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>> index 7bbb9ba6b80b..810932abd3af 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>> @@ -24,6 +24,7 @@
>>>>>>>     #include <linux/firmware.h>
>>>>>>>     #include "amdgpu.h"
>>>>>>> +#include "amdgpu_cs.h"
>>>>>>>     #include "amdgpu_vcn.h"
>>>>>>>     #include "amdgpu_pm.h"
>>>>>>>     #include "soc15.h"
>>>>>>> @@ -1905,6 +1906,72 @@ static const struct amd_ip_funcs 
>>>>>>> vcn_v1_0_ip_funcs = {
>>>>>>>         .set_powergating_state = vcn_v1_0_set_powergating_state,
>>>>>>>     };
>>>>>>> +/**
>>>>>>> + * It is a hardware issue that Raven VCN can't handle a GTT TMZ 
>>>>>>> buffer.
>>>>>>> + * Move such a GTT TMZ buffer to VRAM domain before command 
>>>>>>> submission.
>>>>>>> + */
>>>>>>> +static int vcn_v1_0_validate_bo(struct amdgpu_cs_parser *parser,
>>>>>>> +                struct amdgpu_job *job,
>>>>>>> +                uint64_t addr)
>>>>>>> +{
>>>>>>> +    struct ttm_operation_ctx ctx = { false, false };
>>>>>>> +    struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
>>>>>>> +    struct amdgpu_vm *vm = &fpriv->vm;
>>>>>>> +    struct amdgpu_bo_va_mapping *mapping;
>>>>>>> +    struct amdgpu_bo *bo;
>>>>>>> +    int r;
>>>>>>> +
>>>>>>> +    addr &= AMDGPU_GMC_HOLE_MASK;
>>>>>>> +    if (addr & 0x7) {
>>>>>>> +        DRM_ERROR("VCN messages must be 8 byte aligned!\n");
>>>>>>> +        return -EINVAL;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    mapping = amdgpu_vm_bo_lookup_mapping(vm, 
>>>>>>> addr/AMDGPU_GPU_PAGE_SIZE);
>>>>>>> +    if (!mapping || !mapping->bo_va || !mapping->bo_va->base.bo)
>>>>>>> +        return -EINVAL;
>>>>>>> +
>>>>>>> +    bo = mapping->bo_va->base.bo;
>>>>>>> +    if (!(bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED))
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    amdgpu_bo_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_VRAM);
>>>>>>> +    r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>>>>>>> +    if (r) {
>>>>>>> +        DRM_ERROR("Failed validating the VCN message BO 
>>>>>>> (%d)!\n", r);
>>>>>>> +        return r;
>>>>>>> +    }
>>>>>> Well, exactly that won't work.
>>>>>>
>>>>>> The message structure isn't TMZ protected because otherwise the 
>>>>>> driver won't
>>>>>> be able to stitch it together.
>>>>>>
>>>>>> What is TMZ protected are the surfaces the message structure is 
>>>>>> pointing to.
>>>>>> So what you would need to do is to completely parse the structure 
>>>>>> and then
>>>>>> move on the relevant buffers into VRAM.
>>>>>>
>>>>>> Leo or James, can you help with that?
>>>>>   From my observations, when decoding secure contents, register
>>>>> GPCOM_VCPU_DATA0 and GPCOM_VCPU_DATA1 are set to a TMZ buffer 
>>>>> address.
>>>>> And this way works when allocating TMZ buffers in GTT domain.
>>>> As far as I remember that's only the case for the decoding, 
>>>> encoding works
>>>> by putting the addresses into the message buffer.
>>>>
>>>> But could be that decoding is sufficient, Leo and James need to 
>>>> comment on
>>>> this.
>>> It seems that only decode needs TMZ buffers. Only observe 
>>> si_vid_create_tmz_buffer()
>>> was called in rvcn_dec_message_decode() in 
>>> src/gallium/drivers/radeon/radeon_vcn_dec.c.
>>
>> Mhm, good point. Let's wait for Leo and James to wake up, when we 
>> don't need encode support than that would makes things much easier.
>
> For secure playback, the buffer required in TMZ are dpb, dt and ctx, 
> for the rest esp. those for CPU access don't need that E.g. msg 
> buffer, and bitstream buffer.
>
> From radeon_vcn_dec.c, you can see the buffer for dpb and ctx, and dt 
> buffer frontend/va/surface is set to PIPE_BIND_PROTECTED.
>
>
> Regards,
>
> Leo
>
For VCN1, due to performance reason, the msg and fb buffer was allocated 
into VRAM instead of GTT(for other HW), but those are not TMZ in order 
to have CPU access.


Regards,

Leo


>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> Regards,
>>> Lang
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>> Lang
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> +
>>>>>>> +    return r;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vcn_v1_0_ring_patch_cs_in_place(struct 
>>>>>>> amdgpu_cs_parser *p,
>>>>>>> +                       struct amdgpu_job *job,
>>>>>>> +                       struct amdgpu_ib *ib)
>>>>>>> +{
>>>>>>> +    uint32_t msg_lo = 0, msg_hi = 0;
>>>>>>> +    int i, r;
>>>>>>> +
>>>>>>> +    for (i = 0; i < ib->length_dw; i += 2) {
>>>>>>> +        uint32_t reg = amdgpu_ib_get_value(ib, i);
>>>>>>> +        uint32_t val = amdgpu_ib_get_value(ib, i + 1);
>>>>>>> +
>>>>>>> +        if (reg == PACKET0(p->adev->vcn.internal.data0, 0)) {
>>>>>>> +            msg_lo = val;
>>>>>>> +        } else if (reg == PACKET0(p->adev->vcn.internal.data1, 
>>>>>>> 0)) {
>>>>>>> +            msg_hi = val;
>>>>>>> +        } else if (reg == PACKET0(p->adev->vcn.internal.cmd, 0)) {
>>>>>>> +            r = vcn_v1_0_validate_bo(p, job,
>>>>>>> +                         ((u64)msg_hi) << 32 | msg_lo);
>>>>>>> +            if (r)
>>>>>>> +                return r;
>>>>>>> +        }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>> +
>>>>>>>     static const struct amdgpu_ring_funcs 
>>>>>>> vcn_v1_0_dec_ring_vm_funcs = {
>>>>>>>         .type = AMDGPU_RING_TYPE_VCN_DEC,
>>>>>>>         .align_mask = 0xf,
>>>>>>> @@ -1914,6 +1981,7 @@ static const struct amdgpu_ring_funcs 
>>>>>>> vcn_v1_0_dec_ring_vm_funcs = {
>>>>>>>         .get_rptr = vcn_v1_0_dec_ring_get_rptr,
>>>>>>>         .get_wptr = vcn_v1_0_dec_ring_get_wptr,
>>>>>>>         .set_wptr = vcn_v1_0_dec_ring_set_wptr,
>>>>>>> +    .patch_cs_in_place = vcn_v1_0_ring_patch_cs_in_place,
>>>>>>>         .emit_frame_size =
>>>>>>>             6 + 6 + /* hdp invalidate / flush */
>>>>>>>             SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 +
>>