[PATCH v2] drm/amdgpu: add workarounds for Raven VCN TMZ issue

Thu Mar 10 14:25:59 UTC 2022

Ok, thanks.

Lang is that case your patch should work fine.

Just add another patch with a check for the encode case to reject any CS 
with TMZ buffers in it.

Thanks,
Christian.

Am 10.03.22 um 15:17 schrieb Leo Liu:
> No need for encode. Encrypting uses TEE/TA to convert clear bitstream 
> to encrypted bitstream, and has nothing to do with VCN encode and tmz.
>
> Regards,
>
> Leo
>
>
> On 2022-03-10 04:53, Christian König wrote:
>> Leo you didn't answered the question if we need TMZ for encode as well.
>>
>> Regards,
>> Christian.
>>
>> Am 10.03.22 um 09:45 schrieb Lang Yu:
>>> Ping.
>>>
>>> On 03/08/ , Leo Liu wrote:
>>>> On 2022-03-08 11:18, Leo Liu wrote:
>>>>> On 2022-03-08 04:16, Christian König wrote:
>>>>>> Am 08.03.22 um 09:06 schrieb Lang Yu:
>>>>>>> On 03/08/ , Christian König wrote:
>>>>>>>> Am 08.03.22 um 08:33 schrieb Lang Yu:
>>>>>>>>> On 03/08/ , Christian König wrote:
>>>>>>>>>> Am 08.03.22 um 04:39 schrieb Lang Yu:
>>>>>>>>>>> It is a hardware issue that VCN can't handle a GTT
>>>>>>>>>>> backing stored TMZ buffer on Raven.
>>>>>>>>>>>
>>>>>>>>>>> Move such a TMZ buffer to VRAM domain before command
>>>>>>>>>>> submission.
>>>>>>>>>>>
>>>>>>>>>>> v2:
>>>>>>>>>>>      - Use patch_cs_in_place callback.
>>>>>>>>>>>
>>>>>>>>>>> Suggested-by: Christian König <christian.koenig at amd.com>
>>>>>>>>>>> Signed-off-by: Lang Yu <Lang.Yu at amd.com>
>>>>>>>>>>> ---
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 68
>>>>>>>>>>> +++++++++++++++++++++++++++
>>>>>>>>>>>      1 file changed, 68 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git
>>>>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>>>>>> index 7bbb9ba6b80b..810932abd3af 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
>>>>>>>>>>> @@ -24,6 +24,7 @@
>>>>>>>>>>>      #include <linux/firmware.h>
>>>>>>>>>>>      #include "amdgpu.h"
>>>>>>>>>>> +#include "amdgpu_cs.h"
>>>>>>>>>>>      #include "amdgpu_vcn.h"
>>>>>>>>>>>      #include "amdgpu_pm.h"
>>>>>>>>>>>      #include "soc15.h"
>>>>>>>>>>> @@ -1905,6 +1906,72 @@ static const struct
>>>>>>>>>>> amd_ip_funcs vcn_v1_0_ip_funcs = {
>>>>>>>>>>>          .set_powergating_state = 
>>>>>>>>>>> vcn_v1_0_set_powergating_state,
>>>>>>>>>>>      };
>>>>>>>>>>> +/**
>>>>>>>>>>> + * It is a hardware issue that Raven VCN can't
>>>>>>>>>>> handle a GTT TMZ buffer.
>>>>>>>>>>> + * Move such a GTT TMZ buffer to VRAM domain
>>>>>>>>>>> before command submission.
>>>>>>>>>>> + */
>>>>>>>>>>> +static int vcn_v1_0_validate_bo(struct amdgpu_cs_parser 
>>>>>>>>>>> *parser,
>>>>>>>>>>> +                struct amdgpu_job *job,
>>>>>>>>>>> +                uint64_t addr)
>>>>>>>>>>> +{
>>>>>>>>>>> +    struct ttm_operation_ctx ctx = { false, false };
>>>>>>>>>>> +    struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
>>>>>>>>>>> +    struct amdgpu_vm *vm = &fpriv->vm;
>>>>>>>>>>> +    struct amdgpu_bo_va_mapping *mapping;
>>>>>>>>>>> +    struct amdgpu_bo *bo;
>>>>>>>>>>> +    int r;
>>>>>>>>>>> +
>>>>>>>>>>> +    addr &= AMDGPU_GMC_HOLE_MASK;
>>>>>>>>>>> +    if (addr & 0x7) {
>>>>>>>>>>> +        DRM_ERROR("VCN messages must be 8 byte aligned!\n");
>>>>>>>>>>> +        return -EINVAL;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    mapping = amdgpu_vm_bo_lookup_mapping(vm,
>>>>>>>>>>> addr/AMDGPU_GPU_PAGE_SIZE);
>>>>>>>>>>> +    if (!mapping || !mapping->bo_va || 
>>>>>>>>>>> !mapping->bo_va->base.bo)
>>>>>>>>>>> +        return -EINVAL;
>>>>>>>>>>> +
>>>>>>>>>>> +    bo = mapping->bo_va->base.bo;
>>>>>>>>>>> +    if (!(bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED))
>>>>>>>>>>> +        return 0;
>>>>>>>>>>> +
>>>>>>>>>>> +    amdgpu_bo_placement_from_domain(bo, 
>>>>>>>>>>> AMDGPU_GEM_DOMAIN_VRAM);
>>>>>>>>>>> +    r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
>>>>>>>>>>> +    if (r) {
>>>>>>>>>>> +        DRM_ERROR("Failed validating the VCN
>>>>>>>>>>> message BO (%d)!\n", r);
>>>>>>>>>>> +        return r;
>>>>>>>>>>> +    }
>>>>>>>>>> Well, exactly that won't work.
>>>>>>>>>>
>>>>>>>>>> The message structure isn't TMZ protected because
>>>>>>>>>> otherwise the driver won't
>>>>>>>>>> be able to stitch it together.
>>>>>>>>>>
>>>>>>>>>> What is TMZ protected are the surfaces the message
>>>>>>>>>> structure is pointing to.
>>>>>>>>>> So what you would need to do is to completely parse
>>>>>>>>>> the structure and then
>>>>>>>>>> move on the relevant buffers into VRAM.
>>>>>>>>>>
>>>>>>>>>> Leo or James, can you help with that?
>>>>>>>>>    From my observations, when decoding secure contents, register
>>>>>>>>> GPCOM_VCPU_DATA0 and GPCOM_VCPU_DATA1 are set to a TMZ
>>>>>>>>> buffer address.
>>>>>>>>> And this way works when allocating TMZ buffers in GTT domain.
>>>>>>>> As far as I remember that's only the case for the decoding,
>>>>>>>> encoding works
>>>>>>>> by putting the addresses into the message buffer.
>>>>>>>>
>>>>>>>> But could be that decoding is sufficient, Leo and James need
>>>>>>>> to comment on
>>>>>>>> this.
>>>>>>> It seems that only decode needs TMZ buffers. Only observe
>>>>>>> si_vid_create_tmz_buffer()
>>>>>>> was called in rvcn_dec_message_decode() in
>>>>>>> src/gallium/drivers/radeon/radeon_vcn_dec.c.
>>>>>> Mhm, good point. Let's wait for Leo and James to wake up, when we
>>>>>> don't need encode support than that would makes things much easier.
>>>>> For secure playback, the buffer required in TMZ are dpb, dt and 
>>>>> ctx, for
>>>>> the rest esp. those for CPU access don't need that E.g. msg 
>>>>> buffer, and
>>>>> bitstream buffer.
>>>>>
>>>>>  From radeon_vcn_dec.c, you can see the buffer for dpb and ctx, 
>>>>> and dt
>>>>> buffer frontend/va/surface is set to PIPE_BIND_PROTECTED.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Leo
>>>>>
>>>> For VCN1, due to performance reason, the msg and fb buffer was 
>>>> allocated
>>>> into VRAM instead of GTT(for other HW), but those are not TMZ in 
>>>> order to
>>>> have CPU access.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Leo
>>>>
>>>>
>>>>
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> Regards,
>>>>>>> Lang
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Lang
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>> +
>>>>>>>>>>> +    return r;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static int
>>>>>>>>>>> vcn_v1_0_ring_patch_cs_in_place(struct
>>>>>>>>>>> amdgpu_cs_parser *p,
>>>>>>>>>>> +                       struct amdgpu_job *job,
>>>>>>>>>>> +                       struct amdgpu_ib *ib)
>>>>>>>>>>> +{
>>>>>>>>>>> +    uint32_t msg_lo = 0, msg_hi = 0;
>>>>>>>>>>> +    int i, r;
>>>>>>>>>>> +
>>>>>>>>>>> +    for (i = 0; i < ib->length_dw; i += 2) {
>>>>>>>>>>> +        uint32_t reg = amdgpu_ib_get_value(ib, i);
>>>>>>>>>>> +        uint32_t val = amdgpu_ib_get_value(ib, i + 1);
>>>>>>>>>>> +
>>>>>>>>>>> +        if (reg == PACKET0(p->adev->vcn.internal.data0, 0)) {
>>>>>>>>>>> +            msg_lo = val;
>>>>>>>>>>> +        } else if (reg ==
>>>>>>>>>>> PACKET0(p->adev->vcn.internal.data1, 0)) {
>>>>>>>>>>> +            msg_hi = val;
>>>>>>>>>>> +        } else if (reg == 
>>>>>>>>>>> PACKET0(p->adev->vcn.internal.cmd, 0)) {
>>>>>>>>>>> +            r = vcn_v1_0_validate_bo(p, job,
>>>>>>>>>>> +                         ((u64)msg_hi) << 32 | msg_lo);
>>>>>>>>>>> +            if (r)
>>>>>>>>>>> +                return r;
>>>>>>>>>>> +        }
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>>      static const struct amdgpu_ring_funcs
>>>>>>>>>>> vcn_v1_0_dec_ring_vm_funcs = {
>>>>>>>>>>>          .type = AMDGPU_RING_TYPE_VCN_DEC,
>>>>>>>>>>>          .align_mask = 0xf,
>>>>>>>>>>> @@ -1914,6 +1981,7 @@ static const struct
>>>>>>>>>>> amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
>>>>>>>>>>>          .get_rptr = vcn_v1_0_dec_ring_get_rptr,
>>>>>>>>>>>          .get_wptr = vcn_v1_0_dec_ring_get_wptr,
>>>>>>>>>>>          .set_wptr = vcn_v1_0_dec_ring_set_wptr,
>>>>>>>>>>> +    .patch_cs_in_place = vcn_v1_0_ring_patch_cs_in_place,
>>>>>>>>>>>          .emit_frame_size =
>>>>>>>>>>>              6 + 6 + /* hdp invalidate / flush */
>>>>>>>>>>>              SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 +
>>