[PATCH v2] drm/amdgpu: add workarounds for Raven VCN TMZ issue

Fri Mar 11 02:32:17 UTC 2022

On 03/10/ , Christian König wrote:
> Ok, thanks.
> 
> Lang is that case your patch should work fine.
> 
> Just add another patch with a check for the encode case to reject any CS
> with TMZ buffers in it.

Only VCN decode ring is cared in this patch. For encode ring
(AMDGPU_HW_IP_VCN_ENC and AMDGPU_HW_IP_VCN_JPEG), is it fine 
we just reject secure IBs in amdgpu_ib_schedule like compute ring?

Regards,
Lang

> Thanks,
> Christian.
> 
> Am 10.03.22 um 15:17 schrieb Leo Liu:
> > No need for encode. Encrypting uses TEE/TA to convert clear bitstream to
> > encrypted bitstream, and has nothing to do with VCN encode and tmz.
> > 
> > Regards,
> > 
> > Leo
> > 
> > 
> > On 2022-03-10 04:53, Christian König wrote:
> > > Leo you didn't answered the question if we need TMZ for encode as well.
> > > 
> > > Regards,
> > > Christian.
> > > 
> > > Am 10.03.22 um 09:45 schrieb Lang Yu:
> > > > Ping.
> > > > 
> > > > On 03/08/ , Leo Liu wrote:
> > > > > On 2022-03-08 11:18, Leo Liu wrote:
> > > > > > On 2022-03-08 04:16, Christian König wrote:
> > > > > > > Am 08.03.22 um 09:06 schrieb Lang Yu:
> > > > > > > > On 03/08/ , Christian König wrote:
> > > > > > > > > Am 08.03.22 um 08:33 schrieb Lang Yu:
> > > > > > > > > > On 03/08/ , Christian König wrote:
> > > > > > > > > > > Am 08.03.22 um 04:39 schrieb Lang Yu:
> > > > > > > > > > > > It is a hardware issue that VCN can't handle a GTT
> > > > > > > > > > > > backing stored TMZ buffer on Raven.
> > > > > > > > > > > > 
> > > > > > > > > > > > Move such a TMZ buffer to VRAM domain before command
> > > > > > > > > > > > submission.
> > > > > > > > > > > > 
> > > > > > > > > > > > v2:
> > > > > > > > > > > >      - Use patch_cs_in_place callback.
> > > > > > > > > > > > 
> > > > > > > > > > > > Suggested-by: Christian König <christian.koenig at amd.com>
> > > > > > > > > > > > Signed-off-by: Lang Yu <Lang.Yu at amd.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c | 68
> > > > > > > > > > > > +++++++++++++++++++++++++++
> > > > > > > > > > > >      1 file changed, 68 insertions(+)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git
> > > > > > > > > > > > a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > > > > > > > > > b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > > > > > > > > > index 7bbb9ba6b80b..810932abd3af 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
> > > > > > > > > > > > @@ -24,6 +24,7 @@
> > > > > > > > > > > >      #include <linux/firmware.h>
> > > > > > > > > > > >      #include "amdgpu.h"
> > > > > > > > > > > > +#include "amdgpu_cs.h"
> > > > > > > > > > > >      #include "amdgpu_vcn.h"
> > > > > > > > > > > >      #include "amdgpu_pm.h"
> > > > > > > > > > > >      #include "soc15.h"
> > > > > > > > > > > > @@ -1905,6 +1906,72 @@ static const struct
> > > > > > > > > > > > amd_ip_funcs vcn_v1_0_ip_funcs = {
> > > > > > > > > > > >          .set_powergating_state
> > > > > > > > > > > > =
> > > > > > > > > > > > vcn_v1_0_set_powergating_state,
> > > > > > > > > > > >      };
> > > > > > > > > > > > +/**
> > > > > > > > > > > > + * It is a hardware issue that Raven VCN can't
> > > > > > > > > > > > handle a GTT TMZ buffer.
> > > > > > > > > > > > + * Move such a GTT TMZ buffer to VRAM domain
> > > > > > > > > > > > before command submission.
> > > > > > > > > > > > + */
> > > > > > > > > > > > +static int
> > > > > > > > > > > > vcn_v1_0_validate_bo(struct
> > > > > > > > > > > > amdgpu_cs_parser *parser,
> > > > > > > > > > > > +                struct amdgpu_job *job,
> > > > > > > > > > > > +                uint64_t addr)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +    struct ttm_operation_ctx ctx = { false, false };
> > > > > > > > > > > > +    struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
> > > > > > > > > > > > +    struct amdgpu_vm *vm = &fpriv->vm;
> > > > > > > > > > > > +    struct amdgpu_bo_va_mapping *mapping;
> > > > > > > > > > > > +    struct amdgpu_bo *bo;
> > > > > > > > > > > > +    int r;
> > > > > > > > > > > > +
> > > > > > > > > > > > +    addr &= AMDGPU_GMC_HOLE_MASK;
> > > > > > > > > > > > +    if (addr & 0x7) {
> > > > > > > > > > > > +        DRM_ERROR("VCN messages must be 8 byte aligned!\n");
> > > > > > > > > > > > +        return -EINVAL;
> > > > > > > > > > > > +    }
> > > > > > > > > > > > +
> > > > > > > > > > > > +    mapping = amdgpu_vm_bo_lookup_mapping(vm,
> > > > > > > > > > > > addr/AMDGPU_GPU_PAGE_SIZE);
> > > > > > > > > > > > +    if (!mapping ||
> > > > > > > > > > > > !mapping->bo_va ||
> > > > > > > > > > > > !mapping->bo_va->base.bo)
> > > > > > > > > > > > +        return -EINVAL;
> > > > > > > > > > > > +
> > > > > > > > > > > > +    bo = mapping->bo_va->base.bo;
> > > > > > > > > > > > +    if (!(bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED))
> > > > > > > > > > > > +        return 0;
> > > > > > > > > > > > +
> > > > > > > > > > > > +   
> > > > > > > > > > > > amdgpu_bo_placement_from_domain(bo,
> > > > > > > > > > > > AMDGPU_GEM_DOMAIN_VRAM);
> > > > > > > > > > > > +    r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
> > > > > > > > > > > > +    if (r) {
> > > > > > > > > > > > +        DRM_ERROR("Failed validating the VCN
> > > > > > > > > > > > message BO (%d)!\n", r);
> > > > > > > > > > > > +        return r;
> > > > > > > > > > > > +    }
> > > > > > > > > > > Well, exactly that won't work.
> > > > > > > > > > > 
> > > > > > > > > > > The message structure isn't TMZ protected because
> > > > > > > > > > > otherwise the driver won't
> > > > > > > > > > > be able to stitch it together.
> > > > > > > > > > > 
> > > > > > > > > > > What is TMZ protected are the surfaces the message
> > > > > > > > > > > structure is pointing to.
> > > > > > > > > > > So what you would need to do is to completely parse
> > > > > > > > > > > the structure and then
> > > > > > > > > > > move on the relevant buffers into VRAM.
> > > > > > > > > > > 
> > > > > > > > > > > Leo or James, can you help with that?
> > > > > > > > > >    From my observations, when decoding secure contents, register
> > > > > > > > > > GPCOM_VCPU_DATA0 and GPCOM_VCPU_DATA1 are set to a TMZ
> > > > > > > > > > buffer address.
> > > > > > > > > > And this way works when allocating TMZ buffers in GTT domain.
> > > > > > > > > As far as I remember that's only the case for the decoding,
> > > > > > > > > encoding works
> > > > > > > > > by putting the addresses into the message buffer.
> > > > > > > > > 
> > > > > > > > > But could be that decoding is sufficient, Leo and James need
> > > > > > > > > to comment on
> > > > > > > > > this.
> > > > > > > > It seems that only decode needs TMZ buffers. Only observe
> > > > > > > > si_vid_create_tmz_buffer()
> > > > > > > > was called in rvcn_dec_message_decode() in
> > > > > > > > src/gallium/drivers/radeon/radeon_vcn_dec.c.
> > > > > > > Mhm, good point. Let's wait for Leo and James to wake up, when we
> > > > > > > don't need encode support than that would makes things much easier.
> > > > > > For secure playback, the buffer required in TMZ are dpb,
> > > > > > dt and ctx, for
> > > > > > the rest esp. those for CPU access don't need that E.g.
> > > > > > msg buffer, and
> > > > > > bitstream buffer.
> > > > > > 
> > > > > >  From radeon_vcn_dec.c, you can see the buffer for dpb
> > > > > > and ctx, and dt
> > > > > > buffer frontend/va/surface is set to PIPE_BIND_PROTECTED.
> > > > > > 
> > > > > > 
> > > > > > Regards,
> > > > > > 
> > > > > > Leo
> > > > > > 
> > > > > For VCN1, due to performance reason, the msg and fb buffer
> > > > > was allocated
> > > > > into VRAM instead of GTT(for other HW), but those are not
> > > > > TMZ in order to
> > > > > have CPU access.
> > > > > 
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Leo
> > > > > 
> > > > > 
> > > > > 
> > > > > > 
> > > > > > > Regards,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Lang
> > > > > > > > 
> > > > > > > > > Regards,
> > > > > > > > > Christian.
> > > > > > > > > 
> > > > > > > > > > Regards,
> > > > > > > > > > Lang
> > > > > > > > > > 
> > > > > > > > > > > Regards,
> > > > > > > > > > > Christian.
> > > > > > > > > > > 
> > > > > > > > > > > > +
> > > > > > > > > > > > +    return r;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > +static int
> > > > > > > > > > > > vcn_v1_0_ring_patch_cs_in_place(struct
> > > > > > > > > > > > amdgpu_cs_parser *p,
> > > > > > > > > > > > +                       struct amdgpu_job *job,
> > > > > > > > > > > > +                       struct amdgpu_ib *ib)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +    uint32_t msg_lo = 0, msg_hi = 0;
> > > > > > > > > > > > +    int i, r;
> > > > > > > > > > > > +
> > > > > > > > > > > > +    for (i = 0; i < ib->length_dw; i += 2) {
> > > > > > > > > > > > +        uint32_t reg = amdgpu_ib_get_value(ib, i);
> > > > > > > > > > > > +        uint32_t val = amdgpu_ib_get_value(ib, i + 1);
> > > > > > > > > > > > +
> > > > > > > > > > > > +        if (reg == PACKET0(p->adev->vcn.internal.data0, 0)) {
> > > > > > > > > > > > +            msg_lo = val;
> > > > > > > > > > > > +        } else if (reg ==
> > > > > > > > > > > > PACKET0(p->adev->vcn.internal.data1, 0)) {
> > > > > > > > > > > > +            msg_hi = val;
> > > > > > > > > > > > +        } else if (reg ==
> > > > > > > > > > > > PACKET0(p->adev->vcn.internal.cmd,
> > > > > > > > > > > > 0)) {
> > > > > > > > > > > > +            r = vcn_v1_0_validate_bo(p, job,
> > > > > > > > > > > > +                         ((u64)msg_hi) << 32 | msg_lo);
> > > > > > > > > > > > +            if (r)
> > > > > > > > > > > > +                return r;
> > > > > > > > > > > > +        }
> > > > > > > > > > > > +    }
> > > > > > > > > > > > +
> > > > > > > > > > > > +    return 0;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > +
> > > > > > > > > > > >      static const struct amdgpu_ring_funcs
> > > > > > > > > > > > vcn_v1_0_dec_ring_vm_funcs = {
> > > > > > > > > > > >          .type = AMDGPU_RING_TYPE_VCN_DEC,
> > > > > > > > > > > >          .align_mask = 0xf,
> > > > > > > > > > > > @@ -1914,6 +1981,7 @@ static const struct
> > > > > > > > > > > > amdgpu_ring_funcs vcn_v1_0_dec_ring_vm_funcs = {
> > > > > > > > > > > >          .get_rptr = vcn_v1_0_dec_ring_get_rptr,
> > > > > > > > > > > >          .get_wptr = vcn_v1_0_dec_ring_get_wptr,
> > > > > > > > > > > >          .set_wptr = vcn_v1_0_dec_ring_set_wptr,
> > > > > > > > > > > > +    .patch_cs_in_place = vcn_v1_0_ring_patch_cs_in_place,
> > > > > > > > > > > >          .emit_frame_size =
> > > > > > > > > > > >              6 + 6 + /* hdp invalidate / flush */
> > > > > > > > > > > >              SOC15_FLUSH_GPU_TLB_NUM_WREG * 6 +
> > > 
>