[PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

Tue Mar 5 09:42:12 UTC 2024

Am 05.03.24 um 10:33 schrieb Lazar, Lijo:
> On 3/5/2024 2:48 PM, Christian König wrote:
>> Am 05.03.24 um 10:03 schrieb Lazar, Lijo:
>>> On 3/5/2024 2:24 PM, Christian König wrote:
>>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>>>>> For VCN 4.0.3, use only the local addressing scheme while in VF
>>>>> mode. This includes addressing scheme used for HUB offsets.
>>>>>
>>>>> Signed-off-by: Lijo Lazar <lijo.lazar at amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++++++++++++++++++-
>>>>>     1 file changed, 19 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>>>>> index 7b5ad13b618e..a27f3f260aab 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>>>>> @@ -1381,6 +1381,24 @@ static uint64_t
>>>>> vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
>>>>>                         regUVD_RB_WPTR);
>>>>>     }
>>>>>     +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring
>>>>> *ring,
>>>>> +                unsigned int vmid, uint64_t pd_addr)
>>>>> +{
>>>>> +    struct amdgpu_vmhub *hub;
>>>>> +
>>>>> +    /* For VF, only local offsets should be used */
>>>>> +    if (amdgpu_sriov_vf(ring->adev))
>>>>> +        ring->vm_hub = AMDGPU_MMHUB0(0);
>>>> That is clearly a no-go since the vm_hub must be statically and can't be
>>>> changed here.
>>>>
>>> After HUB allocation, the only usage of this hub pointer is to calculate
>>> use the right offset. We still want VCN to use the right hub, only thing
>>> is register offsets in MMHUB(0) are equal to 'local offsets'.
>> The vm_hub is a static setup describing how the engine works. You
>> basically just insert an illegal value here to fix your register offset
>> calculation.
>>
>> That is absolutely *not* something you can do.
>>
>> What exactly is the requirement?
>>
> The requirement is this way -
>
> We have multiple MMHUBs and each VCN can talk only to the local MMHUB.
>
> The absolute register offsets we store for each for an example reg at
> offset x is
> 	HUB0 = x, HUB1 = x + Y (stride), HUB2 = x + 2Y and so forth.
>
> However VCN cannot use the absolute register offset in the packet in VF
> mode, instead it should use a local offset for the local HUB in VF mode.
> A local offset of a register in the HUB is exactly the same as offset of
> HUB0 = x.
>
> What we do here is assign the HUB as regular here
> 	ring->vm_hub = AMDGPU_MMHUB0(adev->vcn.inst[i].aid_id);
>
> so that amdgpu_gmc_allocate_vm_inv_eng is allocated in the right way.
>
> Then at the time of flush, switch it back to point to HUB0 so that it
> uses the offsets of HUB0 in the packet. Actually, it's not required to
> switch this every time, one-time is fine after
> amdgpu_gmc_allocate_vm_inv_eng() is complete.

Yeah, but that is not something we can do. vm_hub is a constant and 
should never be manipulated to fulfill a requirement like that.

If the VCN engines requires that we don't use the aid offsets in the 
VCN_ENC_CMD_REG_WRITE and VCN_ENC_CMD_REG_WAIT packets then we need to 
adjust those packets and not the vm_hub.

Also please sync up such with with Leo, he is responsible for 
coordinating all MM work on the kernel. In general the VCN team should 
never contact your team directly with such requirements.

Regards,
Christian.

>
> Thanks,
> Lijo
>
>> Regards,
>> Christian.
>>
>>> Thanks,
>>> Lijo
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> +    hub = &ring->adev->vmhub[ring->vm_hub];
>>>>> +
>>>>> +    pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
>>>>> +
>>>>> +    /* wait for reg writes */
>>>>> +    vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
>>>>> +                    vmid * hub->ctx_addr_distance,
>>>>> +                    lower_32_bits(pd_addr), 0xffffffff);
>>>>> +}
>>>>> +
>>>>>     static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
>>>>>     {
>>>>>         /* VCN engine access for HDP flush doesn't work when RRMT is
>>>>> enabled.
>>>>> @@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs
>>>>> vcn_v4_0_3_unified_ring_vm_funcs = {
>>>>>         .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
>>>>>         .emit_ib = vcn_v2_0_enc_ring_emit_ib,
>>>>>         .emit_fence = vcn_v2_0_enc_ring_emit_fence,
>>>>> -    .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
>>>>> +    .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
>>>>>         .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
>>>>>         .test_ring = amdgpu_vcn_enc_ring_test_ring,
>>>>>         .test_ib = amdgpu_vcn_unified_ring_test_ib,