[PATCH v2 2/4] drm/amdkfd: add kfd_device_info_init function

Felix Kuehling felix.kuehling at amd.com
Mon Nov 22 17:37:03 UTC 2021


Am 2021-11-22 um 10:25 a.m. schrieb Sider, Graham:
> [AMD Official Use Only]
>
>> -----Original Message-----
>> From: Kuehling, Felix <Felix.Kuehling at amd.com>
>> Sent: Friday, November 19, 2021 4:20 PM
>> To: Sider, Graham <Graham.Sider at amd.com>; amd-
>> gfx at lists.freedesktop.org
>> Cc: Deucher, Alexander <Alexander.Deucher at amd.com>
>> Subject: Re: [PATCH v2 2/4] drm/amdkfd: add kfd_device_info_init function
>>
>> On 2021-11-19 2:52 p.m., Graham Sider wrote:
>>> Initializes device_info structs given either asic_type (enum) if GFX
>>> version is less than GFX9, or GC IP version if greater. Also takes in
>>> vf and the target compiler gfx version.
>>>
>>> Inclusion/exclusion to certain conditions for certain GC IP versions
>>> may be necessary on npi bringup on a case-by-case basis, but for the
>>> most part should be minimal (e.g. adding one || asic_version ==
>> IP_VERSION(X ,X, X) case).
>>> Signed-off-by: Graham Sider <Graham.Sider at amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c | 61
>> +++++++++++++++++++++++++
>>>   1 file changed, 61 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index e11fc4e20c32..676cb9c3166c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -511,6 +511,67 @@ static void kfd_gtt_sa_fini(struct kfd_dev *kfd);
>>>
>>>   static int kfd_resume(struct kfd_dev *kfd);
>>>
>>> +static void kfd_device_info_init(struct kfd_dev *kfd,
>>> +				 struct kfd_device_info *device_info,
>>> +				 bool vf, uint32_t gfx_target_version)
>> This will give you a compile warning about an unused static function.
>> Maybe squash this with the commit that actually starts using this function.
>>
> Sounds good.
>
>>> +{
>>> +	uint32_t gc_version = KFD_GC_VERSION(kfd);
>>> +	uint32_t asic_type = kfd->adev->asic_type;
>>> +
>>> +	device_info->max_pasid_bits = 16;
>>> +	device_info->max_no_of_hqd = 24;
>>> +	device_info->num_of_watch_points = 4;
>>> +	device_info->mqd_size_aligned = MQD_SIZE_ALIGNED;
>>> +	device_info->gfx_target_version = gfx_target_version;
>>> +
>>> +	if (KFD_IS_SOC15(kfd)) {
>>> +		device_info->doorbell_size = 8;
>>> +		device_info->ih_ring_entry_size = 8 * sizeof(uint32_t);
>>> +		device_info->event_interrupt_class =
>> &event_interrupt_class_v9;
>>> +		device_info->supports_cwsr = true;
>>> +
>>> +		if ((gc_version >= IP_VERSION(9, 0, 1)  &&
>>> +		     gc_version <= IP_VERSION(9, 3, 0)) ||
>>> +		     gc_version == IP_VERSION(10, 3, 1) ||
>>> +		     gc_version == IP_VERSION(10, 3, 3))
>>> +			device_info->num_sdma_queues_per_engine = 2;
>>> +		else
>>> +			device_info->num_sdma_queues_per_engine = 8;
>> I feel this should be based on the SDMA IP version, not the GC IP version.
>>
> Can the SDMA queues/engine be determined by the SDMA IP versions? I would have thought those were instead done on a chip-by-chip basis. E.g. in amdgpu_discovery.c this is how the number of SDMA instances is defined.
>
>>> +
>>> +		/* Navi2x+, Navi1x+ */
>>> +		if (gc_version >= IP_VERSION(10, 3, 0))
>> There needs to be a maximum check here. This case should not automatically
>> apply to future ASICs e.g. GFX11.
>>
> Just a thought on this: assuming on future asics this field is going to continue to be populated, might it be better to just continue adding cases here as they arise? Adding a check for e.g. < GFX11, would require eventually bumping that check to < GFX12 alongside another check for >= GFX11. So at the end of the day, if a >= check is going to be needed anyway, is a maximum check necessary? Of course this wouldn't apply to below regarding the needs_pci_atomics bool, since as you mention on future asics it can be kept as defaulted to false.

The reason we had a firmware version check here is, because on these
ASICs older firmware depended on PCIe atomics, and at some version it
stopped depending on them.

On future ASICs I would expect all firmware versions to work without
PCIe atomics. So device_info->needs_pci_atomics would be set to "false"
for newer ASICs by default and you would not need a firmware version
check for them.

If we do need more firmware version checks for future generations, the
firmware versions will be different from current generations.  So you
would need to add new if-cases for those anyway. Firmware version 145
will be meaningless or plain wrong on GFX11 for instance.

Regards,
  Felix


>
>>> +			device_info->no_atomic_fw_version = 145;
>>> +		else if (gc_version >= IP_VERSION(10, 1, 1))
>>> +			device_info->no_atomic_fw_version = 92;
>>> +
>>> +		/* Raven */
>>> +		if (gc_version == IP_VERSION(9, 1, 0) ||
>>> +		    gc_version == IP_VERSION(9, 2, 2))
>>> +			device_info->needs_iommu_device = true;
>>> +
>>> +		/* Navi1x+ */
>>> +		if (gc_version >= IP_VERSION(10, 1, 1))
>> There needs to be a maximum check here. On future ASICs (maybe GFX11) I
>> would not expect atomics to be required.
>>
> See above, agreed here.
>
>> Regards,
>>    Felix
>>
> Best,
> Graham
>
>>> +			device_info->needs_pci_atomics = true;
>>> +	} else {
>>> +		device_info->doorbell_size = 4;
>>> +		device_info->ih_ring_entry_size = 4 * sizeof(uint32_t);
>>> +		device_info->event_interrupt_class =
>> &event_interrupt_class_cik;
>>> +		device_info->num_sdma_queues_per_engine = 2;
>>> +
>>> +		if (asic_type != CHIP_KAVERI &&
>>> +		    asic_type != CHIP_HAWAII &&
>>> +		    asic_type != CHIP_TONGA)
>>> +			device_info->supports_cwsr = true;
>>> +
>>> +		if (asic_type == CHIP_KAVERI ||
>>> +		    asic_type == CHIP_CARRIZO)
>>> +			device_info->needs_iommu_device = true;
>>> +
>>> +		if (asic_type != CHIP_HAWAII && !vf)
>>> +			device_info->needs_pci_atomics = true;
>>> +	}
>>> +}
>>> +
>>>   struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
>>>   {
>>>   	struct kfd_dev *kfd;


More information about the amd-gfx mailing list