[PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

Lazar, Lijo lijo.lazar at amd.com
Thu Sep 2 03:54:03 UTC 2021


Thanks Felix for the detailed explanation.

Thanks,
Lijo

On 9/1/2021 10:17 PM, Felix Kuehling wrote:
> Am 2021-09-01 um 12:30 p.m. schrieb Lazar, Lijo:
>>
>> [Public]
>>
>>
>> What I wanted to ask was -
>>
>> Whether user mode application relies only on link properties alone to
>> assume atomic ops are supported? If they check only link properties
>> and if the firmware doesn't work fine, should it be still marked as
>> supported?
> 
> Let's be clear what "firmware doesn't work fine" means in this context.
> It means "firmware requires PCIe atomics". If firmware requires PCIe
> atomics and the system doesn't support PCIe atomics, KFD will not use
> the GPU and will not report the GPU to user mode.
> 
> If firmware does not require PCIe atomics, or if PCIe atomics work on
> the system, KFD will use the GPU and will report the atomic capability
> to user mode in the IO link attribute.
> 
> 
>>
>> Basically, what is the purpose of exposing atomic capability in link
>> properties and whether that can be utilised by upper mode applications
>> just based on PCIe atomics support?
> 
> Applications can use PCIe atomics by using atomic shader instructions
> when accessing system memory in GPU shader code. If the system doesn't
> support PCIe atomics, these atomic operations are silently dropped.
> Therefore the application must check the atomic capability in the IO
> link properties before relying on these instructions for system memory.
> 
> Regards,
>    Felix
> 
> 
>>
>> Thanks,
>> Lijo
>> ------------------------------------------------------------------------
>> *From:* Kuehling, Felix <Felix.Kuehling at amd.com>
>> *Sent:* Wednesday, September 1, 2021 8:24:56 PM
>> *To:* Lazar, Lijo <Lijo.Lazar at amd.com>; amd-gfx at lists.freedesktop.org
>> <amd-gfx at lists.freedesktop.org>
>> *Subject:* Re: [PATCH 1/1] drm/amdkfd: make needs_pcie_atomics
>> FW-version dependent
>>   
>> Am 2021-09-01 um 7:04 a.m. schrieb Lazar, Lijo:
>>>
>>>
>>> On 9/1/2021 3:26 AM, Felix Kuehling wrote:
>>>> On some GPUs the PCIe atomic requirement for KFD depends on the MEC
>>>> firmware version. Add a firmware version check for this. The minimum
>>>> firmware version that works without atomics can be updated in the
>>>> device_info structure for each GPU type.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++++++--
>>>>    drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 +
>>>>    2 files changed, 8 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> index 16a57b70cc1a..655ee5733229 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> @@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>>        struct kfd_dev *kfd;
>>>>        const struct kfd_device_info *device_info;
>>>>        const struct kfd2kgd_calls *f2g;
>>>> +    uint32_t fw_version;
>>>>          if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void
>>>> *) * 2)
>>>>            || asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) {
>>>> @@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>>>>         * supported.
>>>>         */
>>>>        kfd->pci_atomic_requested =
>>>> amdgpu_amdkfd_have_atomics_support(kgd);
>>>
>>> Should the check be grouped inside amdgpu_amdkfd_have_atomics_support?
>>>
>>> This flag is used for setting some link properties. If there is HW
>>> support but comes with incompatible firmware, should the link be still
>>> marked as atomic?
>>
>> Our GPU HW always supports PCIe atomics (it's part of the PCIe 3 spec).
>> But some mainboards with older PCIe chipsets do not. Sometimes even
>> different ports on the same mainboard differ in their PCIe version and
>> atomic support.
>>
>> amdgpu_device_init always tries to enable atomics on the root port an
>> all the bridges leading to the GPU by calling
>> pci_enable_atomic_ops_to_root. The result is saved in
>> adev->have_atomics_support, which is returned to KFD by
>> amdgpu_amdkfd_have_atomics_support.
>>
>> The firmware change here does not affect whether atomics are
>> _supported_. It changes whether atomics are _required_ for the basic
>> operation of AQL user mode queues. The coming firmware update will
>> remove that requirement, which allows us to enable KFD for these GPUs+FW
>> on systems without PCIe atomics.
>>
>> Enabling PCIe atomics with the updated FW is still beneficial because
>> shader programs can use a subset of atomic instructions for accessing
>> system memory atomically on supported systems.
>>
>> Regards,
>>    Felix
>>
>>
>>>
>>> Thanks,
>>> Lijo
>>>
>>>> -    if (device_info->needs_pci_atomics &&
>>>> -        !kfd->pci_atomic_requested) {
>>>> +    fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1);
>>>> +    if (!kfd->pci_atomic_requested &&
>>>> +        device_info->needs_pci_atomics &&
>>>> +        (!device_info->no_atomic_fw_version ||
>>>> +          amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) <
>>>> +            device_info->no_atomic_fw_version)) {
>>>>            dev_info(kfd_device,
>>>>                 "skipped device %x:%x, PCI rejects atomics\n",
>>>>                 pdev->vendor, pdev->device);
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> index ab83b0de6b22..6d8f9bb2d905 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>> @@ -207,6 +207,7 @@ struct kfd_device_info {
>>>>        bool supports_cwsr;
>>>>        bool needs_iommu_device;
>>>>        bool needs_pci_atomics;
>>>> +    uint32_t no_atomic_fw_version;
>>>>        unsigned int num_sdma_engines;
>>>>        unsigned int num_xgmi_sdma_engines;
>>>>        unsigned int num_sdma_queues_per_engine;
>>>>


More information about the amd-gfx mailing list