[PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

Felix Kuehling felix.kuehling at amd.com
Wed Aug 28 22:43:39 UTC 2024


On 2024-08-28 17:38, Chen, Xiaogang wrote:
>
>
> On 8/28/2024 4:05 PM, Felix Kuehling wrote:
>>
>> On 2024-08-28 16:34, Chen, Xiaogang wrote:
>>>
>>>
>>> On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote:
>>>>
>>>> Responses inline
>>>>
>>>> Regards,
>>>>
>>>> Ramesh
>>>>
>>>> *From:*Chen, Xiaogang <Xiaogang.Chen at amd.com>
>>>> *Sent:* Wednesday, August 28, 2024 3:01 PM
>>>> *To:* Errabolu, Ramesh <Ramesh.Errabolu at amd.com>; 
>>>> amd-gfx at lists.freedesktop.org
>>>> *Subject:* Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW 
>>>> module parameter
>>>>
>>>> On 8/28/2024 2:52 PM, Errabolu, Ramesh wrote:
>>>>
>>>>     Response inline
>>>>
>>>>     Regards,
>>>>
>>>>     Ramesh
>>>>
>>>>
>>>>     -----Original Message-----
>>>>
>>>>     From: Chen, Xiaogang<Xiaogang.Chen at amd.com> 
>>>> <mailto:Xiaogang.Chen at amd.com>
>>>>     Sent: Wednesday, August 28, 2024 2:43 PM
>>>>
>>>>     To: Errabolu, Ramesh<Ramesh.Errabolu at amd.com> 
>>>> <mailto:Ramesh.Errabolu at amd.com>;amd-gfx at lists.freedesktop.org
>>>>
>>>>     Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW 
>>>> module parameter
>>>>
>>>>     Why need this driver parameter? kfd has 
>>>> KFD_IOCTL_SVM_ATTR_GRANULARITY api that allows user space to set 
>>>> migration granularity per prange. If both got set which will take 
>>>> precedence?
>>>>
>>>>     Ramesh: Use of Kfd Ioctl is available to users of registered 
>>>> memory. It allows users to control GOBM per buffer level, including 
>>>> overwriting default value. For ranges that do not specify GOBM, the 
>>>> default value will be found.
>>>>
>>>> If user space use KFD_IOCTL_SVM_ATTR_GRANULARITY it will overwrite 
>>>> this parameter value for a prange, then how to know which 
>>>> granularity take effect? That is decided when user set this 
>>>> parameter and when the api got used.
>>>>
>>>> Ramesh: The value bound by Kfd ioctl will take effect. In the life 
>>>> cycle of a prange it can go from the default value to one that is 
>>>> set by user via set_attr() call. However, it is generally 
>>>> understood that that users of set_attr() will not call it directly 
>>>> i.e. the rely on higher level apis from ROCr or HIP.
>>>>
>>> driver parameter can be set at run time, not only at boot time. It 
>>> is not predictable  when user set this driver parameter and when the 
>>> api got called.
>>>
>> I don't think this is a problem. The module parameter determines the 
>> granularity if the application doesn't set the virtual address range 
>> attribute. The default is captured in the per-process svms structure. 
>> So all mappings of the same process will use the same default, even 
>> if the module parameter is changed after the process is started. The 
>> get_attr ioctl will always return the actual granularity, no matter 
>> whether it comes from the default or was overridden by user mode for 
>> the virtual address range.
>
> My concern is there are two ways to set pragne's granularity, both can 
> be used at run time. It can make confusion to know which one take 
> effect as user can use driver parameter and api to change granularity 
> with any timing.
>
But it's not "any timing". For the module parameter to take effect, it 
has to be set _before_ the process starts. Any changes to the module 
parameter after the process starts will not take effect.

Regards,
   Felix


> Regards
>
> Xiaogang
>
>>
>> Regards,
>>   Felix
>>
>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>>> Regards
>>>>
>>>> Xiaogang
>>>>
>>>>     Regards
>>>>
>>>>     Xiaogang
>>>>
>>>>     On 8/26/2024 2:34 PM, Ramesh Errabolu wrote:
>>>>
>>>>         Caution: This message originated from an External Source. 
>>>> Use proper caution when opening attachments, clicking links, or 
>>>> responding.
>>>>
>>>>         Enables users to update the default size of buffer used in 
>>>> migration
>>>>
>>>>         either from Sysmem to VRAM or vice versa.
>>>>
>>>>         The param GOBM refers to granularity of buffer migration, 
>>>> and is
>>>>
>>>>         specified in terms of log(numPages(buffer)). It facilitates 
>>>> users of
>>>>
>>>>         unregistered memory to control GOBM, albeit at a coarse level
>>>>
>>>>         Signed-off-by: Ramesh Errabolu<Ramesh.Errabolu at amd.com> 
>>>> <mailto:Ramesh.Errabolu at amd.com>
>>>>
>>>>         ---
>>>>
>>>>            drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  4 ++++
>>>>
>>>>            drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18 
>>>> +++++++++++++++++
>>>>
>>>>            drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 12 ++++++++++++
>>>>
>>>>            drivers/gpu/drm/amd/amdkfd/kfd_svm.c    | 26 
>>>> ++++++++++++++++---------
>>>>
>>>>            4 files changed, 51 insertions(+), 9 deletions(-)
>>>>
>>>>         diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>>         b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>>         index e8c284aea1f2..73dd816b01f2 100644
>>>>
>>>>         --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>>         +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>>         @@ -237,6 +237,7 @@ extern int sched_policy;
>>>>
>>>>            extern bool debug_evictions;
>>>>
>>>>            extern bool no_system_mem_limit;
>>>>
>>>>            extern int halt_if_hws_hang;
>>>>
>>>>         +extern uint amdgpu_svm_attr_gobm;
>>>>
>>>>            #else
>>>>
>>>>            static const int __maybe_unused sched_policy = 
>>>> KFD_SCHED_POLICY_HWS;
>>>>
>>>>            static const bool __maybe_unused debug_evictions; /* = 
>>>> false */ @@
>>>>
>>>>         -313,6 +314,9 @@ extern int amdgpu_wbrf;
>>>>
>>>>            /* Extra time delay(in ms) to eliminate the influence of 
>>>> temperature momentary fluctuation */
>>>>
>>>>            #define AMDGPU_SWCTF_EXTRA_DELAY               50
>>>>
>>>>         +/* Default size of buffer to use in migrating buffer */
>>>>
>>>>         +#define AMDGPU_SVM_ATTR_GOBM       9
>>>>
>>>>         +
>>>>
>>>>            struct amdgpu_xcp_mgr;
>>>>
>>>>            struct amdgpu_device;
>>>>
>>>>            struct amdgpu_irq_src;
>>>>
>>>>         diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>>         b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>>         index b9529948f2b2..09c501753a3b 100644
>>>>
>>>>         --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>>         +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>>         @@ -169,6 +169,17 @@ uint amdgpu_sdma_phase_quantum = 32;
>>>>
>>>>            char *amdgpu_disable_cu;
>>>>
>>>>            char *amdgpu_virtual_display;
>>>>
>>>>            bool enforce_isolation;
>>>>
>>>>         +
>>>>
>>>>         +/* Specifies the default size of buffer to use in
>>>>
>>>>         + * migrating buffer from Sysmem to VRAM and vice
>>>>
>>>>         + * versa
>>>>
>>>>         + *
>>>>
>>>>         + * GOBM - Granularity of Buffer Migration
>>>>
>>>>         + *
>>>>
>>>>         + * Defined as log2(sizeof(buffer)/PAGE_SIZE)  */ uint
>>>>
>>>>         +amdgpu_svm_attr_gobm = AMDGPU_SVM_ATTR_GOBM;
>>>>
>>>>         +
>>>>
>>>>            /*
>>>>
>>>>             * OverDrive(bit 14) disabled by default
>>>>
>>>>             * GFX DCS(bit 19) disabled by default @@ -320,6 +331,13 @@
>>>>
>>>>         module_param_named(pcie_gen2, amdgpu_pcie_gen2, int, 0444);
>>>>
>>>>            MODULE_PARM_DESC(msi, "MSI support (1 = enable, 0 = 
>>>> disable, -1 = auto)");
>>>>
>>>>            module_param_named(msi, amdgpu_msi, int, 0444);
>>>>
>>>>         +/**
>>>>
>>>>         + * DOC: svm_attr_gobm (uint)
>>>>
>>>>         + * Size of buffer to use in migrating buffer from Sysmem 
>>>> to VRAM and
>>>>
>>>>         +vice versa  */ MODULE_PARM_DESC(svm_attr_gobm, "Defined as
>>>>
>>>>         +log2(sizeof(buffer)/PAGE_SIZE), e.g. 9 for 2 MiB");
>>>>
>>>>         +module_param_named(svm_attr_gobm, amdgpu_svm_attr_gobm, 
>>>> uint, 0644);
>>>>
>>>>         +
>>>>
>>>>            /**
>>>>
>>>>             * DOC: lockup_timeout (string)
>>>>
>>>>             * Set GPU scheduler timeout value in ms.
>>>>
>>>>         diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>>         b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>>         index 9ae9abc6eb43..c2e54b18c167 100644
>>>>
>>>>         --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>>         +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>>         @@ -868,6 +868,18 @@ struct svm_range_list {
>>>>
>>>>                   struct task_struct *faulting_task;
>>>>
>>>>                   /* check point ts decides if page fault recovery 
>>>> need be dropped */
>>>>
>>>>                   uint64_t checkpoint_ts[MAX_GPU_INSTANCE];
>>>>
>>>>         +
>>>>
>>>>         +       /* Indicates the default size to use in migrating
>>>>
>>>>         +        * buffers of a process from Sysmem to VRAM and vice
>>>>
>>>>         +        * versa. The max legal value cannot be greater than
>>>>
>>>>         +        * 0x3F
>>>>
>>>>         +        *
>>>>
>>>>         +        * @note: A side effect of this symbol being part of
>>>>
>>>>         +        * struct svm_range_list is that it forces all buffers
>>>>
>>>>         +        * of the process of unregistered kind to use the same
>>>>
>>>>         +        * size in buffer migration
>>>>
>>>>         +        */
>>>>
>>>>         +       uint8_t attr_gobm;
>>>>
>>>>            };
>>>>
>>>>            /* Process data */
>>>>
>>>>         diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>>         b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>>         index b44dec90969f..78c78baddb1f 100644
>>>>
>>>>         --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>>         +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>>         @@ -309,12 +309,11 @@ static void svm_range_free(struct 
>>>> svm_range *prange, bool do_unmap)
>>>>
>>>>            }
>>>>
>>>>            static void
>>>>
>>>>         -svm_range_set_default_attributes(int32_t *location, 
>>>> int32_t *prefetch_loc,
>>>>
>>>>         -                                uint8_t *granularity, 
>>>> uint32_t *flags)
>>>>
>>>>         +svm_range_set_default_attributes(int32_t *location,
>>>>
>>>>         +                       int32_t *prefetch_loc, uint32_t 
>>>> *flags)
>>>>
>>>>            {
>>>>
>>>>                   *location = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
>>>>
>>>>                   *prefetch_loc = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
>>>>
>>>>         -       *granularity = 9;
>>>>
>>>>                   *flags =
>>>>
>>>>                           KFD_IOCTL_SVM_FLAG_HOST_ACCESS | 
>>>> KFD_IOCTL_SVM_FLAG_COHERENT;
>>>>
>>>>            }
>>>>
>>>>         @@ -358,9 +357,9 @@ svm_range *svm_range_new(struct 
>>>> svm_range_list *svms, uint64_t start,
>>>>
>>>> bitmap_copy(prange->bitmap_access, svms->bitmap_supported,
>>>>
>>>>                                       MAX_GPU_INSTANCE);
>>>>
>>>>         +       prange->granularity = svms->attr_gobm;
>>>>
>>>> svm_range_set_default_attributes(&prange->preferred_loc,
>>>>
>>>>         - &prange->prefetch_loc,
>>>>
>>>>         - &prange->granularity, &prange->flags);
>>>>
>>>>         + &prange->prefetch_loc,
>>>>
>>>>         + &prange->flags);
>>>>
>>>>                   pr_debug("svms 0x%p [0x%llx 0x%llx]\n", svms, 
>>>> start, last);
>>>>
>>>>         @@ -2693,10 +2692,12 @@ svm_range_get_range_boundaries(struct
>>>>
>>>>         kfd_process *p, int64_t addr,
>>>>
>>>>                   *is_heap_stack = vma_is_initial_heap(vma) ||
>>>>
>>>>         vma_is_initial_stack(vma);
>>>>
>>>>         +       /* Determine the starting and ending page of prange */
>>>>
>>>>                   start_limit = max(vma->vm_start >> PAGE_SHIFT,
>>>>
>>>>         -                     (unsigned long)ALIGN_DOWN(addr, 2UL 
>>>> << 8));
>>>>
>>>>         +                     (unsigned long)ALIGN_DOWN(addr, 1 <<
>>>>
>>>>         + p->svms.attr_gobm));
>>>>
>>>>                   end_limit = min(vma->vm_end >> PAGE_SHIFT,
>>>>
>>>>         -                   (unsigned long)ALIGN(addr + 1, 2UL << 8));
>>>>
>>>>         +                   (unsigned long)ALIGN(addr + 1, 1 <<
>>>>
>>>>         + p->svms.attr_gobm));
>>>>
>>>>         +
>>>>
>>>>                   /* First range that starts after the fault 
>>>> address */
>>>>
>>>>                   node = interval_tree_iter_first(&p->svms.objects, 
>>>> addr + 1, ULONG_MAX);
>>>>
>>>>                   if (node) {
>>>>
>>>>         @@ -3240,6 +3241,12 @@ int svm_range_list_init(struct 
>>>> kfd_process *p)
>>>>
>>>>                           if 
>>>> (KFD_IS_SVM_API_SUPPORTED(p->pdds[i]->dev->adev))
>>>>
>>>> bitmap_set(svms->bitmap_supported, i, 1);
>>>>
>>>>         +       /* Bind granularity of buffer migration, either
>>>>
>>>>         +        * the default size or one specified by the user
>>>>
>>>>         +        */
>>>>
>>>>         +       svms->attr_gobm = min_t(u8, amdgpu_svm_attr_gobm, 
>>>> 0x3F);
>>>>
>>>>         +       pr_debug("Granularity Of Buffer Migration: %d\n",
>>>>
>>>>         + svms->attr_gobm);
>>>>
>>>>         +
>>>>
>>>>                   return 0;
>>>>
>>>>            }
>>>>
>>>>         @@ -3767,8 +3774,9 @@ svm_range_get_attr(struct kfd_process 
>>>> *p, struct mm_struct *mm,
>>>>
>>>>                   node = interval_tree_iter_first(&svms->objects, 
>>>> start, last);
>>>>
>>>>                   if (!node) {
>>>>
>>>>                           pr_debug("range attrs not found return 
>>>> default values\n");
>>>>
>>>>         - svm_range_set_default_attributes(&location, &prefetch_loc,
>>>>
>>>>         - &granularity, &flags_and);
>>>>
>>>>         +               granularity = svms->attr_gobm;
>>>>
>>>>         + svm_range_set_default_attributes(&location,
>>>>
>>>>         +     &prefetch_loc, &flags_and);
>>>>
>>>>                           flags_or = flags_and;
>>>>
>>>>                           if (p->xnack_enabled)
>>>>
>>>>                                   bitmap_copy(bitmap_access,
>>>>
>>>>         svms->bitmap_supported,
>>>>
>>>>         --
>>>>
>>>>         2.34.1
>>>>


More information about the amd-gfx mailing list