[PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter
Felix Kuehling
felix.kuehling at amd.com
Wed Aug 28 22:43:39 UTC 2024
On 2024-08-28 17:38, Chen, Xiaogang wrote:
>
>
> On 8/28/2024 4:05 PM, Felix Kuehling wrote:
>>
>> On 2024-08-28 16:34, Chen, Xiaogang wrote:
>>>
>>>
>>> On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote:
>>>>
>>>> Responses inline
>>>>
>>>> Regards,
>>>>
>>>> Ramesh
>>>>
>>>> *From:*Chen, Xiaogang <Xiaogang.Chen at amd.com>
>>>> *Sent:* Wednesday, August 28, 2024 3:01 PM
>>>> *To:* Errabolu, Ramesh <Ramesh.Errabolu at amd.com>;
>>>> amd-gfx at lists.freedesktop.org
>>>> *Subject:* Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW
>>>> module parameter
>>>>
>>>> On 8/28/2024 2:52 PM, Errabolu, Ramesh wrote:
>>>>
>>>> Response inline
>>>>
>>>> Regards,
>>>>
>>>> Ramesh
>>>>
>>>>
>>>> -----Original Message-----
>>>>
>>>> From: Chen, Xiaogang<Xiaogang.Chen at amd.com>
>>>> <mailto:Xiaogang.Chen at amd.com>
>>>> Sent: Wednesday, August 28, 2024 2:43 PM
>>>>
>>>> To: Errabolu, Ramesh<Ramesh.Errabolu at amd.com>
>>>> <mailto:Ramesh.Errabolu at amd.com>;amd-gfx at lists.freedesktop.org
>>>>
>>>> Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW
>>>> module parameter
>>>>
>>>> Why need this driver parameter? kfd has
>>>> KFD_IOCTL_SVM_ATTR_GRANULARITY api that allows user space to set
>>>> migration granularity per prange. If both got set which will take
>>>> precedence?
>>>>
>>>> Ramesh: Use of Kfd Ioctl is available to users of registered
>>>> memory. It allows users to control GOBM per buffer level, including
>>>> overwriting default value. For ranges that do not specify GOBM, the
>>>> default value will be found.
>>>>
>>>> If user space use KFD_IOCTL_SVM_ATTR_GRANULARITY it will overwrite
>>>> this parameter value for a prange, then how to know which
>>>> granularity take effect? That is decided when user set this
>>>> parameter and when the api got used.
>>>>
>>>> Ramesh: The value bound by Kfd ioctl will take effect. In the life
>>>> cycle of a prange it can go from the default value to one that is
>>>> set by user via set_attr() call. However, it is generally
>>>> understood that that users of set_attr() will not call it directly
>>>> i.e. the rely on higher level apis from ROCr or HIP.
>>>>
>>> driver parameter can be set at run time, not only at boot time. It
>>> is not predictable when user set this driver parameter and when the
>>> api got called.
>>>
>> I don't think this is a problem. The module parameter determines the
>> granularity if the application doesn't set the virtual address range
>> attribute. The default is captured in the per-process svms structure.
>> So all mappings of the same process will use the same default, even
>> if the module parameter is changed after the process is started. The
>> get_attr ioctl will always return the actual granularity, no matter
>> whether it comes from the default or was overridden by user mode for
>> the virtual address range.
>
> My concern is there are two ways to set pragne's granularity, both can
> be used at run time. It can make confusion to know which one take
> effect as user can use driver parameter and api to change granularity
> with any timing.
>
But it's not "any timing". For the module parameter to take effect, it
has to be set _before_ the process starts. Any changes to the module
parameter after the process starts will not take effect.
Regards,
Felix
> Regards
>
> Xiaogang
>
>>
>> Regards,
>> Felix
>>
>>
>>> Regards
>>>
>>> Xiaogang
>>>
>>>> Regards
>>>>
>>>> Xiaogang
>>>>
>>>> Regards
>>>>
>>>> Xiaogang
>>>>
>>>> On 8/26/2024 2:34 PM, Ramesh Errabolu wrote:
>>>>
>>>> Caution: This message originated from an External Source.
>>>> Use proper caution when opening attachments, clicking links, or
>>>> responding.
>>>>
>>>> Enables users to update the default size of buffer used in
>>>> migration
>>>>
>>>> either from Sysmem to VRAM or vice versa.
>>>>
>>>> The param GOBM refers to granularity of buffer migration,
>>>> and is
>>>>
>>>> specified in terms of log(numPages(buffer)). It facilitates
>>>> users of
>>>>
>>>> unregistered memory to control GOBM, albeit at a coarse level
>>>>
>>>> Signed-off-by: Ramesh Errabolu<Ramesh.Errabolu at amd.com>
>>>> <mailto:Ramesh.Errabolu at amd.com>
>>>>
>>>> ---
>>>>
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 ++++
>>>>
>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18
>>>> +++++++++++++++++
>>>>
>>>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 12 ++++++++++++
>>>>
>>>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26
>>>> ++++++++++++++++---------
>>>>
>>>> 4 files changed, 51 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>> index e8c284aea1f2..73dd816b01f2 100644
>>>>
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>
>>>> @@ -237,6 +237,7 @@ extern int sched_policy;
>>>>
>>>> extern bool debug_evictions;
>>>>
>>>> extern bool no_system_mem_limit;
>>>>
>>>> extern int halt_if_hws_hang;
>>>>
>>>> +extern uint amdgpu_svm_attr_gobm;
>>>>
>>>> #else
>>>>
>>>> static const int __maybe_unused sched_policy =
>>>> KFD_SCHED_POLICY_HWS;
>>>>
>>>> static const bool __maybe_unused debug_evictions; /* =
>>>> false */ @@
>>>>
>>>> -313,6 +314,9 @@ extern int amdgpu_wbrf;
>>>>
>>>> /* Extra time delay(in ms) to eliminate the influence of
>>>> temperature momentary fluctuation */
>>>>
>>>> #define AMDGPU_SWCTF_EXTRA_DELAY 50
>>>>
>>>> +/* Default size of buffer to use in migrating buffer */
>>>>
>>>> +#define AMDGPU_SVM_ATTR_GOBM 9
>>>>
>>>> +
>>>>
>>>> struct amdgpu_xcp_mgr;
>>>>
>>>> struct amdgpu_device;
>>>>
>>>> struct amdgpu_irq_src;
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>> index b9529948f2b2..09c501753a3b 100644
>>>>
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>
>>>> @@ -169,6 +169,17 @@ uint amdgpu_sdma_phase_quantum = 32;
>>>>
>>>> char *amdgpu_disable_cu;
>>>>
>>>> char *amdgpu_virtual_display;
>>>>
>>>> bool enforce_isolation;
>>>>
>>>> +
>>>>
>>>> +/* Specifies the default size of buffer to use in
>>>>
>>>> + * migrating buffer from Sysmem to VRAM and vice
>>>>
>>>> + * versa
>>>>
>>>> + *
>>>>
>>>> + * GOBM - Granularity of Buffer Migration
>>>>
>>>> + *
>>>>
>>>> + * Defined as log2(sizeof(buffer)/PAGE_SIZE) */ uint
>>>>
>>>> +amdgpu_svm_attr_gobm = AMDGPU_SVM_ATTR_GOBM;
>>>>
>>>> +
>>>>
>>>> /*
>>>>
>>>> * OverDrive(bit 14) disabled by default
>>>>
>>>> * GFX DCS(bit 19) disabled by default @@ -320,6 +331,13 @@
>>>>
>>>> module_param_named(pcie_gen2, amdgpu_pcie_gen2, int, 0444);
>>>>
>>>> MODULE_PARM_DESC(msi, "MSI support (1 = enable, 0 =
>>>> disable, -1 = auto)");
>>>>
>>>> module_param_named(msi, amdgpu_msi, int, 0444);
>>>>
>>>> +/**
>>>>
>>>> + * DOC: svm_attr_gobm (uint)
>>>>
>>>> + * Size of buffer to use in migrating buffer from Sysmem
>>>> to VRAM and
>>>>
>>>> +vice versa */ MODULE_PARM_DESC(svm_attr_gobm, "Defined as
>>>>
>>>> +log2(sizeof(buffer)/PAGE_SIZE), e.g. 9 for 2 MiB");
>>>>
>>>> +module_param_named(svm_attr_gobm, amdgpu_svm_attr_gobm,
>>>> uint, 0644);
>>>>
>>>> +
>>>>
>>>> /**
>>>>
>>>> * DOC: lockup_timeout (string)
>>>>
>>>> * Set GPU scheduler timeout value in ms.
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>> index 9ae9abc6eb43..c2e54b18c167 100644
>>>>
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>>>
>>>> @@ -868,6 +868,18 @@ struct svm_range_list {
>>>>
>>>> struct task_struct *faulting_task;
>>>>
>>>> /* check point ts decides if page fault recovery
>>>> need be dropped */
>>>>
>>>> uint64_t checkpoint_ts[MAX_GPU_INSTANCE];
>>>>
>>>> +
>>>>
>>>> + /* Indicates the default size to use in migrating
>>>>
>>>> + * buffers of a process from Sysmem to VRAM and vice
>>>>
>>>> + * versa. The max legal value cannot be greater than
>>>>
>>>> + * 0x3F
>>>>
>>>> + *
>>>>
>>>> + * @note: A side effect of this symbol being part of
>>>>
>>>> + * struct svm_range_list is that it forces all buffers
>>>>
>>>> + * of the process of unregistered kind to use the same
>>>>
>>>> + * size in buffer migration
>>>>
>>>> + */
>>>>
>>>> + uint8_t attr_gobm;
>>>>
>>>> };
>>>>
>>>> /* Process data */
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>> index b44dec90969f..78c78baddb1f 100644
>>>>
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>>>>
>>>> @@ -309,12 +309,11 @@ static void svm_range_free(struct
>>>> svm_range *prange, bool do_unmap)
>>>>
>>>> }
>>>>
>>>> static void
>>>>
>>>> -svm_range_set_default_attributes(int32_t *location,
>>>> int32_t *prefetch_loc,
>>>>
>>>> - uint8_t *granularity,
>>>> uint32_t *flags)
>>>>
>>>> +svm_range_set_default_attributes(int32_t *location,
>>>>
>>>> + int32_t *prefetch_loc, uint32_t
>>>> *flags)
>>>>
>>>> {
>>>>
>>>> *location = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
>>>>
>>>> *prefetch_loc = KFD_IOCTL_SVM_LOCATION_UNDEFINED;
>>>>
>>>> - *granularity = 9;
>>>>
>>>> *flags =
>>>>
>>>> KFD_IOCTL_SVM_FLAG_HOST_ACCESS |
>>>> KFD_IOCTL_SVM_FLAG_COHERENT;
>>>>
>>>> }
>>>>
>>>> @@ -358,9 +357,9 @@ svm_range *svm_range_new(struct
>>>> svm_range_list *svms, uint64_t start,
>>>>
>>>> bitmap_copy(prange->bitmap_access, svms->bitmap_supported,
>>>>
>>>> MAX_GPU_INSTANCE);
>>>>
>>>> + prange->granularity = svms->attr_gobm;
>>>>
>>>> svm_range_set_default_attributes(&prange->preferred_loc,
>>>>
>>>> - &prange->prefetch_loc,
>>>>
>>>> - &prange->granularity, &prange->flags);
>>>>
>>>> + &prange->prefetch_loc,
>>>>
>>>> + &prange->flags);
>>>>
>>>> pr_debug("svms 0x%p [0x%llx 0x%llx]\n", svms,
>>>> start, last);
>>>>
>>>> @@ -2693,10 +2692,12 @@ svm_range_get_range_boundaries(struct
>>>>
>>>> kfd_process *p, int64_t addr,
>>>>
>>>> *is_heap_stack = vma_is_initial_heap(vma) ||
>>>>
>>>> vma_is_initial_stack(vma);
>>>>
>>>> + /* Determine the starting and ending page of prange */
>>>>
>>>> start_limit = max(vma->vm_start >> PAGE_SHIFT,
>>>>
>>>> - (unsigned long)ALIGN_DOWN(addr, 2UL
>>>> << 8));
>>>>
>>>> + (unsigned long)ALIGN_DOWN(addr, 1 <<
>>>>
>>>> + p->svms.attr_gobm));
>>>>
>>>> end_limit = min(vma->vm_end >> PAGE_SHIFT,
>>>>
>>>> - (unsigned long)ALIGN(addr + 1, 2UL << 8));
>>>>
>>>> + (unsigned long)ALIGN(addr + 1, 1 <<
>>>>
>>>> + p->svms.attr_gobm));
>>>>
>>>> +
>>>>
>>>> /* First range that starts after the fault
>>>> address */
>>>>
>>>> node = interval_tree_iter_first(&p->svms.objects,
>>>> addr + 1, ULONG_MAX);
>>>>
>>>> if (node) {
>>>>
>>>> @@ -3240,6 +3241,12 @@ int svm_range_list_init(struct
>>>> kfd_process *p)
>>>>
>>>> if
>>>> (KFD_IS_SVM_API_SUPPORTED(p->pdds[i]->dev->adev))
>>>>
>>>> bitmap_set(svms->bitmap_supported, i, 1);
>>>>
>>>> + /* Bind granularity of buffer migration, either
>>>>
>>>> + * the default size or one specified by the user
>>>>
>>>> + */
>>>>
>>>> + svms->attr_gobm = min_t(u8, amdgpu_svm_attr_gobm,
>>>> 0x3F);
>>>>
>>>> + pr_debug("Granularity Of Buffer Migration: %d\n",
>>>>
>>>> + svms->attr_gobm);
>>>>
>>>> +
>>>>
>>>> return 0;
>>>>
>>>> }
>>>>
>>>> @@ -3767,8 +3774,9 @@ svm_range_get_attr(struct kfd_process
>>>> *p, struct mm_struct *mm,
>>>>
>>>> node = interval_tree_iter_first(&svms->objects,
>>>> start, last);
>>>>
>>>> if (!node) {
>>>>
>>>> pr_debug("range attrs not found return
>>>> default values\n");
>>>>
>>>> - svm_range_set_default_attributes(&location, &prefetch_loc,
>>>>
>>>> - &granularity, &flags_and);
>>>>
>>>> + granularity = svms->attr_gobm;
>>>>
>>>> + svm_range_set_default_attributes(&location,
>>>>
>>>> + &prefetch_loc, &flags_and);
>>>>
>>>> flags_or = flags_and;
>>>>
>>>> if (p->xnack_enabled)
>>>>
>>>> bitmap_copy(bitmap_access,
>>>>
>>>> svms->bitmap_supported,
>>>>
>>>> --
>>>>
>>>> 2.34.1
>>>>
More information about the amd-gfx
mailing list