[PATCH 1/2] drm/amdgpu: increase hmm range get pages timeout
James Zhu
jamesz at amd.com
Wed Dec 13 16:55:15 UTC 2023
On 2023-12-13 11:23, Felix Kuehling wrote:
>
> On 2023-12-13 10:24, James Zhu wrote:
>> Ping ...
>>
>> On 2023-12-08 18:01, James Zhu wrote:
>>> When an application tries to allocate all system memory and causes
>>> memory to be swapped out, hmm_range_fault needs more time to validate
>>> the remaining pages for allocation. To be safe, increase the timeout
>>> to 1 second per 64MB of range.
>>>
>>> Signed-off-by: James Zhu <James.Zhu at amd.com>
>
> This is not the first time we're incrementing this timeout. Eventually
> we should get rid of that and find a way to make this work reliably
> without a timeout. There can always be situations where faults take
> longer, and we should not fail randomly in those cases.
>
> There are also some FIXMEs in this code that should be addressed at
> the same time.
>
> That said, as a short-term fix, this patch is
[JZ] Yes, it is just a short-term fix. The root cause is still under study.
>
> Acked-by: Felix Kuehling <Felix.Kuehling at amd.com>
>
>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> index 081267161d40..b24eb5821fd1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
>>> @@ -190,8 +190,8 @@ int amdgpu_hmm_range_get_pages(struct mmu_interval_notifier *notifier,
>>> pr_debug("hmm range: start = 0x%lx, end = 0x%lx",
>>> hmm_range->start, hmm_range->end);
>>> - /* Assuming 128MB takes maximum 1 second to fault page address */
>>> - timeout = max((hmm_range->end - hmm_range->start) >> 27, 1UL);
>>> + /* Assuming 64MB takes maximum 1 second to fault page address */
>>> + timeout = max((hmm_range->end - hmm_range->start) >> 26, 1UL);
>>> timeout *= HMM_RANGE_DEFAULT_TIMEOUT;
>>> timeout = jiffies + msecs_to_jiffies(timeout);
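
For reference, the effect of changing the shift from 27 to 26 can be checked with a small user-space sketch of the same arithmetic. This is not the driver code: sketch_timeout_ms() is a hypothetical helper, and the 1000 ms constant is an assumption standing in for HMM_RANGE_DEFAULT_TIMEOUT.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for HMM_RANGE_DEFAULT_TIMEOUT, taken here as 1000 ms. */
#define SKETCH_DEFAULT_TIMEOUT_MS 1000ULL

/*
 * Hypothetical helper mirroring the patched arithmetic:
 * one timeout unit per 2^shift bytes of range, with a minimum of one unit.
 * shift = 27 is the old 128MB granularity, shift = 26 the new 64MB one.
 */
static uint64_t sketch_timeout_ms(uint64_t range_bytes, unsigned int shift)
{
    uint64_t units;

    assert(shift < 64); /* shifting by >= the type width is undefined */

    units = range_bytes >> shift;
    if (units < 1)
        units = 1; /* same role as max(..., 1UL) in the patch */

    return units * SKETCH_DEFAULT_TIMEOUT_MS;
}
```

With these assumptions, a 1GB range gets 8000 ms with the old shift and 16000 ms with the new one, while any range below 64MB still gets the 1000 ms minimum in both cases. The budget for larger ranges doubles, but the floor for small ranges is unchanged.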