[PATCH] drm/amd/amdgpu: solve the issue of allocate continuous pages under xen dom0

Christian König christian.koenig at amd.com
Thu Sep 22 09:09:26 UTC 2022


Well this patch is a pretty clear NAK.

Using xen_swiotlb with userptr in the first place sounds illegal to me 
since we need coherent buffers which are incompatible with swiotlb.

In other words using userptrs in our driver stack are fundamentally 
incompatible with swiotlb, so it doesn't make sense to try o get this 
working using those workarounds.

So the real question is why are you trying to use userptr with XEN dom0? 
That won't work correctly at all.

Regards,
Christian.

Am 22.09.22 um 11:00 schrieb Xiao, Shane:
> [AMD Official Use Only - General]
>
> Hi Christian,
>
> Why should the resulting sg table be too large?
> [Shane]
> sg_alloc_table_from_pages will set max_segment field as default value UINT_MAX  to sg_alloc_table_from_pages_segment. The filed max_segment works as follows:
> “Contiguous ranges of the pages are squashed into a single scatterlist node up to the maximum size specified in @max_segment.”
> If we don't set the max_segment field, the sg_alloc_table_from_pages may allocate 2M or more continuous ranges of pages.
>
>
> For what too large?
> [Shane]
> However, these pages are called pseudo-physical pages on xen dom0, which means that the actual machine pages are not necessarily continuous.
> When this happens, the xen_swiotlb will use bounce buffer to do dma operation by swiotlb_tbl_map_single.
> But, the xen_swiotlb only allows IO_TLB_SEGSIZE*IO_TLB_SHIFT (256K) continuous pages, and the allocate 2M or more continuous ranges of pages will cause such error "swiotlb buffer is full".
>
> BTW: intel uses the same method to allocate page tables in i915_gem_userptr_get_pages.
>
> Best Regards,
> Shane
>
>> -----Original Message-----
>> From: Koenig, Christian <Christian.Koenig at amd.com>
>> Sent: Thursday, September 22, 2022 3:19 PM
>> To: Xiao, Shane <shane.xiao at amd.com>; amd-gfx at lists.freedesktop.org
>> Subject: Re: [PATCH] drm/amd/amdgpu: solve the issue of allocate
>> continuous pages under xen dom0
>>
>> Am 22.09.22 um 09:11 schrieb Shane Xiao:
>>> [Why]
>>> sg_alloc_table_from_pages alloc too large continuous PFN pages under
>> xen dom0.
>>
>> Well that sentence doesn't make much sense. Why should the resulting sg
>> table be to large? For what to large?
>>
>> Regards,
>> Christian.
>>
>>> However, xen should check continuous MFN pages in
>> range_straddles_page_boundary.
>>> When range_straddles_page_boundary return false, some cases fall back
>>> into swiotlb process and the continuous allocable page is not enough.
>>>
>>> [How]
>>> In fact, xen swiotlb set max_segment default value as UINT_MAX and
>>> xen_swiotlb_init_early already change the value to PAGE_SIZE under xen
>> dom0.
>>> However amdgpu driver doesn't use the value, which may cause issue
>>> such as swiotlb buffer full. Add amd_sg_segment_size according to
>>> iommu setting, the details are as follows:
>>> 	iommu setting		|	amd_sg_segment_size
>>> -------------------------------------------------------------------------------
>>> 	iommu=on		|	UINT_MAX
>>>       iommu=off && swiotlb on	|	IO_TLB_DEFAULT_SIZE(64M)
>>> 	xen_swiotlb on		|	PAGE_SIZE(4K)
>>> ----------------------------------------------------------------------
>>> ---------
>>>
>>> Signed-off-by: Shane Xiao <shane.xiao at amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 22
>> ++++++++++++++++++++--
>>>    1 file changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 134575a3893c..d081fcd22d6b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -80,6 +80,23 @@ static int amdgpu_ttm_init_on_chip(struct
>> amdgpu_device *adev,
>>>    				  false, size_in_page);
>>>    }
>>>
>>> +static inline unsigned int amdgpu_sg_segment_size(void) {
>>> +	unsigned int size = swiotlb_max_segment();
>>> +
>>> +	/* size=0 when amd iommu enabled */
>>> +	if (size == 0)
>>> +		size = UINT_MAX;
>>> +
>>> +	size = rounddown(size, PAGE_SIZE);
>>> +	/* swiotlb_max_segment_size can return 1 byte when it means one
>> page. */
>>> +	if (size < PAGE_SIZE)
>>> +		size = PAGE_SIZE;
>>> +
>>> +	return size;
>>> +}
>>> +
>>> +
>>>    /**
>>>     * amdgpu_evict_flags - Compute placement flags
>>>     *
>>> @@ -760,9 +777,10 @@ static int amdgpu_ttm_tt_pin_userptr(struct
>> ttm_device *bdev,
>>>    	int r;
>>>
>>>    	/* Allocate an SG array and squash pages into it */
>>> -	r = sg_alloc_table_from_pages(ttm->sg, ttm->pages, ttm-
>>> num_pages, 0,
>>> -				      (u64)ttm->num_pages << PAGE_SHIFT,
>>> +	r = sg_alloc_table_from_pages_segment(ttm->sg, ttm->pages, ttm-
>>> num_pages, 0,
>>> +				      (u64)ttm->num_pages << PAGE_SHIFT,
>>> +amdgpu_sg_segment_size(),
>>>    				      GFP_KERNEL);
>>> +
>>>    	if (r)
>>>    		goto release_sg;
>>>



More information about the amd-gfx mailing list