[Intel-gfx] [PATCH rdma-next v4 4/4] RDMA/umem: Move to allocate SG table from pages

Wed Sep 30 15:05:15 UTC 2020

On 9/30/2020 2:58 PM, Jason Gunthorpe wrote:
> On Wed, Sep 30, 2020 at 02:53:58PM +0300, Maor Gottlieb wrote:
>> On 9/30/2020 2:45 PM, Jason Gunthorpe wrote:
>>> On Wed, Sep 30, 2020 at 12:53:21PM +0300, Leon Romanovsky wrote:
>>>> On Tue, Sep 29, 2020 at 04:59:29PM -0300, Jason Gunthorpe wrote:
>>>>> On Sun, Sep 27, 2020 at 09:46:47AM +0300, Leon Romanovsky wrote:
>>>>>> @@ -296,11 +223,17 @@ static struct ib_umem *__ib_umem_get(struct ib_device *device,
>>>>>>    			goto umem_release;
>>>>>>
>>>>>>    		cur_base += ret * PAGE_SIZE;
>>>>>> -		npages   -= ret;
>>>>>> -
>>>>>> -		sg = ib_umem_add_sg_table(sg, page_list, ret,
>>>>>> -			dma_get_max_seg_size(device->dma_device),
>>>>>> -			&umem->sg_nents);
>>>>>> +		npages -= ret;
>>>>>> +		sg = __sg_alloc_table_from_pages(
>>>>>> +			&umem->sg_head, page_list, ret, 0, ret << PAGE_SHIFT,
>>>>>> +			dma_get_max_seg_size(device->dma_device), sg, npages,
>>>>>> +			GFP_KERNEL);
>>>>>> +		umem->sg_nents = umem->sg_head.nents;
>>>>>> +		if (IS_ERR(sg)) {
>>>>>> +			unpin_user_pages_dirty_lock(page_list, ret, 0);
>>>>>> +			ret = PTR_ERR(sg);
>>>>>> +			goto umem_release;
>>>>>> +		}
>>>>>>    	}
>>>>>>
>>>>>>    	sg_mark_end(sg);
>>>>> Does it still need the sg_mark_end?
>>>> It is preserved here for correctness, the release logic doesn't rely on
>>>> this marker, but it is better to leave it.
>>> I mean, my read of __sg_alloc_table_from_pages() is that it already
>>> placed it, the final __alloc_table() does it?
>> It marks the last allocated sge, but not the last populated sge (with page).
> Why are those different?
>
> It looks like the last iteration calls __alloc_table() with an exact
> number of sges
>
> +	if (!prv) {
> +		/* Only the last allocation could be less than the maximum */
> +		table_size = left_pages ? SG_MAX_SINGLE_ALLOC : chunks;
> +		ret = sg_alloc_table(sgt, table_size, gfp_mask);
> +		if (unlikely(ret))
> +			return ERR_PTR(ret);
> +	}
>
> Jason

That is true only for the last iteration. In the first iteration, for
example, if more pages remain (left_pages is nonzero), we allocate
SG_MAX_SINGLE_ALLOC entries. We don't know how many pages from the second
iteration will be squashed into the last SGE from the first iteration, so
the table can end up with more allocated SGEs than populated ones.
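
To illustrate the point, here is a minimal userspace sketch, not the kernel
code. It models a table that grows by a fixed chunk while more pages are
still expected, and that merges a page into the previous entry when it is
physically contiguous. All names (toy_table, toy_append, toy_run_example,
TOY_MAX_SINGLE_ALLOC) are hypothetical, and the chunk size of 4 is an
arbitrary stand-in for SG_MAX_SINGLE_ALLOC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SG_MAX_SINGLE_ALLOC; the real value differs. */
#define TOY_MAX_SINGLE_ALLOC 4

/* Toy model of an SG table: only counts entries, holds no real pages. */
struct toy_table {
	size_t allocated;	/* entries reserved so far */
	size_t populated;	/* entries that actually hold pages */
	long last_pfn;		/* last page frame in the last populated entry */
};

/* Count how many entries pfns[0..n) needs if pfns[0] starts a new entry. */
static size_t toy_count_chunks(const long *pfns, size_t n)
{
	size_t chunks = 0;

	for (size_t i = 0; i < n; i++)
		if (i == 0 || pfns[i] != pfns[i - 1] + 1)
			chunks++;
	return chunks;
}

/*
 * Append one batch of page frame numbers. left_pages is the number of
 * pages that will arrive in later batches (0 for the final batch).
 */
static void toy_append(struct toy_table *t, const long *pfns, size_t n,
		       size_t left_pages)
{
	for (size_t i = 0; i < n; i++) {
		/* Contiguous with the previous entry: squash, no new entry. */
		if (t->populated > 0 && pfns[i] == t->last_pfn + 1) {
			t->last_pfn = pfns[i];
			continue;
		}
		/* Table full: only the last allocation can be exact. */
		if (t->populated == t->allocated)
			t->allocated += left_pages ?
				TOY_MAX_SINGLE_ALLOC :
				toy_count_chunks(pfns + i, n - i);
		t->populated++;
		t->last_pfn = pfns[i];
		assert(t->populated <= t->allocated);
	}
}

/* Two batches: the second one merges entirely into the first batch's tail. */
static void toy_run_example(struct toy_table *t)
{
	const long batch1[] = { 10, 20 };	/* two separate entries */
	const long batch2[] = { 21, 22 };	/* contiguous with pfn 20 */

	t->allocated = 0;
	t->populated = 0;
	t->last_pfn = -1;
	toy_append(t, batch1, 2, 2);	/* 2 more pages still to come */
	toy_append(t, batch2, 2, 0);	/* final batch */
}
```

In this model the first batch reserves 4 entries because more pages are
expected, and the second batch adds no entries at all because both of its
pages merge into the entry ending at pfn 20. The table ends with 4 allocated
entries and 2 populated ones, so an end marker placed at allocation time
would sit on entry 3 while the last entry that holds pages is entry 1.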