[Intel-gfx] [PATCH 2/2] drm/i915: Allow compaction upto SWIOTLB max segment size

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Oct 10 14:07:18 UTC 2016


On 10/10/2016 14:43, Chris Wilson wrote:
> On Mon, Oct 10, 2016 at 02:30:40PM +0100, Tvrtko Ursulin wrote:
>> On 10/10/2016 12:49, Chris Wilson wrote:
>>> commit 1625e7e549c5 ("drm/i915: make compact dma scatter lists creation
>>> work with SWIOTLB backend") took a heavy handed approach to undo the
>>> scatterlist compaction in the face of SWIOTLB. (The compaction hit a bug
>>> whereby we tried to pass a segment larger than SWIOTLB could handle.) We
>>> can be a little more intelligent and try compacting the scatterlist up
>>> to the maximum SWIOTLB segment size (when using SWIOTLB).
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> CC: Imre Deak <imre.deak at intel.com>
>>> CC: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem.c | 23 +++++++++++------------
>>>   1 file changed, 11 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index ca1a5a5c6f19..8b3474d215a5 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2201,6 +2201,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>>   	struct sgt_iter sgt_iter;
>>>   	struct page *page;
>>>   	unsigned long last_pfn = 0;	/* suppress gcc warning */
>>> +	unsigned long max_segment;
>> unsigned int would be enough.
>>
> Current maximum object size >> PAGE_SHIFT is 36 bits. We don't impose
> any other restriction that would limit a sg chunk.

My bad, for some reason I thought it is used only under the 
CONFIG_SWIOTLB guard.

>>>   	int ret;
>>>   	gfp_t gfp;
>>> @@ -2211,6 +2212,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>>   	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
>>>   	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>>> +	max_segment = obj->base.size;
>>> +#ifdef CONFIG_SWIOTLB
>>> +	if (swiotlb_nr_tbl())
>>> +		max_segment = IO_TLB_SEGSIZE << PAGE_SHIFT;
>>> +#endif
>>> +
>> Do you want to use IS_ENABLED here?
> The symbol swiotlb_nr_tbl() is absent unless SWIOTLB is enabled at compile
> time.  So we need the cpp guard, or do you mean switch to
> 	#if IS_ENABLED(CONFIG_SWIOTLB)
> which we probably should indeed.

Ok, doesn't matter then.

>   
>>>   	st = kmalloc(sizeof(*st), GFP_KERNEL);
>>>   	if (st == NULL)
>>>   		return ERR_PTR(-ENOMEM);
>>> @@ -2252,15 +2259,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>>   				goto err_pages;
>>>   			}
>>>   		}
>>> -#ifdef CONFIG_SWIOTLB
>>> -		if (swiotlb_nr_tbl()) {
>>> -			st->nents++;
>>> -			sg_set_page(sg, page, PAGE_SIZE, 0);
>>> -			sg = sg_next(sg);
>>> -			continue;
>>> -		}
>>> -#endif
>>> -		if (!i || page_to_pfn(page) != last_pfn + 1) {
>>> +		if (!i ||
>>> +		    sg->length >= max_segment ||
>> I think this can overflow by a page, should be "sg->length >=
>> (max_segment - PAGE_SIZE)", or alternatively substract one page at
>> the max_segment assignment.
> We are looking at the previous sg, right? (and we only ever increment by
> PAGE_SIZE).
>
> So: when the previous sg reaches the maximum length, start a new sg
> element. Otherwise we extend the previous sg element by a PAGE, so on
> the else branch the maximum of sg->length after the increment is
> max_segment.

Yes I've corrected myself shortly after posting.

>
>>> +		    page_to_pfn(page) != last_pfn + 1) {
>>>   			if (i)
>>>   				sg = sg_next(sg);
>>>   			st->nents++;
>>> @@ -2273,9 +2274,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>>   		/* Check that the i965g/gm workaround works. */
>>>   		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
>>>   	}
>>> -#ifdef CONFIG_SWIOTLB
>>> -	if (!swiotlb_nr_tbl())
>>> -#endif
>>> +	if (st->nents < st->orig_nents)
>>>   		sg_mark_end(sg);
>> I wondered a few times that we could just terminate the table
>> unconditionally.
> The caveat being that if we do insert orig_nents, then sg at this point
> is NULL. Which is clearer:
>
> 	if (st->nents < st->orig_nents) sg_mark_end(sg);
>
> or
>
> 	if (sg) sg_mark_end(sg); /* coalesced sg table */
>
> ?

I missed that as well (that it can be NULL here). Sorry for the noise 
then. "if (sg)" looks somewhat better to me FWIW.

Regards,

Tvrtko



More information about the Intel-gfx mailing list