[Intel-gfx] [PATCH 2/2] drm/i915: Allow compaction upto SWIOTLB max segment size
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Mon Oct 10 14:07:18 UTC 2016
On 10/10/2016 14:43, Chris Wilson wrote:
> On Mon, Oct 10, 2016 at 02:30:40PM +0100, Tvrtko Ursulin wrote:
>> On 10/10/2016 12:49, Chris Wilson wrote:
>>> commit 1625e7e549c5 ("drm/i915: make compact dma scatter lists creation
>>> work with SWIOTLB backend") took a heavy handed approach to undo the
>>> scatterlist compaction in the face of SWIOTLB. (The compaction hit a bug
>>> whereby we tried to pass a segment larger than SWIOTLB could handle.) We
>>> can be a little more intelligent and try compacting the scatterlist up
>>> to the maximum SWIOTLB segment size (when using SWIOTLB).
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> CC: Imre Deak <imre.deak at intel.com>
>>> CC: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem.c | 23 +++++++++++------------
>>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index ca1a5a5c6f19..8b3474d215a5 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2201,6 +2201,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>> struct sgt_iter sgt_iter;
>>> struct page *page;
>>> unsigned long last_pfn = 0; /* suppress gcc warning */
>>> + unsigned long max_segment;
>> unsigned int would be enough.
>>
> Current maximum object size >> PAGE_SHIFT is 36 bits. We don't impose
> any other restriction that would limit a sg chunk.
My bad, for some reason I thought it is used only under the
CONFIG_SWIOTLB guard.
>>> int ret;
>>> gfp_t gfp;
>>> @@ -2211,6 +2212,12 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>> GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
>>> GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>>> + max_segment = obj->base.size;
>>> +#ifdef CONFIG_SWIOTLB
>>> + if (swiotlb_nr_tbl())
>>> + max_segment = IO_TLB_SEGSIZE << PAGE_SHIFT;
>>> +#endif
>>> +
>> Do you want to use IS_ENABLED here?
> The symbol swiotlb_nr_tbl() is absent unless SWIOTLB is enabled at compile
> time. So we need the cpp guard, or do you mean switch to
> #if IS_ENABLED(CONFIG_SWIOTLB)
> which we probably should indeed.
Ok, doesn't matter then.
>
>>> st = kmalloc(sizeof(*st), GFP_KERNEL);
>>> if (st == NULL)
>>> return ERR_PTR(-ENOMEM);
>>> @@ -2252,15 +2259,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>> goto err_pages;
>>> }
>>> }
>>> -#ifdef CONFIG_SWIOTLB
>>> - if (swiotlb_nr_tbl()) {
>>> - st->nents++;
>>> - sg_set_page(sg, page, PAGE_SIZE, 0);
>>> - sg = sg_next(sg);
>>> - continue;
>>> - }
>>> -#endif
>>> - if (!i || page_to_pfn(page) != last_pfn + 1) {
>>> + if (!i ||
>>> + sg->length >= max_segment ||
>> I think this can overflow by a page, should be "sg->length >=
>> (max_segment - PAGE_SIZE)", or alternatively substract one page at
>> the max_segment assignment.
> We are looking at the previous sg, right? (and we only ever increment by
> PAGE_SIZE).
>
> So: when the previous sg reaches the maximum length, start a new sg
> element. Otherwise we extend the previous sg element by a PAGE, so on
> the else branch the maximum of sg->length after the increment is
> max_segment.
Yes I've corrected myself shortly after posting.
>
>>> + page_to_pfn(page) != last_pfn + 1) {
>>> if (i)
>>> sg = sg_next(sg);
>>> st->nents++;
>>> @@ -2273,9 +2274,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>>> /* Check that the i965g/gm workaround works. */
>>> WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
>>> }
>>> -#ifdef CONFIG_SWIOTLB
>>> - if (!swiotlb_nr_tbl())
>>> -#endif
>>> + if (st->nents < st->orig_nents)
>>> sg_mark_end(sg);
>> I wondered a few times that we could just terminate the table
>> unconditionally.
> The caveat being that if we do insert orig_nents, then sg at this point
> is NULL. Which is clearer:
>
> if (st->nents < st->orig_nents) sg_mark_end(sg);
>
> or
>
> if (sg) sg_mark_end(sg); /* coalesced sg table */
>
> ?
I missed that as well (that it can be NULL here). Sorry for the noise
then. "if (sg)" looks somewhat better to me FWIW.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list