[Intel-gfx] [PATCH v6 3/7] drm/i915: introduce and use i915_gem_object_vmap_range()
Dave Gordon
david.s.gordon at intel.com
Tue Mar 1 13:13:29 UTC 2016
On 01/03/16 10:12, Tvrtko Ursulin wrote:
>
> On 29/02/16 21:16, Dave Gordon wrote:
>> From: Alex Dai <yu.dai at intel.com>
>>
>> There are several places inside driver where a GEM object is mapped
>> to kernel virtual space. The mapping may be done either for the whole
>> object or only a subset of it.
>>
>> This patch introduces a function i915_gem_object_vmap_range() to
>> implement the common functionality. The code itself is extracted and
>> adapted from that in vmap_batch(), but also replaces vmap_obj() and the
>> open-coded version in i915_gem_dmabuf_vmap().
>>
>> v2: use obj->pages->nents for iteration within i915_gem_object_vmap;
>> break when it finishes all desired pages. The caller must pass the
>> actual page count required. [Tvrtko Ursulin]
>>
>> v4: renamed to i915_gem_object_vmap_range() to make its function
>> clearer. [Dave Gordon]
>>
>> v5: use Chris Wilson's new drm_malloc_gfp() rather than kmalloc() or
>> drm_malloc_ab(). [Dave Gordon]
>>
>> v6: changed range checking to not use pages->nents. [Tvrtko Ursulin]
>> Use sg_nents_for_len() for range check instead. [Dave Gordon]
>> Pass range parameters in bytes rather than pages (both callers
>> were converting from bytes to pages anyway, so this reduces the
>> number of places where the conversion is done).
>>
>> With this change, we have only one vmap() in the whole driver :)
>>
>> Signed-off-by: Alex Dai <yu.dai at intel.com>
>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>> ---
[snip]
>> +/**
>> + * i915_gem_object_vmap_range - map some or all of a GEM object into kernel space
>> + * @obj: the GEM object to be mapped
>> + * @start: offset in bytes of the start of the range to be mapped
>> + * @len: length in bytes of the range to be mapped
>
> kbuild spotted this kerneldoc issue.
Ah yes, that parameter got renamed.
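(The kerneldoc above still says @len while the parameter is now called
nbytes; presumably the last line of the comment block just needs to
become:

     * @nbytes: length in bytes of the range to be mapped

in the next version.)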
>> + *
>> + * Map a given range of a GEM object into kernel virtual space. The range will
>> + * be extended at both ends, if necessary, to span a whole number of pages. The
>> + * caller must make sure the associated pages are gathered and pinned before
>> + * calling this function, and is responsible for unmapping the returned address
>> + * when it is no longer required.
>> + *
>> + * Returns the address at which the object has been mapped, or NULL on failure.
>> + */
>> +void *i915_gem_object_vmap_range(struct drm_i915_gem_object *obj,
>> +				 unsigned long start,
>> +				 unsigned long nbytes)
>> +{
>> +	struct scatterlist *sg = obj->pages->sgl;
>> +	struct sg_page_iter sg_iter;
>> +	struct page **pages;
>> +	unsigned long first, npages, i;
>> +	int nents;
>> +	void *addr;
>> +
>> +	/* Check requested range against underlying sg list */
>> +	nents = sg_nents_for_len(sg, start + nbytes);
>> +	if (nents < 0) {
>> +		DRM_DEBUG_DRIVER("Invalid page count\n");
>> +		return NULL;
>> +	}
>
> I think this is needless overhead. The helper will iterate the whole sg
> chain while we know the size in obj->base.size and finding out the real
> nents is of little (no) use to the code below.
It was more of a consistency check between the GEM object on the one
hand and the SG list implementation on the other, since none of the
other SG interfaces work in any useful quantities such as pages or
bytes. 'nents' is particularly useless as a measure of size, because it
depends on the details of how the underlying pages have been mapped, in
particular how many separate chunks they're in.
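To illustrate what the check buys us, it amounts to this (untested
sketch; i915_sg_range_ok() is a made-up name, not part of the patch):

    /* Returns true if [start, start + nbytes) is covered by the pages
     * actually described by the object's sg list.  This works whether
     * the backing store came out as one big contiguous chunk or as
     * many single-page entries: sg_nents_for_len() walks the chain
     * and returns a negative errno if it runs out of entries before
     * covering the requested length.
     */
    static bool i915_sg_range_ok(struct drm_i915_gem_object *obj,
                                 unsigned long start,
                                 unsigned long nbytes)
    {
        return sg_nents_for_len(obj->pages->sgl, start + nbytes) >= 0;
    }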
>> +
>> +	/* Work in pages from now on */
>> +	first = start >> PAGE_SHIFT;
>> +	npages = DIV_ROUND_UP(start + nbytes, PAGE_SIZE) - first;
>
> And this looks like weak API if the caller can pass non page aligned
> start and size and the function will silently vmap something else.
>
> It should assert and fail on both I think, or it may have been simpler
> to keep it working in page units.
It is exactly the API required by copy_batch() (whose helper
vmap_batch() is where this code came from), and it's a perfectly
well-defined API: the start offset is rounded down to the start of the
containing page, the end address (start + len) is rounded up to the
next page boundary, and that defines what gets mapped, i.e. the
smallest page-aligned region containing the range specified.
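For example, with 4KiB pages, a call with start = 0x1801 and
nbytes = 0x1000 (illustrative values) works out as:

    first  = 0x1801 >> PAGE_SHIFT;                             /* = 1 */
    npages = DIV_ROUND_UP(0x1801 + 0x1000, PAGE_SIZE) - first; /* = 3 - 1 = 2 */

so pages 1 and 2 get mapped, covering bytes 0x1000..0x2fff, i.e. the
smallest page-aligned region containing the requested range.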
But I could let (all) the callers do that conversion instead ...
>> +	pages = drm_malloc_gfp(npages, sizeof(*pages), GFP_TEMPORARY);
>> +	if (pages == NULL) {
>> +		DRM_DEBUG_DRIVER("Failed to get space for pages\n");
>> +		return NULL;
>> +	}
>> +
>> +	i = 0;
>> +	for_each_sg_page(sg, &sg_iter, nents, first) {
>> +		pages[i] = sg_page_iter_page(&sg_iter);
>> +		if (++i >= npages)
>> +			break;
>> +	}
>> +	WARN_ON(i != npages);
>> +
>> +	addr = vmap(pages, npages, 0, PAGE_KERNEL);
>> +	if (addr == NULL)
>> +		DRM_DEBUG_DRIVER("Failed to vmap pages\n");
>> +	drm_free_large(pages);
>> +
>> +	return addr;
>> +}
>> +
>>  void i915_vma_move_to_active(struct i915_vma *vma,
>>  			     struct drm_i915_gem_request *req)
>>  {
>> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> index 1f3eef6..aee4149 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
>> @@ -110,9 +110,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>>  {
>>  	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
>>  	struct drm_device *dev = obj->base.dev;
>> -	struct sg_page_iter sg_iter;
>> -	struct page **pages;
>> -	int ret, i;
>> +	int ret;
>>
>>  	ret = i915_mutex_lock_interruptible(dev);
>>  	if (ret)
>> @@ -131,16 +129,8 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>>
>>  	ret = -ENOMEM;
>>
>> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
>> -	if (pages == NULL)
>> -		goto err_unpin;
>> -
>> -	i = 0;
>> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
>> -		pages[i++] = sg_page_iter_page(&sg_iter);
>> -
>> -	obj->dma_buf_vmapping = vmap(pages, i, 0, PAGE_KERNEL);
>> -	drm_free_large(pages);
>> +	obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
>> +				dma_buf->size >> PAGE_SHIFT);
>
> This is still in pages. (Although as said below I think it should remain
> and API be reverted back.)
Ah yes, I missed this call and the one below, 'cos they get fixed up by
the last patch of the series (which *does* adopt the new interface).
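(Once that patch lands, the call here should presumably read just

    obj->dma_buf_vmapping = i915_gem_object_vmap_range(obj, 0,
                                                       dma_buf->size);

since the new interface already takes its length in bytes.)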
>>  	if (!obj->dma_buf_vmapping)
>>  		goto err_unpin;
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 45ce45a..434a452 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2064,27 +2064,6 @@ void intel_unpin_ringbuffer_obj(struct intel_ringbuffer *ringbuf)
>>  	i915_gem_object_ggtt_unpin(ringbuf->obj);
>>  }
>>
>> -static u32 *vmap_obj(struct drm_i915_gem_object *obj)
>> -{
>> -	struct sg_page_iter sg_iter;
>> -	struct page **pages;
>> -	void *addr;
>> -	int i;
>> -
>> -	pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages));
>> -	if (pages == NULL)
>> -		return NULL;
>> -
>> -	i = 0;
>> -	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0)
>> -		pages[i++] = sg_page_iter_page(&sg_iter);
>> -
>> -	addr = vmap(pages, i, 0, PAGE_KERNEL);
>> -	drm_free_large(pages);
>> -
>> -	return addr;
>> -}
>> -
>>  int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>>  				     struct intel_ringbuffer *ringbuf)
>>  {
>> @@ -2103,7 +2082,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>>  		return ret;
>>  	}
>>
>> -	ringbuf->virtual_start = vmap_obj(obj);
>> +	ringbuf->virtual_start = i915_gem_object_vmap_range(obj, 0,
>> +				ringbuf->size >> PAGE_SHIFT);
>
> Here also.
Actually ... if we work in pages but adopt a convention that
length == 0 means "the whole object", then only copy_batch() would have
to do any page-aligning calculations; all the other callers always want
the whole object, so it's nice to make the interface convenient for
them :)
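Something like this at the top of the function (untested sketch of the
proposed convention, with the interface reverted to page units):

    /* npages == 0 is taken to mean "map the whole object" */
    if (npages == 0)
            npages = (obj->base.size >> PAGE_SHIFT) - first;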
I'll see how that works out ...
.Dave.
>>  	if (ringbuf->virtual_start == NULL) {
>>  		i915_gem_object_ggtt_unpin(obj);
>>  		return -ENOMEM;
>>
>
> Regards,
> Tvrtko