[Intel-gfx] [PATCH 3/6] drm/i915: use vmap in shmem_pin_map

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Sep 22 17:04:37 UTC 2020


On 22/09/2020 17:33, Christoph Hellwig wrote:
> On Tue, Sep 22, 2020 at 05:13:45PM +0100, Tvrtko Ursulin wrote:
>>>    void *shmem_pin_map(struct file *file)
>>>    {
>>> -	const size_t n_pte = shmem_npte(file);
>>> -	pte_t *stack[32], **ptes, **mem;
>>
>> Chris can comment how much he'd miss the 32 page stack shortcut.
> 
> I'd like to see a profile that claims that kmalloc matters in a
> path that does a vmap and reads pages through the page cache.
> Especially when the kmalloc saves doing another page cache lookup
> on the free side.

The only reason I can come up with right now is if the mapping side is 
on a latency-sensitive path while unmapping is lazy/delayed and can 
therefore afford to be more costly. In that case a fast map with the 
extra cost pushed to unmap may make sense.
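
For reference, the shortcut in question is the usual fixed-size 
on-stack array with a heap fallback for larger counts. A minimal 
sketch of the pattern (hypothetical names, not the actual i915 code):

#include <linux/mm.h>
#include <linux/slab.h>

static int use_ptes(struct file *file, size_t n_pte)
{
        pte_t *stack[32], **ptes = stack;
        int err = 0;

        /* The common small case stays off the allocator entirely. */
        if (n_pte > ARRAY_SIZE(stack)) {
                ptes = kvmalloc_array(n_pte, sizeof(*ptes), GFP_KERNEL);
                if (!ptes)
                        return -ENOMEM;
        }

        /* ... fill ptes[] and map, as __shmem_pin_map() did ... */

        if (ptes != stack)
                kvfree(ptes);
        return err;
}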

That applies more to the other i915 patch, which implements a much 
more heavily used API; but whether we could demonstrate any difference 
in the perf profiles I couldn't tell you without actually collecting 
some.

>> Is there something in vmap() preventing us from freeing the pages array
>> here? I can't spot anything that is holding on to the pointer. Or it was
>> just a sketch before you realized we could walk the vm_area?
>>
>> Also, I may be totally misunderstanding something, but I think you need to
>> assign area->pages manually so shmem_unpin_map can access it below.
> 
> We need area->pages to hold the pages for the free side.  That being
> said the patch I posted is broken because it never assigned to that.
> As said it was a sketch.  This is the patch I just rebooted into on
> my Laptop:
> 
> http://git.infradead.org/users/hch/misc.git/commitdiff/048522dfa26b6667adfb0371ff530dc263abe829
> 
> it needs extra prep patches from the series:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/alloc_vm_area
> 
>>>    	mapping_clear_unevictable(file->f_mapping);
>>> -	__shmem_unpin_map(file, ptr, shmem_npte(file));
>>> +	for (i = 0; i < shmem_npages(file); i++)
>>> +		put_page(area->pages[i]);
>>> +	kvfree(area->pages);
>>> +	vunmap(ptr);
>>
>> Is the verdict from mm experts that we can't use vfree due to __free_pages vs
>> put_page differences?
> 
> Switched to vfree now.
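
Ack. If I read the prep series correctly, that works because vmap() 
gains a VM_MAP_PUT_PAGES flag, so vfree() both drops a reference on 
each mapped page and frees the pages array itself. Sketch of the free 
side, assuming that flag (untested here):

#include <linux/pagemap.h>
#include <linux/vmalloc.h>

/*
 * Assumes the pin side mapped with
 *         vaddr = vmap(pages, n_pages, VM_MAP_PUT_PAGES, PAGE_KERNEL);
 * so that the vm area takes ownership of the pages array.
 */
void shmem_unpin_map(struct file *file, void *ptr)
{
        mapping_clear_unevictable(file->f_mapping);
        vfree(ptr);     /* puts every page and frees the array */
}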
> 
>> Could we get from ptes to pages, so that we don't have to keep the
>> area->pages array allocated for the duration of the pin?
> 
> We could do vmalloc_to_page, but that is fairly expensive (not as bad
> as reading from the page cache..).  Are you really worried about the
> allocation?

Not so much, given that we don't even use shmem_pin_map outside of 
selftests.

If we start using it, I expect it will be for tiny objects anyway. 
Only if they end up pinned for the lifetime of the driver would 
keeping the pages array around be a pointless waste of memory compared 
to the downsides of vmalloc_to_page. But we can revisit this 
particular edge-case optimization if the need arises.
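
For completeness, the array-less variant we are discussing would look 
something like this (hypothetical sketch; the helper name is made up, 
and the per-page walk on unpin is exactly the cost you point out):

#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/vmalloc.h>

static void shmem_unpin_map_noarray(struct file *file, void *ptr,
                                    size_t n_pages)
{
        size_t i;

        mapping_clear_unevictable(file->f_mapping);
        /* Rebuild each page pointer from the vmap'ed address. */
        for (i = 0; i < n_pages; i++)
                put_page(vmalloc_to_page(ptr + i * PAGE_SIZE));
        vunmap(ptr);
}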

I'll look at your other i915 patch tomorrow.

Regards,

Tvrtko

