[PATCH 2/3] drm/ttm: fix object deallocation to properly fill in the page pool.
Michel Dänzer
michel at daenzer.net
Mon Jul 6 23:39:29 PDT 2015
On 07.07.2015 01:10, Jerome Glisse wrote:
> On Mon, Jul 06, 2015 at 06:11:29PM +0900, Michel Dänzer wrote:
>>
>> Hi Jérôme,
>>
>> On 13.08.2014 12:52, Jérôme Glisse wrote:
>>> From: Jérôme Glisse <jglisse at redhat.com>
>>>
>>> Current code never allowed the page pool to actually fill up. This
>>> fixes it and also allows the pool to grow over its limit, until it
>>> grows beyond the batch size used for allocation and deallocation.
>>>
>>> Signed-off-by: Jérôme Glisse <jglisse at redhat.com>
>>> Reviewed-by: Mario Kleiner <mario.kleiner.de at gmail.com>
>>> Tested-by: Michel Dänzer <michel at daenzer.net>
>>> Cc: Thomas Hellstrom <thellstrom at vmware.com>
>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
>>> ---
>>> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 9 ++-------
>>> 1 file changed, 2 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> index c96db43..a076ff3 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>>> @@ -953,14 +953,9 @@ void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>>> } else {
>>> pool->npages_free += count;
>>> list_splice(&ttm_dma->pages_list, &pool->free_list);
>>> - npages = count;
>>> - if (pool->npages_free > _manager->options.max_size) {
>>> + if (pool->npages_free >= (_manager->options.max_size +
>>> + NUM_PAGES_TO_ALLOC))
>>> npages = pool->npages_free - _manager->options.max_size;
>>> - /* free at least NUM_PAGES_TO_ALLOC number of pages
>>> - * to reduce calls to set_memory_wb */
>>> - if (npages < NUM_PAGES_TO_ALLOC)
>>> - npages = NUM_PAGES_TO_ALLOC;
>>> - }
>>> }
>>> spin_unlock_irqrestore(&pool->lock, irq_flags);
>>>
>>>
>>
>> Colleagues of mine have measured significant performance gains for some
>> workloads with this patch. Without it, a lot of CPU cycles are spent
>> changing the caching attributes of pages on allocation.
>>
>> Note that the performance effect seems to mostly disappear when applying
>> patch 1 as well, so apparently 64MB is too small for the max pool size.
>>
>> Is there any chance this patch could be applied without the
>> controversial patch 3? If not, do you have ideas for addressing the
>> concerns raised against patch 3?
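[For context, the accounting the patch changes can be modeled in plain C. This is a hedged userspace sketch, not the actual TTM code: the function names are made up, `NUM_PAGES_TO_ALLOC` and the variable names are taken from the diff above, and the batch size value is assumed. It shows why the old logic drained the pool on every unpopulate, while the new logic only frees once the pool exceeds its limit by a full batch.]

```c
#define NUM_PAGES_TO_ALLOC 512  /* batch size; actual value assumed */

/* Old behaviour: npages was unconditionally set to the number of pages
 * just returned, so those pages were immediately freed again and the
 * pool never filled up. */
static unsigned long old_pages_to_free(unsigned long npages_free,
				       unsigned long count,
				       unsigned long max_size)
{
	unsigned long npages = count;

	if (npages_free > max_size) {
		npages = npages_free - max_size;
		/* free at least a batch to reduce set_memory_wb calls */
		if (npages < NUM_PAGES_TO_ALLOC)
			npages = NUM_PAGES_TO_ALLOC;
	}
	return npages;
}

/* New behaviour: nothing is freed until the pool exceeds the limit by
 * at least one batch, so the pool actually fills and frees happen in
 * large chunks. */
static unsigned long new_pages_to_free(unsigned long npages_free,
				       unsigned long max_size)
{
	if (npages_free >= max_size + NUM_PAGES_TO_ALLOC)
		return npages_free - max_size;
	return 0;
}
```

With a pool holding 100 pages and a 4096-page limit, the old logic frees the 100 pages that were just returned; the new logic frees nothing until the pool reaches 4608 pages, and then trims it back down to the limit in one pass.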
>
> Whoa, now I need to find the keys to the DeLorean to travel back in time.
>
> This patch is a fix and should be applied without 1 or 3, because today
> the pool is basically always emptied and never fills up. But as Thomas
> pointed out, there is already a bit too much pooling across the stack. The
> proper solution would be to implement something at the mm level or in the
> architecture code (I assume AMD is mostly interested in x86 on that front).
>
> Here the whole issue is really about allocating pages with WC/UC cache
> properties. Changing the cache properties of a page is costly on several
> levels; for instance, the kernel needs to break the huge linear mapping
> and populate the lower page-table levels to remap the page with the proper
> cache attribute inside the kernel mapping.
>
> As far as I remember, the kernel never goes back to a huge page mapping
> when restoring a page's cache attributes, which means that little by
> little with uptime you lose the huge page mapping benefit for everything
> and waste more memory.
That sounds pretty bad, but this patch might actually help a little with
that by reducing the number of huge page mappings that need to be broken up?
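[A back-of-the-envelope illustration of the cost being discussed, using assumed x86-64 figures; this is only a sketch of the arithmetic, not kernel code.]

```c
/* On x86-64 (figures assumed here): a 2 MiB huge mapping is covered by
 * a single page-table entry and TLB entry. Once it is split so one page
 * can be remapped WC/UC, the same range needs 512 base-page PTEs plus
 * one extra 4 KiB page-table page (512 entries * 8 bytes each) -- and,
 * per the point above, the split is never undone. */

#define HUGE_PAGE_SIZE (2UL << 20)  /* 2 MiB */
#define BASE_PAGE_SIZE (4UL << 10)  /* 4 KiB */
#define PTE_SIZE 8UL                /* bytes per entry, x86-64 */

/* Number of base-page PTEs covering the range after the split. */
static unsigned long ptes_after_split(void)
{
	return HUGE_PAGE_SIZE / BASE_PAGE_SIZE;  /* 512 */
}

/* Page-table memory permanently added by one split, in bytes. */
static unsigned long split_overhead_bytes(void)
{
	return ptes_after_split() * PTE_SIZE;  /* 4096 */
}
```

So every huge mapping the pool avoids breaking saves a page-table page and 511 TLB entries for that range, which is why keeping WC/UC pages pooled instead of repeatedly converting them matters.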
> In the meantime I think we need to apply this patch, as it is really a fix
> to make the code actually do what the comment and design pretend it does.
I agree.
BTW, maybe it should be split up into the actual fix (removing the
npages assignment) and the NUM_PAGES_TO_ALLOC-related simplification?
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer