[PATCH] drm/ttm: Merge hugepage attr changes in ttm_dma_page_put.

Bas Nieuwenhuizen basni at chromium.org
Thu Jul 26 09:06:39 UTC 2018


On Thu, Jul 26, 2018 at 7:52 AM, Zhang, Jerry (Junwei)
<Jerry.Zhang at amd.com> wrote:
> On 07/26/2018 04:29 AM, Bas Nieuwenhuizen wrote:
>>
>> Every set_pages_array_wb call resulted in cross-core
>> interrupts and TLB flushes. Merge more of them for
>> less overhead.
>>
>> This reduces the time needed to free a 1.6 GiB GTT WC
>> buffer as part of Vulkan CTS from ~2 sec to < 0.25 sec.
>> (Allocation still takes more than 2 sec though)
>>
>> Signed-off-by: Bas Nieuwenhuizen <basni at chromium.org>
>> ---
>>   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 31 ++++++++++++++++++------
>>   1 file changed, 24 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> index 4c659405a008a..9440ba0a55116 100644
>> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
>> @@ -299,6 +299,25 @@ static int set_pages_array_uc(struct page **pages, int addrinarray)
>>   #endif
>>         return 0;
>>   }
>> +
>> +static int ttm_set_page_range_wb(struct page *p, unsigned long numpages)
>> +{
>> +#if IS_ENABLED(CONFIG_AGP)
>> +        unsigned long i;
>> +
>> +        for (i = 0; i < numpages; i++)
>> +                unmap_page_from_agp(p + i);
>> +#endif
>> +       return 0;
>> +}
>> +
>> +#else /* for !CONFIG_X86 */
>> +
>> +static int ttm_set_page_range_wb(struct page *p, unsigned long numpages)
>> +{
>> +       return set_memory_wb((unsigned long)page_address(p), numpages);
>> +}
>> +
>>   #endif /* for !CONFIG_X86 */
>>
>>   static int ttm_set_pages_caching(struct dma_pool *pool,
>> @@ -387,18 +406,16 @@ static void ttm_pool_update_free_locked(struct dma_pool *pool,
>>   static void ttm_dma_page_put(struct dma_pool *pool, struct dma_page *d_page)
>>   {
>>         struct page *page = d_page->p;
>> -       unsigned i, num_pages;
>> +       unsigned num_pages;
>>         int ret;
>>
>>         /* Don't set WB on WB page pool. */
>>         if (!(pool->type & IS_CACHED)) {
>>                 num_pages = pool->size / PAGE_SIZE;
>> -               for (i = 0; i < num_pages; ++i, ++page) {
>> -                       ret = set_pages_array_wb(&page, 1);
>> -                       if (ret) {
>> -                               pr_err("%s: Failed to set %d pages to wb!\n",
>> -                                      pool->dev_name, 1);
>> -                       }
>> +               ret = ttm_set_page_range_wb(page, num_pages);
>
>
> For AGP enabled, set_pages_array_wc() could work like that by passing
> "num_pages" instead of "1".
> In the X86 case, we may use set_pages_array_wb() in arch/x86/mm/pageattr.c.
>
> So, does it work as below?
>
> ret = set_pages_array_wb(page, num_pages);

No, that would not work. Note that we have a contiguous array of page
structs, while set_pages_array_wb() wants an array of pointers to page
structs. We could allocate a temporary array and fill in the pointers,
but that seems unnecessarily inefficient to me, and it probably would
not reduce the amount of code either.
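
To make the type mismatch concrete, here is a rough userspace sketch,
not kernel code. Everything in it is a hypothetical stand-in: struct
page is reduced to a single field, fake_set_pages_array_wb() only
mimics the calling convention of set_pages_array_wb(), and flush_count
models the assumption that each call costs one round of cross-core
interrupts and TLB flushes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct page. */
struct page { int cache_attr; };

enum { ATTR_WC = 0, ATTR_WB = 1 };

/* Assumption: one simulated flush per attribute-change call. */
static unsigned long flush_count;

/* Same shape as set_pages_array_wb(): takes an array of POINTERS. */
static int fake_set_pages_array_wb(struct page **pages, int addrinarray)
{
	int i;

	for (i = 0; i < addrinarray; i++)
		pages[i]->cache_attr = ATTR_WB;
	flush_count++;
	return 0;
}

/* Range variant: first struct page plus a count, like the new helper. */
static int fake_set_page_range_wb(struct page *p, unsigned long numpages)
{
	unsigned long i;

	for (i = 0; i < numpages; i++)
		p[i].cache_attr = ATTR_WB;
	flush_count++;
	return 0;
}

/* Old loop: one call, and therefore one flush, per page. */
static int put_per_page(struct page *page, unsigned long num_pages)
{
	unsigned long i;
	int ret = 0;

	for (i = 0; i < num_pages; ++i, ++page)
		ret |= fake_set_pages_array_wb(&page, 1);
	return ret;
}

/* The alternative: build a temporary pointer array, then call once. */
static int put_with_temp_array(struct page *page, unsigned long num_pages)
{
	struct page **tmp = malloc(num_pages * sizeof(*tmp));
	unsigned long i;
	int ret;

	if (!tmp)
		return -1;
	for (i = 0; i < num_pages; i++)
		tmp[i] = page + i;
	ret = fake_set_pages_array_wb(tmp, (int)num_pages);
	free(tmp);
	return ret;
}
```

Passing "page" directly where "struct page **" is expected is a type
error, which is why the one-line change does not work. In this model,
for 8 pages the old loop makes 8 calls, while both the temporary-array
version and the range version make 1; the range version just avoids
the extra allocation and copy.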

>
> Jerry
>
>
>> +               if (ret) {
>> +                       pr_err("%s: Failed to set %d pages to wb!\n",
>> +                              pool->dev_name, num_pages);
>>                 }
>>         }
>>
>>
>

