[PATCH v11] drm/amdgpu: add drm buddy support to amdgpu
Christian König
christian.koenig at amd.com
Tue Mar 29 11:24:02 UTC 2022
Am 29.03.22 um 13:19 schrieb Arunpravin Paneer Selvam:
> [SNIP]
>>> + pages_left = node->base.num_pages;
>>>
>>> i = 0;
>>> - spin_lock(&mgr->lock);
>>> while (pages_left) {
>>> - uint32_t alignment = tbo->page_alignment;
>>> + if (tbo->page_alignment)
>>> + min_page_size = tbo->page_alignment << PAGE_SHIFT;
>>> + else
>>> + min_page_size = mgr->default_page_size;
>> The handling here looks extremely awkward to me.
>>
>> min_page_size should be determined outside of the loop, based on default_page_size, alignment and contiguous flag.
> I kept min_page_size determine logic inside the loop for cases 2GiB+
> requirements, Since now we are round up the size to the required
> alignment, I modified the min_page_size determine logic outside of the
> loop in v12. Please review.
Ah! So do we only have the loop so that each allocation isn't bigger
than 2GiB? If yes couldn't we instead add a max_alloc_size or something
similar?
BTW: I strongly suggest that you rename min_page_size to min_alloc_size.
Otherwise somebody could think that those numbers are in pages and not
bytes.
>> Then why do you drop the lock and grab it again inside the loop? And what is "i" actually good for?
> modified the lock/unlock placement in v12.
>
> "i" is to track when there is 2GiB+ contiguous allocation request, first
> we allocate 2GiB (due to SG table limit) continuously and the remaining
> pages in the next iteration, hence this request can't be a continuous.
> To set the placement flag we make use of "i" value. In our case "i"
> value becomes 2 and we don't set the below flag.
> node->base.placement |= TTM_PL_FLAG_CONTIGUOUS;
>
> If we don't get such requests, I will remove "i".
I'm not sure if that works.
As far as I can see drm_buddy_alloc_blocks() can allocate multiple
blocks at the same time, but i is only incremented when we loop.
So what you should do instead is to check if node->blocks just contain
exactly one element after the allocation but before the trim.
>>> +
>>> + /* Limit maximum size to 2GB due to SG table limitations */
>>> + pages = min(pages_left, 2UL << (30 - PAGE_SHIFT));
>>>
>>> if (pages >= pages_per_node)
>>> - alignment = pages_per_node;
>>> -
>>> - r = drm_mm_insert_node_in_range(mm, &node->mm_nodes[i], pages,
>>> - alignment, 0, place->fpfn,
>>> - lpfn, mode);
>>> - if (unlikely(r)) {
>>> - if (pages > pages_per_node) {
>>> - if (is_power_of_2(pages))
>>> - pages = pages / 2;
>>> - else
>>> - pages = rounddown_pow_of_two(pages);
>>> - continue;
>>> - }
>>> - goto error_free;
>>> + min_page_size = pages_per_node << PAGE_SHIFT;
>>> +
>>> + if (!is_contiguous && !IS_ALIGNED(pages, min_page_size >> PAGE_SHIFT))
>>> + is_contiguous = 1;
>>> +
>>> + if (is_contiguous) {
>>> + pages = roundup_pow_of_two(pages);
>>> + min_page_size = pages << PAGE_SHIFT;
>>> +
>>> + if (pages > lpfn)
>>> + lpfn = pages;
>>> }
>>>
>>> - vis_usage += amdgpu_vram_mgr_vis_size(adev, &node->mm_nodes[i]);
>>> - amdgpu_vram_mgr_virt_start(&node->base, &node->mm_nodes[i]);
>>> - pages_left -= pages;
>>> + BUG_ON(min_page_size < mm->chunk_size);
>>> +
>>> + mutex_lock(&mgr->lock);
>>> + r = drm_buddy_alloc_blocks(mm, (u64)place->fpfn << PAGE_SHIFT,
>>> + (u64)lpfn << PAGE_SHIFT,
>>> + (u64)pages << PAGE_SHIFT,
>>> + min_page_size,
>>> + &node->blocks,
>>> + node->flags);
>>> + mutex_unlock(&mgr->lock);
>>> + if (unlikely(r))
>>> + goto error_free_blocks;
>>> +
>>> ++i;
>>>
>>> if (pages > pages_left)
>>> - pages = pages_left;
>>> + pages_left = 0;
>>> + else
>>> + pages_left -= pages;
>>> }
>>> - spin_unlock(&mgr->lock);
>>>
>>> - if (i == 1)
>>> + /* Free unused pages for contiguous allocation */
>>> + if (is_contiguous) {
>> Well that looks really odd, why is trimming not part of
>> drm_buddy_alloc_blocks() ?
> we didn't place trim function part of drm_buddy_alloc_blocks since we
> thought this function can be a generic one and it can be used by any
> other application as well. For example, now we are using it for trimming
> the last block in case of size non-alignment with min_page_size.
Good argument. Another thing I just realized is that we probably want to
double check if we only allocated one block before the trim.
Thanks,
Christian.
More information about the amd-gfx
mailing list