[PATCH 01/11] drm/amdgpu: try allocating VRAM as power of two
Christian König
ckoenig.leichtzumerken at gmail.com
Tue Sep 11 06:49:44 UTC 2018
Yeah well the whole patch set depends on that :)
Otherwise we don't get pages larger than 2MB for the L1 on Vega10.
But another question: why do you want to clear VRAM on allocation? We
fully support allocating VRAM without clearing it.
Regards,
Christian.
Am 11.09.2018 um 02:08 schrieb Felix Kuehling:
> This looks good. But it complicates something I've been looking at:
> Remembering which process drm_mm_nodes last belonged to, so that they
> don't need to be cleared next time they are allocated by the same
> process. Having most nodes the same size (vram_page_split pages) would
> make this very easy and efficient for the most common cases (large
> allocations without any exotic address limitations or alignment
> requirements).
>
> Does anything else in this patch series depend on this optimization?
>
> Regards,
> Felix
>
>
> On 2018-09-09 02:03 PM, Christian König wrote:
>> Try to allocate VRAM in power of two sizes and only fallback to vram
>> split sizes if that fails.
>>
>> Signed-off-by: Christian König <christian.koenig at amd.com>
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 52 +++++++++++++++++++++-------
>> 1 file changed, 40 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> index 9cfa8a9ada92..3f9d5d00c9b3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>> @@ -124,6 +124,28 @@ u64 amdgpu_vram_mgr_bo_visible_size(struct amdgpu_bo *bo)
>> return usage;
>> }
>>
>> +/**
>> + * amdgpu_vram_mgr_virt_start - update virtual start address
>> + *
>> + * @mem: ttm_mem_reg to update
>> + * @node: just allocated node
>> + *
>> + * Calculate a virtual BO start address to easily check if everything is CPU
>> + * accessible.
>> + */
>> +static void amdgpu_vram_mgr_virt_start(struct ttm_mem_reg *mem,
>> + struct drm_mm_node *node)
>> +{
>> + unsigned long start;
>> +
>> + start = node->start + node->size;
>> + if (start > mem->num_pages)
>> + start -= mem->num_pages;
>> + else
>> + start = 0;
>> + mem->start = max(mem->start, start);
>> +}
>> +
>> /**
>> * amdgpu_vram_mgr_new - allocate new ranges
>> *
>> @@ -176,10 +198,25 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
>> pages_left = mem->num_pages;
>>
>> spin_lock(&mgr->lock);
>> - for (i = 0; i < num_nodes; ++i) {
>> + for (i = 0; pages_left >= pages_per_node; ++i) {
>> + unsigned long pages = rounddown_pow_of_two(pages_left);
>> +
>> + r = drm_mm_insert_node_in_range(mm, &nodes[i], pages,
>> + pages_per_node, 0,
>> + place->fpfn, lpfn,
>> + mode);
>> + if (unlikely(r))
>> + break;
>> +
>> + usage += nodes[i].size << PAGE_SHIFT;
>> + vis_usage += amdgpu_vram_mgr_vis_size(adev, &nodes[i]);
>> + amdgpu_vram_mgr_virt_start(mem, &nodes[i]);
>> + pages_left -= pages;
>> + }
>> +
>> + for (; pages_left; ++i) {
>> unsigned long pages = min(pages_left, pages_per_node);
>> uint32_t alignment = mem->page_alignment;
>> - unsigned long start;
>>
>> if (pages == pages_per_node)
>> alignment = pages_per_node;
>> @@ -193,16 +230,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
>>
>> usage += nodes[i].size << PAGE_SHIFT;
>> vis_usage += amdgpu_vram_mgr_vis_size(adev, &nodes[i]);
>> -
>> - /* Calculate a virtual BO start address to easily check if
>> - * everything is CPU accessible.
>> - */
>> - start = nodes[i].start + nodes[i].size;
>> - if (start > mem->num_pages)
>> - start -= mem->num_pages;
>> - else
>> - start = 0;
>> - mem->start = max(mem->start, start);
>> + amdgpu_vram_mgr_virt_start(mem, &nodes[i]);
>> pages_left -= pages;
>> }
>> spin_unlock(&mgr->lock);