[Intel-gfx] [RFC PATCH 1/1] drm/mm: add ig_frag selftest
Nirmoy
nirmodas at amd.com
Wed Jun 3 09:10:29 UTC 2020
On 6/2/20 4:25 PM, Christian König wrote:
> Am 02.06.20 um 16:13 schrieb Nirmoy:
>> Hi Christian,
>>
>> On 6/2/20 2:47 PM, Christian König wrote:
>>> Nirmoy please keep in mind that your current implementation doesn't
>>> fully solve the issue the test case is exercising.
>>>
>>> In other words what you have implement is fast skipping of
>>> fragmented address space for bottom-up and top-down.
>>>
>>> But what this test here exercises is the fast skipping of aligned
>>> allocations. You should probably adjust the test case a bit.
>>
>>
>> Allocations with size=4k and aign = 8k is known to introduce
>> fragmentation,
>
> Yes, but this fragmentation can't be avoided with what we already
> implemented. For this we would need the extension with the alignment I
> already explained.
>
>> do you mean I should only test bottom-up and top-down
>>
>> for now ?
>
> Yes and no.
>
> What we need to test is the following:
>
> 1. Make tons of allocations with size=4k and align=0.
>
> 2. Free every other of those allocations.
>
> 3. Make tons of allocations with size=8k and align=0.
>
> Previously bottom-up and top-down would have checked all the holes
> created in step #2.
>
> With your change they can immediately see that this doesn't make sense
> and shortcut to the leftmost/rightmost leaf node in the tree with the
> large free block.
>
> That we can handle the alignment as well is the next step of that.
Thanks Christian for the detailed explanation. I have modified this as
you suggested, will send in few minutes.
Regards,
Nirmoy
>
> Regards,
> Christian.
>
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>>
>>>
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 29.05.20 um 23:01 schrieb Nirmoy:
>>>>
>>>> On 5/29/20 5:52 PM, Chris Wilson wrote:
>>>>> Quoting Nirmoy (2020-05-29 16:40:53)
>>>>>> This works correctly most of the times but sometimes
>>>>
>>>>
>>>> I have to take my word back. In another machine, 20k insertions in
>>>>
>>>> best mode takes 6-9 times more than 10k insertions, all most all
>>>> the time.
>>>>
>>>> evict, bottom-up and top-down modes remains in 2-5 times range.
>>>>
>>>>
>>>> If I reduce the insertions to 1k and 2k then scaling factor for
>>>> best mode stays below 4 most of the time.
>>>>
>>>> evict, bottom-up and top-down modes remains in 2-3 times range.
>>>>
>>>>
>>>> I wonder if it makes sense to test with only 1k and 2k insertions
>>>> and tolerate more than error if the mode == best.
>>>>
>>>> Regards,
>>>>
>>>> Nirmoy
>>>>
>>>>>>
>>>>>> 20k insertions can take more than 8 times of 10k insertion time.
>>>>> The pressure is on to improve then :)
>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Nirmoy
>>>>>>
>>>>>> On 5/29/20 6:33 PM, Nirmoy Das wrote:
>>>>>>> This patch introduces fragmentation in the address range
>>>>>>> and measures time taken by 10k and 20k insertions. ig_frag()
>>>>>>> will fail if time taken by 20k insertions takes more than 4 times
>>>>>>> of 10k insertions as we know that insertions scale quadratically.
>>>>>>> Also tolerate 10% error because of kernel scheduler's jitters.
>>>>>>>
>>>>>>> Output:
>>>>>>> <snip>
>>>>>>> [ 8092.653518] drm_mm: Testing DRM range manger (struct drm_mm),
>>>>>>> with random_seed=0x9bfb4117 max_iterations=8192 max_prime=128
>>>>>>> [ 8092.653520] drm_mm: igt_sanitycheck - ok!
>>>>>>> [ 8092.653525] igt_debug 0x0000000000000000-0x0000000000000200:
>>>>>>> 512: free
>>>>>>> [ 8092.653526] igt_debug 0x0000000000000200-0x0000000000000600:
>>>>>>> 1024: used
>>>>>>> [ 8092.653527] igt_debug 0x0000000000000600-0x0000000000000a00:
>>>>>>> 1024: free
>>>>>>> [ 8092.653528] igt_debug 0x0000000000000a00-0x0000000000000e00:
>>>>>>> 1024: used
>>>>>>> [ 8092.653529] igt_debug 0x0000000000000e00-0x0000000000001000:
>>>>>>> 512: free
>>>>>>> [ 8092.653529] igt_debug total: 4096, used 2048 free 2048
>>>>>>> [ 8112.569813] drm_mm: best fragmented insert of 10000 and 20000
>>>>>>> insertions took 504 and 1996 msecs
>>>>>>> [ 8112.723254] drm_mm: bottom-up fragmented insert of 10000 and
>>>>>>> 20000 insertions took 44 and 108 msecs
>>>>>>> [ 8112.813212] drm_mm: top-down fragmented insert of 10000 and
>>>>>>> 20000 insertions took 40 and 44 msecs
>>>>>>> [ 8112.847733] drm_mm: evict fragmented insert of 10000 and
>>>>>>> 20000 insertions took 8 and 20 msecs
>>>>>>> <snip>
>>>>>>>
>>>>>>> Signed-off-by: Nirmoy Das <nirmoy.das at amd.com>
>>>>>>> ---
>>>>>>> drivers/gpu/drm/selftests/drm_mm_selftests.h | 1 +
>>>>>>> drivers/gpu/drm/selftests/test-drm_mm.c | 73
>>>>>>> ++++++++++++++++++++
>>>>>>> 2 files changed, 74 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h
>>>>>>> b/drivers/gpu/drm/selftests/drm_mm_selftests.h
>>>>>>> index 6b943ea1c57d..8c87c964176b 100644
>>>>>>> --- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
>>>>>>> +++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
>>>>>>> @@ -14,6 +14,7 @@ selftest(insert, igt_insert)
>>>>>>> selftest(replace, igt_replace)
>>>>>>> selftest(insert_range, igt_insert_range)
>>>>>>> selftest(align, igt_align)
>>>>>>> +selftest(frag, igt_frag)
>>>>>>> selftest(align32, igt_align32)
>>>>>>> selftest(align64, igt_align64)
>>>>>>> selftest(evict, igt_evict)
>>>>>>> diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c
>>>>>>> b/drivers/gpu/drm/selftests/test-drm_mm.c
>>>>>>> index 9aabe82dcd3a..05d8f3659b4d 100644
>>>>>>> --- a/drivers/gpu/drm/selftests/test-drm_mm.c
>>>>>>> +++ b/drivers/gpu/drm/selftests/test-drm_mm.c
>>>>>>> @@ -1033,6 +1033,79 @@ static int igt_insert_range(void *ignored)
>>>>>>> return 0;
>>>>>>> }
>>>>>>> +static int get_insert_time(unsigned int num_insert,
>>>>>>> + const struct insert_mode *mode)
>>>>>>> +{
>>>>>>> + struct drm_mm mm;
>>>>>>> + struct drm_mm_node *nodes, *node, *next;
>>>>>>> + unsigned int size = 4096, align = 8192;
>>>>>>> + unsigned long start;
>>>>>>> + unsigned int i;
>>>>>>> + int ret = -EINVAL;
>>>>>>> +
>>>>>>> + drm_mm_init(&mm, 1, U64_MAX - 2);
>>>>>>> + nodes = vzalloc(array_size(num_insert, sizeof(*nodes)));
>>>>>>> + if (!nodes)
>>>>>>> + goto err;
>>>>>>> +
>>>>>>> + start = jiffies;
>>>>> Use ktime_t start = ktime_now();
>>>>>
>>>>>>> + for (i = 0; i < num_insert; i++) {
>>>>>>> + if (!expect_insert(&mm, &nodes[i], size, align, i,
>>>>>>> mode)) {
>>>>>>> + pr_err("%s insert failed\n", mode->name);
>>>>>>> + goto out;
>>>>>>> + }
>>>>>>> + }
>>>>>>> +
>>>>>>> + ret = jiffies_to_msecs(jiffies - start);
>>>>> ret = ktime_sub(ktime_now(), start);
>>>>>
>>>>> The downside to using ktime is remembering it is s64 and so
>>>>> requires care
>>>>> and attention in doing math.
>>>>>
>>>>>>> +out:
>>>>>>> + drm_mm_for_each_node_safe(node, next, &mm)
>>>>>>> + drm_mm_remove_node(node);
>>>>>>> + drm_mm_takedown(&mm);
>>>>>>> + vfree(nodes);
>>>>>>> +err:
>>>>>>> + return ret;
>>>>>>> +
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int igt_frag(void *ignored)
>>>>>>> +{
>>>>>>> + const struct insert_mode *mode;
>>>>>>> + unsigned int insert_time1, insert_time2;
>>>>>>> + unsigned int insert_size = 10000;
>>>>>>> + unsigned int scale_factor = 4;
>>>>>>> + /* tolerate 10% excess insertion duration */
>>>>>>> + unsigned int error_factor = 110;
>>>>>>> + int ret = -EINVAL;
>>>>>>> +
>>>>>>> + for (mode = insert_modes; mode->name; mode++) {
>>>>>>> + unsigned int expected_time;
>>>>>>> +
>>>>>>> + insert_time1 = get_insert_time(insert_size, mode);
>>>>>>> + if (insert_time1 < 0)
>>>>>>> + goto err;
>>>>> Ah, can you propagate the actual error. I see you are returning
>>>>> EINVAL
>>>>> for ENOMEM errors. Just wait until it hits and you have to debug
>>>>> why :)
>>>>>
>>>>>>> + insert_time2 = get_insert_time((insert_size * 2),
>>>>>>> mode);
>>>>>>> + if (insert_time2 < 0)
>>>>>>> + goto err;
>>>>>>> +
>>>>>>> + expected_time = (scale_factor * insert_time1 *
>>>>>>> + error_factor)/100;
>>>>>>> + if (insert_time2 > expected_time) {
>>>>>>> + pr_err("%s fragmented insert took more %u
>>>>>>> msecs\n",
>>>>>>> + mode->name, insert_time2 -
>>>>>>> expected_time);
>>>>>>> + goto err;
>>>>>>> + }
>>>>>>> +
>>>>>>> + pr_info("%s fragmented insert of %u and %u
>>>>>>> insertions took %u and %u msecs\n",
>>>>>>> + mode->name, insert_size, insert_size * 2,
>>>>>>> insert_time1,
>>>>>>> + insert_time2);
>>>>> Put the info first before the error. We always want the full details,
>>>>> with the error message explaining why it's unhappy.
>>>>> -Chris
>>>>> _______________________________________________
>>>>> dri-devel mailing list
>>>>> dri-devel at lists.freedesktop.org
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fdri-devel&data=02%7C01%7Cnirmoy.das%40amd.com%7C5c7df129b9cf44b3ae4008d803e84445%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637263643415833767&sdata=PrCQse4nhN0ZITT9OniuHhF7A5uxJD6ehk0PMjm7WMU%3D&reserved=0
>>>>>
>>>
>
More information about the Intel-gfx
mailing list