[Intel-gfx] [PATCH] drm/i915/selftests: fixup igt_shrink_thp

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Sep 6 12:53:38 UTC 2021


On 06/09/2021 13:30, Matthew Auld wrote:
> On 06/09/2021 13:19, Tvrtko Ursulin wrote:
>>
>> On 06/09/2021 10:17, Matthew Auld wrote:
>>> Since the object might still be active here, the shrink_all will simply
>>> ignore it, which blows up in the test, since the pages will still be
>>> there. Currently THP is disabled which should result in the test being
>>> skipped, but if we ever re-enable THP we might start seeing the failure.
>>> Fix this by forcing I915_SHRINK_ACTIVE.
>>>
>>> v2: Some machines in the shard runs don't seem to have any available
>>> swap when running this test. Try to handle this.
>>>
>>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com> #v1
>>> ---
>>>   .../gpu/drm/i915/gem/selftests/huge_pages.c   | 31 ++++++++++++++-----
>>>   1 file changed, 24 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> index a094f3ce1a90..46ea1997c114 100644
>>> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
>>> @@ -1519,6 +1519,7 @@ static int igt_shrink_thp(void *arg)
>>>       struct i915_vma *vma;
>>>       unsigned int flags = PIN_USER;
>>>       unsigned int n;
>>> +    bool should_swap;
>>>       int err = 0;
>>>       /*
>>> @@ -1567,23 +1568,39 @@ static int igt_shrink_thp(void *arg)
>>>               break;
>>>       }
>>>       i915_gem_context_unlock_engines(ctx);
>>> +    /*
>>> +     * Nuke everything *before* we unpin the pages so we can be reasonably
>>> +     * sure that when later checking get_nr_swap_pages() that some random
>>> +     * leftover object doesn't steal the remaining swap space.
>>> +     */
>>> +    i915_gem_shrink(NULL, i915, -1UL, NULL,
>>> +            I915_SHRINK_BOUND |
>>> +            I915_SHRINK_UNBOUND |
>>> +            I915_SHRINK_ACTIVE);
>>>       i915_vma_unpin(vma);
>>>       if (err)
>>>           goto out_put;
>>> +
>>>       /*
>>> -     * Now that the pages are *unpinned* shrink-all should invoke
>>> -     * shmem to truncate our pages.
>>> +     * Now that the pages are *unpinned* shrinking should invoke
>>> +     * shmem to truncate our pages, if we have available swap.
>>>        */
>>> -    i915_gem_shrink_all(i915);
>>> -    if (i915_gem_object_has_pages(obj)) {
>>> -        pr_err("shrink-all didn't truncate the pages\n");
>>> +    should_swap = get_nr_swap_pages() > 0;
>>> +    i915_gem_shrink(NULL, i915, -1UL, NULL,
>>> +            I915_SHRINK_BOUND |
>>> +            I915_SHRINK_UNBOUND |
>>> +            I915_SHRINK_ACTIVE);
>>> +    if (should_swap == i915_gem_object_has_pages(obj)) {
>>
>> Hmm, is there any value in running the test if there is no swap (given
>> the objects used by the test are "willneed"), or could you simplify
>> and just do an early skip?
> 
> Maybe. My thinking was that this adds some coverage if, say, the device
> is not configured with swap, i.e. assert that the pages don't magically
> disappear, and that their contents still persist etc.
> 
> Happy to make it skip instead though?

So that reduces it to a basic shrinker test in that case. Hm.. do you
know if we already have a non-THP-specific test for that somewhere in
selftests (I can't spot any), or just in IGT?

If we indeed don't have it in selftests, then I guess the question is
whether it is warranted to "hide" such a basic test in the THP "drawer",
or whether adding a generic shrinker test should be considered instead.
(And one could then follow with the question of whether a basic generic
test should have a THP sub-test.)

It's hard to say where the boundary for selftests-vs-IGT coverage should
be in this case. I mean, would it be warranted to add such a generic
shrinker selftest? It is mostly testable from userspace, but the kernel
can do a few more introspections and sanity checks, at the cost of
growing kernel code.
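
To make the two options concrete, here is a minimal userspace sketch of
the decision logic only. It is not kernel code: the function names and
the enum are made up for illustration, and the booleans stand in for
get_nr_swap_pages() > 0 and i915_gem_object_has_pages(obj).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result codes for this model; not part of i915. */
enum model_result { MODEL_PASS = 0, MODEL_SKIP = 1 };

/*
 * Option A (what the patch does): after shrinking, the object must have
 * lost its pages exactly when swap was available. With no swap, the
 * pages are expected to still be there.
 */
int check_after_shrink(bool should_swap, bool has_pages)
{
	if (should_swap == has_pages)
		return -EINVAL;	/* unexpected pages mismatch */
	return MODEL_PASS;
}

/*
 * Option B (early skip): without swap there is nothing to verify, so
 * report a skip; otherwise the pages must be gone.
 */
int check_after_shrink_or_skip(bool should_swap, bool has_pages)
{
	if (!should_swap)
		return MODEL_SKIP;
	return has_pages ? -EINVAL : MODEL_PASS;
}
```

The only difference is the no-swap case: option A still fails if the
pages vanished, while option B gives up that bit of coverage in exchange
for a simpler test.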

Regards,

Tvrtko

> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>> +        pr_err("unexpected pages mismatch, should_swap=%s\n",
>>> +               yesno(should_swap));
>>>           err = -EINVAL;
>>>           goto out_put;
>>>       }
>>> -    if (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys) {
>>> -        pr_err("residual page-size bits left\n");
>>> +    if (should_swap == (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys)) {
>>> +        pr_err("unexpected residual page-size bits, should_swap=%s\n",
>>> +               yesno(should_swap));
>>>           err = -EINVAL;
>>>           goto out_put;
>>>       }
>>>