[PATCH v2 1/8] drm: Add dummy page per device or GEM object

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Mon Nov 9 20:53:00 UTC 2020


On 6/22/20 1:50 PM, Daniel Vetter wrote:
> On Mon, Jun 22, 2020 at 7:45 PM Christian König
> <christian.koenig at amd.com> wrote:
>> Am 22.06.20 um 16:32 schrieb Andrey Grodzovsky:
>>> On 6/22/20 9:18 AM, Christian König wrote:
>>>> Am 21.06.20 um 08:03 schrieb Andrey Grodzovsky:
>>>>> Will be used to reroute page faults of CPU-mapped BOs once the
>>>>> device is removed.
>>>>>
>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/drm_file.c  |  8 ++++++++
>>>>>    drivers/gpu/drm/drm_prime.c | 10 ++++++++++
>>>>>    include/drm/drm_file.h      |  2 ++
>>>>>    include/drm/drm_gem.h       |  2 ++
>>>>>    4 files changed, 22 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>>>> index c4c704e..67c0770 100644
>>>>> --- a/drivers/gpu/drm/drm_file.c
>>>>> +++ b/drivers/gpu/drm/drm_file.c
>>>>> @@ -188,6 +188,12 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
>>>>>          goto out_prime_destroy;
>>>>>      }
>>>>>
>>>>> +    file->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>>>> +    if (!file->dummy_page) {
>>>>> +        ret = -ENOMEM;
>>>>> +        goto out_prime_destroy;
>>>>> +    }
>>>>> +
>>>>>      return file;
>>>>>
>>>>>  out_prime_destroy:
>>>>> @@ -284,6 +290,8 @@ void drm_file_free(struct drm_file *file)
>>>>>      if (dev->driver->postclose)
>>>>>          dev->driver->postclose(dev, file);
>>>>>
>>>>> +    __free_page(file->dummy_page);
>>>>> +
>>>>>      drm_prime_destroy_file_private(&file->prime);
>>>>>
>>>>>      WARN_ON(!list_empty(&file->event_list));
>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
>>>>> index 1de2cde..c482e9c 100644
>>>>> --- a/drivers/gpu/drm/drm_prime.c
>>>>> +++ b/drivers/gpu/drm/drm_prime.c
>>>>> @@ -335,6 +335,13 @@ int drm_gem_prime_fd_to_handle(struct drm_device *dev,
>>>>>
>>>>>      ret = drm_prime_add_buf_handle(&file_priv->prime,
>>>>>              dma_buf, *handle);
>>>>> +
>>>>> +    if (!ret) {
>>>>> +        obj->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
>>>>> +        if (!obj->dummy_page)
>>>>> +            ret = -ENOMEM;
>>>>> +    }
>>>>> +
>>>> While the per-file case still looks acceptable, this is a clear NAK
>>>> since it will massively increase the memory needed for a
>>>> prime-exported object.
>>>>
>>>> I think this is quite overkill in the first place, and for the
>>>> hot-unplug case we can just use the global dummy page as well.
>>>>
>>>> Christian.
>>>
>>> A global dummy page is good for read access, but what do you do on
>>> write access? My first approach was indeed to map the global dummy
>>> page read-only at first and clear VM_SHARED in vma->vm_flags,
>>> assuming that this would trigger the copy-on-write flow in core mm
>>> (https://elixir.bootlin.com/linux/v5.7-rc7/source/mm/memory.c#L3977)
>>> on the next page fault to the same address triggered by a write
>>> access. But then I realized a new COW page will be allocated for each
>>> such mapping, which is much more wasteful than having a dedicated
>>> page per GEM object.
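
A minimal sketch of that first approach, for reference (the handler and
the global_dummy_page variable are made up for illustration, and the
locking details are left out):

    static vm_fault_t dummy_page_fault(struct vm_fault *vmf)
    {
        /*
         * Hand the single, zeroed global dummy page back to core mm.
         * With the VMA made private (!VM_SHARED) and the page mapped
         * read-only, a subsequent write fault goes through the core mm
         * copy-on-write path (do_wp_page()) and the writer gets its
         * own private copy of the page.
         */
        get_page(global_dummy_page);    /* reference consumed by core mm */
        vmf->page = global_dummy_page;
        return 0;
    }

Every writing mapping then ends up with its own COW copy, which is the
overhead weighed against a dedicated per-object page above.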
>> Yeah, but those are only very small corner cases. What we need to
>> prevent is increasing the memory usage too much during normal operation.
>>
>> Using memory during the unplug is completely unproblematic because we
>> just released quite a bunch of it by releasing all those system memory
>> buffers.
>>
>> And I'm pretty sure that COWed pages are correctly accounted towards the
>> used memory of a process.
>>
>> So I think if that approach works as intended and the COW pages are
>> released again on unmapping, it would be the perfect solution to the
>> problem.
>>
>> Daniel what do you think?
> If COW works, sure, that sounds reasonable. And if we can make sure we
> managed to drop all the system allocations (otherwise suddenly 2x
> memory usage, worst case). But I have no idea whether we can
> retroshoehorn that into an established vma; you might have fun stuff
> like a mkwrite handler there (which I thought is the COW handler
> thing, but really no idea).


Can you clarify your concern here? I see no DRM driver besides vmwgfx
that installs a handler for vm_operations_struct.page_mkwrite, and in
any case, since I will be turning off the VM_SHARED flag for the
faulting vm_area_struct, making it a COW mapping, page_mkwrite will not
be called on any subsequent vm fault.
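
A minimal sketch of that flag change (purely illustrative; where exactly
it happens, and that mmap_lock is held for write at that point, are
assumptions here):

    /*
     * Downgrade the mapping to a private one. With VM_SHARED cleared,
     * a write fault takes the core mm copy-on-write path, and
     * vm_ops->page_mkwrite is never consulted for this VMA.
     * Caller is assumed to hold mmap_lock for write.
     */
    vma->vm_flags &= ~VM_SHARED;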

Andrey


>
> If we need to massively change stuff, then I think an rw dummy page,
> allocated on the first fault after hotunplug (maybe just make it one
> per object, that's simplest), seems like the much safer option. Much
> less code that can go wrong.
> -Daniel
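
Roughly what that safer option could look like (a sketch only: the
handler name is hypothetical, serialization of concurrent faults is
omitted, and the VMA is assumed to be set up so that vmf_insert_page()
is permitted):

    static vm_fault_t gem_dummy_fault(struct vm_fault *vmf)
    {
        struct drm_gem_object *obj = vmf->vma->vm_private_data;

        /*
         * Allocate the per-object dummy page lazily, on the first
         * fault after the device is gone, rather than at object
         * creation time.
         */
        if (!obj->dummy_page) {
            obj->dummy_page = alloc_page(GFP_KERNEL | __GFP_ZERO);
            if (!obj->dummy_page)
                return VM_FAULT_OOM;
        }

        /*
         * Map the page read-write: all faults on the object land on
         * the same page, so accesses after unplug hit the dummy page
         * instead of the vanished device memory.
         */
        return vmf_insert_page(vmf->vma, vmf->address, obj->dummy_page);
    }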
>
>> Regards,
>> Christian.
>>
>>> We can indeed optimize by allocating this dummy page on the first
>>> page fault after device disconnect instead of at GEM object creation.
>>>
>>> Andrey
>>>
>>>
>>>>>      mutex_unlock(&file_priv->prime.lock);
>>>>>      if (ret)
>>>>>          goto fail;
>>>>> @@ -1006,6 +1013,9 @@ void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg)
>>>>>          dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
>>>>>      dma_buf = attach->dmabuf;
>>>>>      dma_buf_detach(attach->dmabuf, attach);
>>>>> +
>>>>> +    __free_page(obj->dummy_page);
>>>>> +
>>>>>      /* remove the reference */
>>>>>      dma_buf_put(dma_buf);
>>>>>  }
>>>>> diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
>>>>> index 19df802..349a658 100644
>>>>> --- a/include/drm/drm_file.h
>>>>> +++ b/include/drm/drm_file.h
>>>>> @@ -335,6 +335,8 @@ struct drm_file {
>>>>>       */
>>>>>      struct drm_prime_file_private prime;
>>>>>
>>>>> +    struct page *dummy_page;
>>>>> +
>>>>>      /* private: */
>>>>>  #if IS_ENABLED(CONFIG_DRM_LEGACY)
>>>>>      unsigned long lock_count; /* DRI1 legacy lock count */
>>>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
>>>>> index 0b37506..47460d1 100644
>>>>> --- a/include/drm/drm_gem.h
>>>>> +++ b/include/drm/drm_gem.h
>>>>> @@ -310,6 +310,8 @@ struct drm_gem_object {
>>>>>       *
>>>>>       */
>>>>>      const struct drm_gem_object_funcs *funcs;
>>>>> +
>>>>> +    struct page *dummy_page;
>>>>>  };
>>>>>
>>>>>  /**
>

