[PATCH v3 01/12] drm: Add dummy page per device or GEM object

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Mon Jan 11 18:31:00 UTC 2021


On 1/11/21 12:41 PM, Andrey Grodzovsky wrote:
>
> On 1/11/21 11:15 AM, Daniel Vetter wrote:
>> On Mon, Jan 11, 2021 at 05:13:56PM +0100, Daniel Vetter wrote:
>>> On Fri, Jan 08, 2021 at 04:49:55PM +0000, Grodzovsky, Andrey wrote:
>>>> Ok, then I guess I will proceed with the dummy pages list implementation.
>>>>
>>>> Andrey
>>>>
>>>> ________________________________
>>>> From: Koenig, Christian <Christian.Koenig at amd.com>
>>>> Sent: 08 January 2021 09:52
>>>> To: Grodzovsky, Andrey <Andrey.Grodzovsky at amd.com>; Daniel Vetter 
>>>> <daniel at ffwll.ch>
>>>> Cc: amd-gfx at lists.freedesktop.org <amd-gfx at lists.freedesktop.org>; 
>>>> dri-devel at lists.freedesktop.org <dri-devel at lists.freedesktop.org>; 
>>>> daniel.vetter at ffwll.ch <daniel.vetter at ffwll.ch>; robh at kernel.org 
>>>> <robh at kernel.org>; l.stach at pengutronix.de <l.stach at pengutronix.de>; 
>>>> yuq825 at gmail.com <yuq825 at gmail.com>; eric at anholt.net <eric at anholt.net>; 
>>>> Deucher, Alexander <Alexander.Deucher at amd.com>; gregkh at linuxfoundation.org 
>>>> <gregkh at linuxfoundation.org>; ppaalanen at gmail.com <ppaalanen at gmail.com>; 
>>>> Wentland, Harry <Harry.Wentland at amd.com>
>>>> Subject: Re: [PATCH v3 01/12] drm: Add dummy page per device or GEM object
>>>>
>>>> Mhm, I'm not aware of any leftover pointer between TTM and GEM, and we
>>>> worked quite hard on reducing the size of the amdgpu_bo, so another
>>>> extra pointer just for that corner case would suck quite a bit.
>>> We have a ton of other pointers in struct amdgpu_bo (or any of its lower
>>> things) which are fairly single-use, so I'm really not seeing much point
>>> in making this a special case. It also means the lifetime management
>>> becomes a bit iffy, since we can't throw away the dummy page when the last
>>> reference to the bo is released (since we don't track it there), but only
>>> when the last reference to the device is released. Potentially this means a
>>> pile of dangling pages hanging around for too long.
>> Also if you really, really, really want to have this list, please don't
>> reinvent it since we have it already. drmm_ is exactly meant for resources
>> that should be freed when the final drm_device reference disappears.
>> -Daniel
>
>
> Can you elaborate? We still need to actually implement the list, but you want
> me to use drmm_add_action for its destruction instead of doing it explicitly
> (like I'm already doing from ttm_bo_device_release)?
>
> Andrey


Oh, I get it, I think: you want me to allocate each page using drmm_kzalloc
so that when the drm_dev dies it will be freed on its own.
Great idea, and it makes my implementation much less cumbersome.
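
Something like this minimal sketch is what I have in mind. Note that below I
combine alloc_page() with a drmm_add_action() release hook rather than calling
drmm_kzalloc() directly, since the fault path needs a struct page to map; the
function names and the assumption that vma->vm_private_data points at the GEM
object are placeholders for illustration, not the actual patch:

#include <linux/gfp.h>
#include <linux/mm.h>

#include <drm/drm_device.h>
#include <drm/drm_gem.h>
#include <drm/drm_managed.h>

/* drmm release action: runs when the last drm_device reference is
 * dropped, so every dummy page is reaped together with the device. */
static void dummy_page_release(struct drm_device *dev, void *res)
{
        __free_page((struct page *)res);
}

/* Hypothetical fault handler for a BO whose device was unplugged.
 * Assumes a VM_PFNMAP mapping with the GEM object in vm_private_data. */
static vm_fault_t dummy_page_fault(struct vm_fault *vmf)
{
        struct vm_area_struct *vma = vmf->vma;
        struct drm_gem_object *obj = vma->vm_private_data;
        struct page *page;
        unsigned long addr;
        vm_fault_t ret = VM_FAULT_NOPAGE;

        page = alloc_page(GFP_KERNEL | __GFP_ZERO);
        if (!page)
                return VM_FAULT_OOM;

        /* Hand the page to drmm so there is no list to maintain and no
         * explicit cleanup path on our side. */
        if (drmm_add_action(obj->dev, dummy_page_release, page)) {
                __free_page(page);
                return VM_FAULT_OOM;
        }

        /* Prefault the whole VA range onto this single dummy page so we
         * don't allocate a fresh page for every faulting address within
         * the same VMA. */
        for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
                ret = vmf_insert_pfn(vma, addr, page_to_pfn(page));
                if (ret & VM_FAULT_ERROR)
                        break;
        }

        return ret;
}

This also means there is nothing left for me to destroy explicitly from
ttm_bo_device_release; the pages' lifetime is tied to the drm_device, and
drmm frees them when the final reference disappears.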

Andrey


>
>
>>> If you need some ideas for redundant pointers:
>>> - destroy callback (kinda not cool to not have this const anyway), we
>>>    could refcount it all with the overall gem bo. Quite a bit of work.
>>> - bdev pointer, if we move the device ttm stuff into struct drm_device, or
>>>    create a common struct ttm_device, we can ditch that
>>> - We could probably merge a few of the fields and find 8 bytes somewhere
>>> - we still have 2 krefs, would probably need to fix that before we can
>>>    merge the destroy callbacks
>>>
>>> So there's plenty of room still, if the size of a bo struct is really that
>>> critical. Imo it's not.
>>>
>>>
>>>> Christian.
>>>>
>>>> Am 08.01.21 um 15:46 schrieb Andrey Grodzovsky:
>>>>> Daniel had some objections to this (see below), so I guess I need
>>>>> you both to agree on the approach before I proceed.
>>>>>
>>>>> Andrey
>>>>>
>>>>> On 1/8/21 9:33 AM, Christian König wrote:
>>>>>> Am 08.01.21 um 15:26 schrieb Andrey Grodzovsky:
>>>>>>> Hey Christian, just a ping.
>>>>>> Was there any question for me here?
>>>>>>
>>>>>> As far as I can see the best approach would still be to fill the VMA
>>>>>> with a single dummy page and avoid pointers in the GEM object.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Andrey
>>>>>>>
>>>>>>> On 1/7/21 11:37 AM, Andrey Grodzovsky wrote:
>>>>>>>> On 1/7/21 11:30 AM, Daniel Vetter wrote:
>>>>>>>>> On Thu, Jan 07, 2021 at 11:26:52AM -0500, Andrey Grodzovsky wrote:
>>>>>>>>>> On 1/7/21 11:21 AM, Daniel Vetter wrote:
>>>>>>>>>>> On Tue, Jan 05, 2021 at 04:04:16PM -0500, Andrey Grodzovsky wrote:
>>>>>>>>>>>> On 11/23/20 3:01 AM, Christian König wrote:
>>>>>>>>>>>>> Am 23.11.20 um 05:54 schrieb Andrey Grodzovsky:
>>>>>>>>>>>>>> On 11/21/20 9:15 AM, Christian König wrote:
>>>>>>>>>>>>>>> Am 21.11.20 um 06:21 schrieb Andrey Grodzovsky:
>>>>>>>>>>>>>>>> Will be used to reroute CPU-mapped BOs' page faults once the
>>>>>>>>>>>>>>>> device is removed.
>>>>>>>>>>>>>>> Uff, one page for each exported DMA-buf? That's not
>>>>>>>>>>>>>>> something we can do.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We need to find a different approach here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can't we call alloc_page() on each fault and link them together
>>>>>>>>>>>>>>> so they are freed when the device is finally reaped?
>>>>>>>>>>>>>> For sure it's better to optimize and allocate on demand when we
>>>>>>>>>>>>>> reach this corner case, but why the linking?
>>>>>>>>>>>>>> Shouldn't drm_prime_gem_destroy be a good enough place to free?
>>>>>>>>>>>>> I want to avoid keeping the page in the GEM object.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What we can do is allocate a page on demand for each fault and
>>>>>>>>>>>>> link them together in the bdev instead.
>>>>>>>>>>>>>
>>>>>>>>>>>>> And when the bdev is finally destroyed after the last application
>>>>>>>>>>>>> has closed, we can release all of them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>> Hey, started to implement this and then realized that by
>>>>>>>>>>>> allocating a page for each fault indiscriminately we will be
>>>>>>>>>>>> allocating a new page for each faulting virtual address within a
>>>>>>>>>>>> VA range belonging to the same BO, which is obviously too much
>>>>>>>>>>>> and not the intention. Should I instead use, let's say, a
>>>>>>>>>>>> hashtable keyed on the faulting BO address, to keep allocating
>>>>>>>>>>>> and reusing the same dummy zero page per GEM BO (or, for that
>>>>>>>>>>>> matter, keyed on the DRM file object address for non-imported
>>>>>>>>>>>> BOs)?
>>>>>>>>>>> Why do we need a hashtable? All the sw structures to track this
>>>>>>>>>>> should
>>>>>>>>>>> still be around:
>>>>>>>>>>> - if gem_bo->dma_buf is set the buffer is currently exported as a
>>>>>>>>>>>   dma-buf, so defensively allocate a per-bo page
>>>>>>>>>>> - otherwise allocate a per-file page
>>>>>>>>>> That's exactly what we have in the current implementation.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Or is the idea to save the struct page * pointer? That feels a
>>>>>>>>>>> bit like
>>>>>>>>>>> over-optimizing stuff. Better to have a simple implementation
>>>>>>>>>>> first and
>>>>>>>>>>> then tune it if (and only if) any part of it becomes a problem
>>>>>>>>>>> for normal
>>>>>>>>>>> usage.
>>>>>>>>>> Exactly - the idea is to avoid adding an extra pointer to
>>>>>>>>>> drm_gem_object. Christian suggested to instead keep a linked list
>>>>>>>>>> of dummy pages to be allocated on demand once we hit a vm_fault. I
>>>>>>>>>> will then also prefault the entire VA range from vma->vm_start to
>>>>>>>>>> vma->vm_end and map it to that single dummy page.
>>>>>>>>> This strongly feels like premature optimization. If you're worried
>>>>>>>>> about
>>>>>>>>> the overhead on amdgpu, pay down the debt by removing one of the
>>>>>>>>> redundant
>>>>>>>>> pointers between gem and ttm bo structs (I think we still have
>>>>>>>>> some) :-)
>>>>>>>>>
>>>>>>>>> Until we've nuked these easy&obvious ones we shouldn't play "avoid 1
>>>>>>>>> pointer just because" games with hashtables.
>>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>> Well, if you and Christian can agree on this approach and maybe
>>>>>>>> suggest which pointer is redundant and can be removed from the GEM
>>>>>>>> struct, so that we can use the 'credit' to add the dummy page to GEM,
>>>>>>>> I will be happy to follow through.
>>>>>>>>
>>>>>>>> P.S. The hashtable is off the table anyway; we are only talking about
>>>>>>>> a linked list here, since by prefaulting the entire VA range for a
>>>>>>>> vmf->vma I will be avoiding redundant page faults to the same VMA VA
>>>>>>>> range, and so I don't need to search for and reuse an existing dummy
>>>>>>>> page but can simply create a new one for each next fault.
>>>>>>>>
>>>>>>>> Andrey
>>> -- 
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch/
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

