[PATCH v3 01/12] drm: Add dummy page per device or GEM object
Christian König
christian.koenig at amd.com
Fri Jan 8 14:33:34 UTC 2021
On 08.01.21 at 15:26, Andrey Grodzovsky wrote:
> Hey Christian, just a ping.
Was there any question for me here?
As far as I can see, the best approach would still be to fill the VMA
with a single dummy page and avoid pointers in the GEM object.
Christian.
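
For reference, a minimal sketch of what filling the VMA with a single
device-wide dummy page could look like (the drm_gem_dummy_fault name and
the dev->dummy_page member are hypothetical, not taken from the patch set;
this assumes the generic GEM mmap path where vm_private_data is the GEM
object):

#include <linux/mm.h>
#include <drm/drm_gem.h>

static vm_fault_t drm_gem_dummy_fault(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;
	struct drm_gem_object *obj = vma->vm_private_data;
	/* hypothetical per-device page, allocated once at device init */
	unsigned long pfn = page_to_pfn(obj->dev->dummy_page);
	unsigned long addr;
	vm_fault_t ret = VM_FAULT_NOPAGE;

	/* Fill the whole VMA so we only take this fault once per mapping. */
	for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
		ret = vmf_insert_pfn(vma, addr, pfn);
		if (ret & VM_FAULT_ERROR)
			break;
	}
	return ret;
}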
>
> Andrey
>
> On 1/7/21 11:37 AM, Andrey Grodzovsky wrote:
>>
>> On 1/7/21 11:30 AM, Daniel Vetter wrote:
>>> On Thu, Jan 07, 2021 at 11:26:52AM -0500, Andrey Grodzovsky wrote:
>>>> On 1/7/21 11:21 AM, Daniel Vetter wrote:
>>>>> On Tue, Jan 05, 2021 at 04:04:16PM -0500, Andrey Grodzovsky wrote:
>>>>>> On 11/23/20 3:01 AM, Christian König wrote:
>>>>>>> On 23.11.20 at 05:54, Andrey Grodzovsky wrote:
>>>>>>>> On 11/21/20 9:15 AM, Christian König wrote:
>>>>>>>>> On 21.11.20 at 06:21, Andrey Grodzovsky wrote:
>>>>>>>>>> Will be used to reroute page faults of CPU-mapped BOs once the
>>>>>>>>>> device is removed.
>>>>>>>>> Uff, one page for each exported DMA-buf? That's not something
>>>>>>>>> we can do.
>>>>>>>>>
>>>>>>>>> We need to find a different approach here.
>>>>>>>>>
>>>>>>>>> Can't we call alloc_page() on each fault and link them together
>>>>>>>>> so they are freed when the device is finally reaped?
>>>>>>>> It's certainly better to optimize and allocate on demand when we
>>>>>>>> reach this corner case, but why the linking?
>>>>>>>> Shouldn't drm_prime_gem_destroy be a good enough place to free them?
>>>>>>> I want to avoid keeping the page in the GEM object.
>>>>>>>
>>>>>>> What we can do is allocate a page on demand for each fault and link
>>>>>>> them together in the bdev instead.
>>>>>>>
>>>>>>> And when the bdev is finally destroyed after the last application
>>>>>>> has closed, we can release all of them.
>>>>>>>
>>>>>>> Christian.
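
A rough sketch of the on-demand allocation described above (the list name,
helper names, and struct are placeholders, and a suitable lock protecting
the list is assumed but omitted for brevity):

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/mm.h>

/* One entry per dummy page handed out from the fault handler. */
struct dummy_page_entry {
	struct list_head node;
	struct page *page;
};

/* In the fault path: allocate a zeroed page and remember it on the device. */
static struct page *dummy_page_alloc(struct list_head *dummy_pages)
{
	struct dummy_page_entry *entry = kzalloc(sizeof(*entry), GFP_KERNEL);

	if (!entry)
		return NULL;
	entry->page = alloc_page(GFP_KERNEL | __GFP_ZERO);
	if (!entry->page) {
		kfree(entry);
		return NULL;
	}
	list_add_tail(&entry->node, dummy_pages);
	return entry->page;
}

/* At device teardown, after the last user is gone, release everything. */
static void dummy_pages_release(struct list_head *dummy_pages)
{
	struct dummy_page_entry *entry, *tmp;

	list_for_each_entry_safe(entry, tmp, dummy_pages, node) {
		list_del(&entry->node);
		__free_page(entry->page);
		kfree(entry);
	}
}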
>>>>>> Hey, I started to implement this and then realized that by
>>>>>> allocating a page for each fault indiscriminately we would be
>>>>>> allocating a new page for each faulting virtual address within a
>>>>>> VA range belonging to the same BO, which is obviously too much and
>>>>>> not the intention. Should I instead use, say, a hashtable keyed by
>>>>>> the faulting BO address so that the same dummy zero page is
>>>>>> allocated once and reused per GEM BO (or, for that matter, keyed by
>>>>>> the DRM file object address for non-imported BOs)?
>>>>> Why do we need a hashtable? All the sw structures to track this
>>>>> should still be around:
>>>>> - if gem_bo->dma_buf is set, the buffer is currently exported as a
>>>>>   dma-buf, so defensively allocate a per-bo page
>>>>> - otherwise allocate a per-file page
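
As an illustration of that choice (the dummy_page fields below are
hypothetical, not existing drm_gem_object/drm_file members):

#include <drm/drm_gem.h>
#include <drm/drm_file.h>

/* Pick who owns the dummy page for a faulting BO: exported BOs may
 * outlive the drm_file, so they defensively get their own page. */
static struct page *pick_dummy_page(struct drm_gem_object *obj,
				    struct drm_file *file)
{
	if (obj->dma_buf)		/* currently exported as a dma-buf */
		return obj->dummy_page;	/* hypothetical per-BO page */
	return file->dummy_page;	/* hypothetical per-file page */
}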
>>>>
>>>> That's exactly what we have in the current implementation.
>>>>
>>>>
>>>>> Or is the idea to save the struct page * pointer? That feels a bit
>>>>> like
>>>>> over-optimizing stuff. Better to have a simple implementation
>>>>> first and
>>>>> then tune it if (and only if) any part of it becomes a problem for
>>>>> normal
>>>>> usage.
>>>>
>>>> Exactly - the idea is to avoid adding an extra pointer to
>>>> drm_gem_object; Christian suggested instead keeping a linked list of
>>>> dummy pages to be allocated on demand once we hit a vm_fault. I will
>>>> then also prefault the entire VA range from vma->vm_start to
>>>> vma->vm_end and map it to that single dummy page.
>>> This strongly feels like premature optimization. If you're worried
>>> about
>>> the overhead on amdgpu, pay down the debt by removing one of the
>>> redundant
>>> pointers between gem and ttm bo structs (I think we still have some)
>>> :-)
>>>
>>> Until we've nuked these easy&obvious ones we shouldn't play "avoid 1
>>> pointer just because" games with hashtables.
>>> -Daniel
>>
>>
>> Well, if you and Christian can agree on this approach and suggest which
>> pointer is redundant and can be removed from the GEM struct so we can
>> use the 'credit' to add the dummy page to GEM, I will be happy to
>> follow through.
>>
>> P.S. The hashtable is off the table anyway and we are talking only about
>> a linked list here, since by prefaulting the entire VA range for a
>> vmf->vma I will be avoiding redundant page faults to the same VMA VA
>> range, and so I won't need to search for and reuse an existing dummy
>> page but can simply create a new one for each fault.
>>
>> Andrey