[PATCH 1/1] [RFC] drm/ttm: Don't init dma32_zone on 64-bit systems

Kuehling, Felix Felix.Kuehling at amd.com
Tue Feb 19 17:06:42 UTC 2019


On 2019-02-18 3:39 p.m., Thomas Hellstrom wrote:
> On Mon, 2019-02-18 at 18:07 +0100, Christian König wrote:
>> Am 18.02.19 um 10:47 schrieb Thomas Hellstrom:
>>> On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote:
>>>> Another good question is also why the heck the acc_size counts
>>>> towards
>>>> the DMA32 zone?
>>> DMA32 TTM pages are accounted in the DMA32 zone. Other pages are
>>> not.
>> Yeah, I'm perfectly aware of this. But this is for the accounting
>> size!
>>
>> We have an accounting for the stuff needed additional to the pages
>> backing the BO (e.g. the page and DMA addr array).
>>
>> And from the bug description it sounds like we use the DMA32 zone
>> for
>> this accounting which of course is completely nonsense.
> It's actually accounted in all available zones, since it would be
> pretty hard to determine exactly where that memory should be accounted.
> In particular if it's vmalloced. It might be DMA32, it might not. Given
> the objective of stopping malicious user-space from exhausting the
> DMA32 zone it was, at the time the code was written, a reasonable
> approximation. With ever increasing memory sizes, there might be better
> solutions?

As far as I can see, in TTM, ttm_mem_global_alloc is only used for the 
acc_size in ttm_bo_init_reserved. Other than that vmwgfx also seems to 
use it to account for a few things that are allocated with kmalloc.

So would a better solution be to change ttm_mem_global_alloc to use only 
the kernel zone?

Regards,
   Felix


>
> /Thomas
>
>> Christian.
>>
>>> For small persistent allocations using ttm_mem_global_alloc(), they
>>> are
>>> accounted also in the DMA32 zone, which may cause over-accounting
>>> of
>>> that zone, but that's pretty unlikely to be a big problem..
>>>
>>> /Thomas
>>>
>>>
>>>
>>>
>>>
>>>> In other words why should the internal bookkeeping pages be
>>>> allocated
>>>> in
>>>> the DMA32 zone?
>>>>
>>>> That doesn't sounds valid to me in any way,
>>>> Christian.
>>>>
>>>> Am 18.02.19 um 09:02 schrieb Thomas Hellstrom:
>>>>> Hmm,
>>>>>
>>>>> This zone was intended to stop TTM page allocations from
>>>>> exhausting
>>>>> the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by
>>>>> default,
>>>>> which
>>>>> means if we drop this check, other devices may stop functioning
>>>>> unexpectedly?
>>>>>
>>>>> However, in the end I'd expect the kernel page allocation
>>>>> system
>>>>> to
>>>>> make sure there are some pages left in the DMA32 zone,
>>>>> otherwise
>>>>> random non-IO page allocations would also potentially exhaust
>>>>> the
>>>>> DMA32 zone without anybody caring, which means removing this
>>>>> zone
>>>>> wouldn't be any worse than whatever other subsystems may be
>>>>> doing
>>>>> already...
>>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>> On 2/16/19 12:02 AM, Kuehling, Felix wrote:
>>>>>> This is an RFC. I'm not sure this is the right solution, but
>>>>>> it
>>>>>> highlights the problem I'm trying to solve.
>>>>>>
>>>>>> The dma32_zone limits the acc_size of all allocated BOs to
>>>>>> 2GB.
>>>>>> On a
>>>>>> 64-bit system with hundreds of GB of system memory and GPU
>>>>>> memory,
>>>>>> this can become a bottle neck. We're seeing TTM memory
>>>>>> allocation
>>>>>> failures not because we're truly out of memory, but because
>>>>>> we're
>>>>>> out of space in the dma32_zone for the acc_size needed for
>>>>>> our BO
>>>>>> book-keeping.
>>>>>>
>>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>>>> CC: thellstrom at vmware.com
>>>>>> CC: christian.koenig at amd.com
>>>>>> ---
>>>>>>     drivers/gpu/drm/ttm/ttm_memory.c | 4 ++--
>>>>>>     1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_memory.c
>>>>>> b/drivers/gpu/drm/ttm/ttm_memory.c
>>>>>> index f1567c3..bb05365 100644
>>>>>> --- a/drivers/gpu/drm/ttm/ttm_memory.c
>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_memory.c
>>>>>> @@ -363,7 +363,7 @@ static int
>>>>>> ttm_mem_init_highmem_zone(struct
>>>>>> ttm_mem_global *glob,
>>>>>>         glob->zones[glob->num_zones++] = zone;
>>>>>>         return 0;
>>>>>>     }
>>>>>> -#else
>>>>>> +#elifndef CONFIG_64BIT
>>>>>>     static int ttm_mem_init_dma32_zone(struct ttm_mem_global
>>>>>> *glob,
>>>>>>                        const struct sysinfo *si)
>>>>>>     {
>>>>>> @@ -441,7 +441,7 @@ int ttm_mem_global_init(struct
>>>>>> ttm_mem_global
>>>>>> *glob)
>>>>>>         ret = ttm_mem_init_highmem_zone(glob, &si);
>>>>>>         if (unlikely(ret != 0))
>>>>>>             goto out_no_zone;
>>>>>> -#else
>>>>>> +#elifndef CONFIG_64BIT
>>>>>>         ret = ttm_mem_init_dma32_zone(glob, &si);
>>>>>>         if (unlikely(ret != 0))
>>>>>>             goto out_no_zone;
>>> _______________________________________________
>>> amd-gfx mailing list
>>> amd-gfx at lists.freedesktop.org
>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cthellstrom%40vmware.com%7C1a84a2a700cd48b3980a08d695c38ade%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636861064581303965&sdata=6FCAP6YbostgKfEEhRKKh9eQSrfXR%2BfCczYB8WwlPVY%3D&reserved=0


More information about the amd-gfx mailing list