[PATCH 1/1] [RFC] drm/ttm: Don't init dma32_zone on 64-bit systems

Christian König ckoenig.leichtzumerken at gmail.com
Mon Feb 18 17:07:12 UTC 2019


On 18.02.19 at 10:47, Thomas Hellstrom wrote:
> On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote:
>> Another good question is also why the heck the acc_size counts towards
>> the DMA32 zone?
> DMA32 TTM pages are accounted in the DMA32 zone. Other pages are not.

Yeah, I'm perfectly aware of this. But this is about the accounting size!

We also account for the memory needed in addition to the pages
backing the BO (e.g. the page and DMA address arrays).
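
To make the size of this overhead concrete, here is a minimal sketch of
how such an accounting size could be estimated. It is not the TTM
implementation: the sketch_* names are hypothetical, and the 4 KiB page
size, 8-byte pointers and 8-byte DMA addresses are assumptions for a
typical 64-bit configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 8-byte pointers,
 * 8-byte DMA addresses (a typical 64-bit configuration). */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_PTR_SIZE      8ULL
#define SKETCH_DMA_ADDR_SIZE 8ULL

/* Number of pages needed to back a BO of bo_size bytes (rounded up). */
static uint64_t sketch_num_pages(uint64_t bo_size)
{
    /* Guard against overflow in the round-up below. */
    assert(bo_size <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));
    return (bo_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Hypothetical estimate of the bookkeeping overhead for one BO:
 * a fixed struct size plus one page pointer and one DMA address
 * per backing page. */
static uint64_t sketch_acc_size(uint64_t bo_size, uint64_t struct_size)
{
    uint64_t npages = sketch_num_pages(bo_size);

    return struct_size + npages * (SKETCH_PTR_SIZE + SKETCH_DMA_ADDR_SIZE);
}
```

Under these assumptions a 1 GiB BO has 262144 backing pages and needs
4 MiB for the two arrays alone, before any fixed struct size is added.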

And from the bug description it sounds like we use the DMA32 zone for
this accounting, which of course is complete nonsense.
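
The failure mode from the bug description can be modelled roughly as
follows. This is a simplified sketch, not the code in ttm_memory.c: the
sketch_* names are hypothetical, and it assumes that every charge is
applied to all zones and is rejected as soon as any one zone would
exceed its limit. The 2GB dma32 limit is taken from the patch
description; the other figures are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an accounting zone (not struct ttm_mem_zone). */
struct sketch_zone {
    const char *name;
    uint64_t max_mem;   /* accounting limit in bytes */
    uint64_t used_mem;  /* bytes currently charged */
};

/* Charge `amount` to every zone, or to none of them if any zone
 * would exceed its limit. */
static bool sketch_charge_all(struct sketch_zone *zones, size_t n,
                              uint64_t amount)
{
    size_t i;

    assert(zones != NULL);

    for (i = 0; i < n; i++)
        if (zones[i].used_mem + amount > zones[i].max_mem)
            return false;

    for (i = 0; i < n; i++)
        zones[i].used_mem += amount;

    return true;
}

/* Count how many identical charges succeed before the first failure
 * (bounded by max_tries). */
static uint64_t sketch_count_charges(struct sketch_zone *zones, size_t n,
                                     uint64_t amount, uint64_t max_tries)
{
    uint64_t ok = 0;

    while (ok < max_tries && sketch_charge_all(zones, n, amount))
        ok++;

    return ok;
}
```

With a 128 GiB "kernel" zone, a 2 GiB "dma32" zone and 4 MiB charged
per BO, the 513th charge fails while the kernel zone has only 2 GiB
charged. With the kernel zone alone, 32768 such charges fit.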

Christian.

>
> Small persistent allocations made with ttm_mem_global_alloc() are also
> accounted in the DMA32 zone, which may cause over-accounting of
> that zone, but that's pretty unlikely to be a big problem.
>
> /Thomas
>
>> In other words, why should the internal bookkeeping pages be allocated
>> in the DMA32 zone?
>>
>> That doesn't sound valid to me in any way,
>> Christian.
>>
>> On 18.02.19 at 09:02, Thomas Hellstrom wrote:
>>> Hmm,
>>>
>>> This zone was intended to stop TTM page allocations from exhausting
>>> the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by default, which
>>> means if we drop this check, other devices may stop functioning
>>> unexpectedly?
>>>
>>> However, in the end I'd expect the kernel page allocation system to
>>> make sure there are some pages left in the DMA32 zone, otherwise
>>> random non-IO page allocations would also potentially exhaust the
>>> DMA32 zone without anybody caring, which means removing this zone
>>> wouldn't be any worse than whatever other subsystems may be doing
>>> already...
>>>
>>> /Thomas
>>>
>>>
>>> On 2/16/19 12:02 AM, Kuehling, Felix wrote:
>>>> This is an RFC. I'm not sure this is the right solution, but it
>>>> highlights the problem I'm trying to solve.
>>>>
>>>> The dma32_zone limits the acc_size of all allocated BOs to 2GB. On a
>>>> 64-bit system with hundreds of GB of system memory and GPU memory,
>>>> this can become a bottleneck. We're seeing TTM memory allocation
>>>> failures not because we're truly out of memory, but because we're
>>>> out of space in the dma32_zone for the acc_size needed for our BO
>>>> book-keeping.
>>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>>> CC: thellstrom at vmware.com
>>>> CC: christian.koenig at amd.com
>>>> ---
>>>>    drivers/gpu/drm/ttm/ttm_memory.c | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_memory.c b/drivers/gpu/drm/ttm/ttm_memory.c
>>>> index f1567c3..bb05365 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_memory.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_memory.c
>>>> @@ -363,7 +363,7 @@ static int ttm_mem_init_highmem_zone(struct ttm_mem_global *glob,
>>>>        glob->zones[glob->num_zones++] = zone;
>>>>        return 0;
>>>>    }
>>>> -#else
>>>> +#elif !defined(CONFIG_64BIT)
>>>>    static int ttm_mem_init_dma32_zone(struct ttm_mem_global *glob,
>>>>                       const struct sysinfo *si)
>>>>    {
>>>> @@ -441,7 +441,7 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
>>>>        ret = ttm_mem_init_highmem_zone(glob, &si);
>>>>        if (unlikely(ret != 0))
>>>>            goto out_no_zone;
>>>> -#else
>>>> +#elif !defined(CONFIG_64BIT)
>>>>        ret = ttm_mem_init_dma32_zone(glob, &si);
>>>>        if (unlikely(ret != 0))
>>>>            goto out_no_zone;