CONFIG_DMA_CMA causes ttm performance problems/hangs.

Michel Dänzer michel at
Tue Aug 12 18:50:25 PDT 2014

On 12.08.2014 00:17, Jerome Glisse wrote:
> On Mon, Aug 11, 2014 at 12:11:21PM +0200, Thomas Hellstrom wrote:
>> On 08/10/2014 08:02 PM, Mario Kleiner wrote:
>>> On 08/10/2014 01:03 PM, Thomas Hellstrom wrote:
>>>> On 08/10/2014 05:11 AM, Mario Kleiner wrote:
>>>>> The other problem is that probably TTM does not reuse pages from the
>>>>> DMA pool. If i trace the __ttm_dma_alloc_page and
>>>>> __ttm_dma_free_page calls for
>>>>> those single page allocs/frees, then over a 20 second interval of
>>>>> tracing and switching tabs in firefox, scrolling things around etc. i
>>>>> find about as many alloc's as i find free's, e.g., 1607 allocs vs.
>>>>> 1648 frees.
>>>> This is because historically the pools have been designed to keep only
>>>> pages with nonstandard caching attributes, since changing page caching
>>>> attributes has been very slow but the kernel page allocators have been
>>>> reasonably fast.
>>>> /Thomas
>>> Ok. A bit more ftracing showed my hang problem case goes through the
>>> "if (is_cached)" paths, so the pool doesn't recycle anything and i see
>>> it bouncing up and down by 4 pages all the time.
>>> But for the non-cached case, which i don't hit with my problem, could
>>> one of you look at line 954...
>>> ... and tell me why that unconditional npages = count; assignment
>>> makes sense? It seems to essentially disable all recycling for the dma
>>> pool whenever the pool isn't filled up to/beyond its maximum with free
>>> pages? When the pool is filled up, lots of stuff is recycled, but when
>>> it is already somewhat below capacity, it gets "punished" by not
>>> getting refilled? I'd just like to understand the logic behind that line.
>>> thanks,
>>> -mario
>> I'll happily forward that question to Konrad who wrote the code (or it
>> may even stem from the ordinary page pool code which IIRC has Dave
>> Airlie / Jerome Glisse as authors)
> This is effectively bogus code, i now wonder how it came to stay alive.
> Attached patch will fix that.
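
For readers following along, here is a minimal sketch of the behaviour Mario
describes above. It is a simplified model, not the kernel code: the struct,
the function name pool_put_pages and the `buggy` flag are all hypothetical,
and it assumes the pool is only meant to shrink by whatever exceeds its
maximum size.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a page pool (not the TTM structs). */
struct pool_model {
    size_t npages_free; /* pages currently cached in the pool */
    size_t max_size;    /* cap on the number of cached pages */
};

/*
 * Return `count` pages to the pool and report how many pages end up
 * being freed back to the system. With `buggy` set, the unconditional
 * "npages = count" assignment from the question is reproduced.
 */
static size_t pool_put_pages(struct pool_model *p, size_t count, int buggy)
{
    size_t npages = 0;

    p->npages_free += count;

    if (buggy)
        npages = count; /* frees everything just returned */

    /* Over the cap: free only the excess. */
    if (p->npages_free > p->max_size)
        npages = p->npages_free - p->max_size;

    p->npages_free -= npages;
    return npages;
}
```

In this model, a pool below its cap that gets 4 pages back frees all 4 again
when `buggy` is set, so nothing is left to recycle; without the assignment
the 4 pages stay cached. Once the pool is over its cap, both variants free
only the excess.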

I haven't tested Mario's scenario specifically, but it survived piglit
and the UE4 Effects Cave Demo (for which 1GB of VRAM isn't enough, so
some BOs ended up in GTT instead with write-combined CPU mappings) on
radeonsi without any noticeable issues.

Tested-by: Michel Dänzer <michel.daenzer at>

Earthling Michel Dänzer            |        
Libre software enthusiast          |                Mesa and X developer
