[RFC] drm/ttm: add minimum residency constraint for bo eviction

Thu Nov 29 12:40:31 PST 2012

On 11/29/2012 08:20 PM, Marek Olšák wrote:
> On Thu, Nov 29, 2012 at 10:18 AM, Thomas Hellstrom <thomas at shipmail.org> wrote:
>> On 11/28/2012 10:51 PM, Marek Olšák wrote:
>>> I think the problem with Radeon/TTM is much deeper. Let me demonstrate
>>> it on the following example.
>>>
>>> Unigine Heaven needs about 385MB of space for static resources, which is
>>> only 75% of my 512MB card. Yet, TTM is not capable of getting all of
>>> that into VRAM. If I allow GTT placements, I get 20 fps, which is the
>>> old Mesa behavior. If I force VRAM placements, I get 3 fps, because we
>>> validate buffers 10 times per frame and there are probably a lot of
>>> buffer evictions during each validation.
>>>
>> Marek,
>> Did you look at the total amount of referenced buffers in the ring including
>> vertex buffers?
>>
>> Depending on how hard you throttle, I guess vertex / index buffer data
>> referenced by the ring commands may well exceed the VRAM limitation.
> Buffers (not textures) take only 30 MB. These are stats for 1 frame of
> Unigine Heaven. Each line is a CS ioctl.
>
> VRAM [used in CS] / [total allocated], GTT [used in CS] / [total allocated]
> 1. VRAM: 171 / 390 MB, GTT:   1 /   5 MB
> 2. VRAM: 144 / 390 MB, GTT:   2 /   5 MB
> 3. VRAM: 184 / 390 MB, GTT:   1 /   5 MB
> 4. VRAM:  35 / 390 MB, GTT:   2 /   5 MB
> 5. VRAM: 119 / 390 MB, GTT:   1 /   5 MB
> 6. VRAM: 207 / 390 MB, GTT:   1 /   5 MB
> 7. VRAM:  65 / 390 MB, GTT:   2 /   5 MB
>
> If I move all buffers (vertex, index, constant, streamout, queries,
> shader code, etc.) to GTT, this is what one frame looks like (not the
> same frame, but a similar one):
>
> 1. VRAM: 144 / 359 MB, GTT:  16 /  35 MB
> 2. VRAM:  95 / 359 MB, GTT:  12 /  35 MB
> 3. VRAM: 178 / 359 MB, GTT:  15 /  35 MB
> 4. VRAM:  55 / 359 MB, GTT:  13 /  35 MB
> 5. VRAM:  22 / 359 MB, GTT:  16 /  35 MB
> 6. VRAM: 163 / 359 MB, GTT:  16 /  35 MB
> 7. VRAM: 133 / 359 MB, GTT:  11 /  35 MB
> 8. VRAM:  66 / 359 MB, GTT:   4 /  35 MB
>
> The stats are generated in the Mesa driver, based on the driver's
> expectations of where buffers should be placed.
>
> I can easily see how VRAM is thrashed with the strict LRU approach.
>
> Also, is it possible that one buffer is moved twice for a single CS
> ioctl? Imagine there's a buffer at the end of the relocation list,
> which is also at the head of the LRU list. Some buffer in the middle
> causes eviction of the last buffer. When the last buffer is validated,
> it's moved back to VRAM. Can it happen?

No, that typically shouldn't happen. In a typical CS sequence, all
buffers are first reserved, and then all buffers are validated.
Reservation takes them off the LRU list, so a buffer that belongs to the
current CS can't be picked for eviction while the others are validated.
I'm not 100% sure Radeon does it this way, but I think so.
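
To make the scenario concrete, here is a toy Python model of the two
orderings. It is only a sketch and is not taken from TTM or Radeon: the
class ToyVram, its methods, the buffer names and the unit sizes are all
made up for illustration. It only counts how often a buffer is moved in
or out of a fixed-size pool, with and without taking the CS buffers off
the LRU list before validation.

```python
from collections import OrderedDict


class ToyVram:
    """Hypothetical fixed-size pool with LRU eviction (not real TTM code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = {}        # buffer name -> size
        self.lru = OrderedDict()  # eviction candidates, oldest first
        self.moves = {}           # buffer name -> number of moves in or out

    def _count_move(self, name):
        self.moves[name] = self.moves.get(name, 0) + 1

    def place(self, name, size):
        """Initial placement; not counted as a move."""
        self.resident[name] = size
        self.lru[name] = None

    def _make_room(self, size):
        # Evict from the head of the LRU list until the new buffer fits.
        while sum(self.resident.values()) + size > self.capacity:
            if not self.lru:
                raise MemoryError("nothing left to evict")
            victim, _ = self.lru.popitem(last=False)
            del self.resident[victim]
            self._count_move(victim)

    def submit(self, cs, reserve_first):
        """Validate the buffers of one CS, given as a list of (name, size)."""
        if reserve_first:
            # "Reserve": take every CS buffer off the LRU list up front.
            for name, _ in cs:
                self.lru.pop(name, None)
        for name, size in cs:
            if name not in self.resident:
                self._make_room(size)
                self.resident[name] = size
                self._count_move(name)
            if not reserve_first:
                # Without reservation the buffer stays evictable.
                self.lru[name] = None
                self.lru.move_to_end(name)
        if reserve_first:
            # "Unreserve": put the CS buffers back at the LRU tail.
            for name, _ in cs:
                self.lru[name] = None


def run(reserve_first):
    vram = ToyVram(capacity=2)
    vram.place("last", 1)   # oldest entry, first eviction candidate
    vram.place("other", 1)
    # "mid" is not resident; "last" is at the end of the relocation list.
    vram.submit([("mid", 1), ("last", 1)], reserve_first)
    return vram.moves.get("last", 0)


if __name__ == "__main__":
    print("moves of 'last' without reservation:", run(False))
    print("moves of 'last' with reservation:   ", run(True))
```

In this toy setup the unreserved run moves "last" twice (evicted to make
room for "mid", then brought back), while the reserved run doesn't move
it at all, because "other" is the only eviction candidate left.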

/Thomas