[PATCH 2/2] drm/vram-helper: Alternate between bottom-up and top-down placement

Thomas Zimmermann <tzimmermann@suse.de>
Fri Apr 24 06:59:09 UTC 2020


Hi Gerd

On 23.04.20 15:57, Gerd Hoffmann wrote:
>>> I don't think it is that simple.
>>>
>>> First:  How will that interact with cursor bo allocations?  IIRC the
>>> strategy for them is to allocate top-down, for similar reasons (avoid
>>> small cursor bo allocations fragmenting vram memory).
>>
>> In ast, 2 cursor BOs are allocated during driver initialization and kept
>> permanently at the vram's top end. I don't know about other drivers.
> 
> One-time allocation at init time shouldn't be a problem.
> 
>> But cursor BOs are small, so they don't make much of a difference. What
>> is needed is space for 2 primary framebuffers during pageflips, with one
>> of them pinned. The other framebuffer can be located anywhere.
> 
> The problem isn't the size.  The problem is dynamically allocated cursor
> BOs can also fragment vram, especially if top-bottom allocation is also
> used for large framebuffers so cursor BOs can end up somewhere in the
> middle of vram.

But the problem is the size. Cursor BOs are unrelated. Of the
vram-helper-based drivers, only ast and mgag200 use cursors, and both
perma-pin them at the top end of vram.
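
For reference, the relevant bit of that setup is a one-time pin with the
top-down placement flag. A minimal sketch (error paths trimmed; 'gbo' is
the cursor BO created at driver initialization):

        /* Pin the cursor BO permanently at the top end of vram, where
         * it stays out of the way of framebuffer BOs. Done once at
         * driver initialization, so it cannot fragment vram later.
         */
        ret = drm_gem_vram_pin(gbo, DRM_GEM_VRAM_PL_FLAG_VRAM |
                                    DRM_GEM_VRAM_PL_FLAG_TOPDOWN);
        if (ret)
                return ret;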


> 
>>> Second:  I think ttm will move bo's from vram to system only on memory
>>> pressure.  So you can still end up with fragmented memory.  To make the
>>> scheme with one fb @ top and one @ bottom work reliably you have to be
>>> more aggressive on pushing out framebuffers.
>>
>> I'm in the process of converting mgag200 to atomic modesetting. The given
>> example is what I observed. I'm not claiming that the placement scheme
>> is perfect, but it is required to get mgag200 working with atomic
>> modesetting's pageflip logic. So we're solving a real problem here.
> 
> I don't doubt this is a real problem.
> 
>> The bug comes from Weston's allocation strategy. Looking at the debug
>> output:
>>
>>>>   0x0000000000000000-0x000000000000057f: 1407: free
>>
>> This was fbdev's framebuffer with 1600x900 at 32bpp.
>>
>>>>   0x000000000000057f-0x0000000000000b5b: 1500: used
>>
>> This is Weston's framebuffer, also with 1600x900 at 32bpp. But Weston
>> allocates an additional 60 unused scanlines, presumably so it can
>> render with tiles of 64x64px. fbdev doesn't do that, hence Weston's
>> second framebuffer doesn't fit into the space freed by the fbdev
>> framebuffer.
> 
> Sure.  Assume there is just enough vram to fit in fbdev and two weston
> framebuffers.  fbdev is allocated from bottom, first weston fb from top,
> second weston fb from bottom again.  fbdev is not pushed out (no memory
> pressure yet) so the second weston fb ends up in the middle of vram
> fragmenting things.  And now you are again in the situation where you
> might have enough free vram for an allocation but can't use it due to
> fragmentation (probably harder to trigger in practice though).

Not quite. Framebuffer BOs of the current or a smaller size will fit into
vram. It only becomes a problem if the size of the new framebuffer BO
increases. And that is exactly what currently happens with mgag200.
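
To put numbers on it (the offsets in the debug output are in units of
4 KiB pages, which the sizes confirm):

  fbdev:  1600 * 900 * 4 bytes = 5760000 bytes -> 1407 pages (0x57f)
  Weston: 1600 * (900 + 60) * 4 bytes = 6144000 bytes = 1500 pages

A 1500-page BO cannot reuse the 1407 free pages at the bottom; only a BO
of the old size or smaller could.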

That aside, it's not a fair point, as you constructed an example that no
memory manager can resolve.

> 
> That's why I would suggest to explicitly move out unpinned framebuffers
> (aka fbdev) before pinning a new one (second weston fb) instead of
> depending on ttm moving things out on OOM, to make sure you never
> allocate something in the middle of vram.

We cannot do that. Evicting BOs from vram involves an unmap operation.
We did that in an earlier version of the code and received reports about
performance regressions caused by CPU cycles spent on TLB updates.

So we added the lazy-unmap feature, where BOs are only unmapped and
evicted when necessary. I think it was even you who suggested this idea. :)
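
Roughly, lazy unmap lives in the vram helpers' move_notify callback; a
simplified sketch of that path:

        /* Called by TTM when a BO is about to be moved or evicted.
         * drm_gem_vram_kunmap() leaves the kernel mapping in place;
         * only here, on an actual move, do we tear it down. This keeps
         * TLB maintenance off the hot path.
         */
        static void bo_driver_move_notify(struct ttm_buffer_object *bo,
                                          bool evict,
                                          struct ttm_mem_reg *new_mem)
        {
                struct drm_gem_vram_object *gbo;

                /* TTM may hand us BOs that are not vram-helper BOs */
                if (!drm_is_gem_vram(bo))
                        return;

                gbo = drm_gem_vram_of_bo(bo);

                /* a pinned, mapped BO would never get here */
                WARN_ON_ONCE(gbo->kmap_use_count);

                if (!gbo->kmap.virtual)
                        return;

                ttm_bo_kunmap(&gbo->kmap);
                gbo->kmap.virtual = NULL;
        }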

> 
>>> Third:  I'd suggest make topdown allocations depending on current state
>>> instead of simply alternating, i.e. if there is a pinned framebuffer @
>>> offset 0, then go for top-down.
>>
>> That's what the current patch does. If the last pin was at the bottom,
>> the next goes to the top. And then the other way around. Without
alternating between both ends of vram, the problem would occur again when
>> fragmentation happens near the top end.
> 
> I'd feel better checking the state of my current pins to figure out
> whether I should alloc top-bottom or not, for robustness reasons.

I don't understand why this is more robust. For example, if you pin a
large number of BOs and then evict every other BO, there will always be
free areas among the remaining BOs, regardless of how the pins were
placed. All that changes is the location of those areas.

A placement strategy also cannot look at the BO size alone. In my
initial example, BOs are ~6 MiB and vram is 16 MiB. So any strategy à la
'choose top-down for BOs larger than 1/3 of vram' selects top-down for
all framebuffer BOs. This would result in the same OOM, just from the
top down.

At some point one has to switch to top-down, and then back again for one
of the following BOs. So the current patch effectively splits vram into
a lower half and an upper half and puts BOs into alternating halves.
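
In code, the scheme is essentially a toggle on the pin path. A sketch,
assuming a bool 'place_topdown' were added to struct drm_vram_mm (the
helper names here are made up for illustration):

        /* Alternate framebuffer placement between the bottom and the
         * top end of vram. With at most two framebuffers in flight
         * during a pageflip, each effectively gets its own half.
         */
        static unsigned long next_fb_pl_flags(struct drm_vram_mm *vmm)
        {
                unsigned long pl_flags = DRM_GEM_VRAM_PL_FLAG_VRAM;

                if (vmm->place_topdown)
                        pl_flags |= DRM_GEM_VRAM_PL_FLAG_TOPDOWN;
                vmm->place_topdown = !vmm->place_topdown;

                return pl_flags;
        }

        /* called when pinning a framebuffer BO for scanout, e.g. from
         * the plane helper's prepare_fb */
        static int pin_fb(struct drm_vram_mm *vmm,
                          struct drm_gem_vram_object *gbo)
        {
                return drm_gem_vram_pin(gbo, next_fb_pl_flags(vmm));
        }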

Best regards
Thomas

> 
>> Looking again at the vram helpers, this functionality could be
>> implemented in drm_gem_vram_plane_helper_prepare_fb(). Drivers with
>> other placement strategies could implement their own helper for prepare_fb.
> 
> vram helpers could also simply offer two prepare_fb variants.
> 
> cheers,
>   Gerd
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
