Random short freezes due to TTM buffer migrations

Thu Aug 18 08:52:18 UTC 2016

In that case my patch should clearly help with that.

I'm going to release what I have so far today, but it looks like I need
more time to actually fix this.

In the meantime you could try the "drm/amdgpu: fix lru size grouping" 
patch as well. It fixes a bug related to this and could help as well.

Regards,
Christian.

On 18.08.2016 at 01:39, Marek Olšák wrote:
> Actually, I was wrong.
>
> The buffers in that app are pretty small. The largest one has 86 MB 
> and others have 52 MB. I must have misread that as 520 MB.
>
> At one point, ttm_bo_validate with a 32 MB buffer moved 971 MB.
>
> Maybe it's just a VRAM fragmentation issue (i.e. a lack of contiguous 
> free memory).
>
> Marek
>
> On Wed, Aug 17, 2016 at 9:19 PM, Christian König
> <deathsimple at vodafone.de> wrote:
>
>     Sharing buffers between applications is handled by the DRM layer
>     and transparent to the driver.
>
>     E.g. the driver is not even informed whether sharing is done via
>     DMA-buf or GEM flink; it's just another reference to the BO.
>
>     So there isn't any change to that at all.
>
>     Regards,
>     Christian.
>
>
>     On 17.08.2016 at 21:03, Felix Kuehling wrote:
>
>         I think the scatter-gather tables only support system memory. As I
>         understand it, a buffer in VRAM has to be migrated to system memory
>         before it can be shared with another driver.
>
>         I'm more concerned about sharing with the same driver. There is a
>         special code path for that, where we simply add another reference
>         to the same BO instead of looking at a scatter-gather table. We use
>         that for OpenGL-OpenCL interop, and are also planning to use it for
>         IPC buffer sharing in HSA. As long as a split VRAM buffer is still a
>         single amdgpu_bo, and becomes a single dmabuf when exporting it, I
>         think that should work.
>
>         Regards,
>            Felix
>
>
>         On 16-08-17 02:58 AM, Christian König wrote:
>
>                 One question: Will it be possible to share these split
>                 BOs as dmabufs?
>
>             In theory yes, in practice I'm not sure.
>
>             DMA-bufs are designed around scatter-gather tables, which
>             fortunately support buffers split over the whole address space.
>
>             The problem is that the importing device needs to be able to
>             handle that as well.
>
>             Regards,
>             Christian.
>
>             On 16.08.2016 at 20:33, Felix Kuehling wrote:
>
>                 Very nice. I'm looking forward to this for KFD as well.
>
>                 One question: Will it be possible to share these split
>                 BOs as dmabufs?
>
>                 Regards,
>                     Felix
>
>
>                 On 16-08-16 11:27 AM, Christian König wrote:
>
>                     Hi Marek,
>
>                     I'm already working on this.
>
>                     My current approach is to use a custom BO manager for
>                     VRAM with TTM and so split allocations into chunks of
>                     4MB.
>
>                     Large BOs are still swapped out as one, but it makes it
>                     much more likely that you can allocate 1/2 of VRAM as
>                     one buffer.
>
>                     Give me till the end of the week to finish this and
>                     then we can test if that's sufficient or if we need to
>                     do more.
>
>                     Regards,
>                     Christian.
>
>                     On 16.08.2016 at 16:33, Marek Olšák wrote:
>
>                         Hi,
>
>                         I'm seeing random temporary freezes (up to 2 seconds)
>                         under memory pressure. Before I describe the exact
>                         circumstances, I'd like to say that this is a serious
>                         issue affecting playability of certain AAA Linux games.
>
>                         In order to reproduce this, an application should:
>                         - allocate a few very large buffers (256-512 MB per
>                           buffer)
>                         - allocate more memory than there is available VRAM.
>                           The issue also occurs (but at a lower frequency) if
>                           the app needs only 80% of VRAM.
>
>                         Example: ttm_bo_validate needs to migrate a 512 MB
>                         buffer. The total size of moved memory for that call
>                         can be as high as 1.5 GB. This is always followed by a
>                         big temporary drop in VRAM usage.
>
>                         The game I'm testing needs 3.4 GB of VRAM.
>
>                         Setups:
>                         Tonga - 2 GB: It's nearly unplayable, because freezes
>                         occur too often.
>                         Fiji - 4 GB: There is one freeze at the beginning
>                         (which is annoying too), after that it's smooth.
>
>                         So even 4 GB is not enough.
>
>                         Workarounds:
>                         - Split buffers into smaller pieces in the kernel.
>                           It's not necessary to manage memory at page
>                           granularity (64KB). Splitting buffers into
>                           16MB-large pieces might not be optimal but it would
>                           be a significant improvement.
>                         - Or do the same in Mesa. This would prevent
>                           inter-process and inter-API buffer sharing for
>                           split buffers (DRI, OpenCL), but we would at least
>                           verify how much the situation improves.
>
>                         Other issues sharing the same cause:
>                         - Allocations requesting 1/3 or more VRAM have a high
>                           chance of failing. It's generally not possible to
>                           allocate 1/2 or more VRAM as one buffer.
>
>                         Comments welcome,
>
>                         Marek
>                         _______________________________________________
>                         amd-gfx mailing list
>                         amd-gfx at lists.freedesktop.org
>                         <mailto:amd-gfx at lists.freedesktop.org>
>                         https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>                         <https://lists.freedesktop.org/mailman/listinfo/amd-gfx>
>
>
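
The fragmentation argument in the thread (a lack of contiguous free VRAM, and
splitting allocations into 4MB chunks as a remedy) can be illustrated with a
small toy model. This is only a sketch under stated assumptions: VRAM is
modelled as a Python list of pages, and alloc_contiguous, alloc_chunked,
fragmented_vram and CHUNK_PAGES are hypothetical names invented for the
example. None of it is TTM or amdgpu code, and it does not model eviction or
buffer migration at all.

```python
# Toy model of VRAM as a list of fixed-size pages; True = page in use.
# All names are hypothetical; this is not TTM or amdgpu code.

CHUNK_PAGES = 4  # stands in for the 4MB chunk size mentioned in the thread


def alloc_contiguous(vram, npages):
    """Claim the first free run of npages; return its start index or None."""
    run = 0
    for i, used in enumerate(vram):
        run = 0 if used else run + 1
        if run == npages:
            start = i - npages + 1
            for j in range(start, i + 1):
                vram[j] = True
            return start
    return None


def alloc_chunked(vram, npages, chunk=CHUNK_PAGES):
    """Allocate npages as independent pieces of at most `chunk` pages.

    Returns a list of (start, length) pairs, or None if a piece cannot be
    placed. On failure, pieces already claimed are released again.
    """
    pieces = []
    remaining = npages
    while remaining > 0:
        n = min(chunk, remaining)
        start = alloc_contiguous(vram, n)
        if start is None:
            for s, length in pieces:
                for j in range(s, s + length):
                    vram[j] = False
            return None
        pieces.append((start, n))
        remaining -= n
    return pieces


def fragmented_vram():
    """32 pages: four 4-page holes, each followed by a 4-page used block."""
    vram = []
    for _ in range(4):
        vram += [False] * 4 + [True] * 4
    return vram


if __name__ == "__main__":
    # 16 pages are free in total, but the largest hole is only 4 pages.
    print(alloc_contiguous(fragmented_vram(), 8))  # None
    print(alloc_chunked(fragmented_vram(), 8))     # [(0, 4), (8, 4)]
```

In this model an 8-page request fails as a single contiguous allocation even
though 16 pages are free, while the chunked variant places it in two holes.
Whether the same effect fully explains the freezes reported above is exactly
what the thread leaves open.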
