[PATCH] drm/ttm: fix adding foreign BOs to the LRU during init

Daniel Vetter daniel at ffwll.ch
Mon Jan 11 11:37:03 PST 2016


On Mon, Jan 11, 2016 at 04:54:19PM +0100, Christian König wrote:
> >Out of curiosity, where do you generate so many foreign dma-buf backes
> >objects?
> That was just an experiment of sharing buffers between two AMD devices. A
> stupid reference count bug and I've leaked 60 full HD frame a second.
> 
> Took a bit over two minutes to eat up 1GB of GTT space and the box dying
> rather spectacularly.

Ah ok, that explains it ;-)
-Daniel

> 
> Regards,
> Christian.
> 
> Am 11.01.2016 um 15:55 schrieb Daniel Vetter:
> >On Mon, Jan 11, 2016 at 03:38:23PM +0100, Christian König wrote:
> >>>While this would be one way to work around the GTT space DOS, it
> >>>wouldn't work around the DOS problem of the exporter pinning system
> >>>pages that probably never get accounted. That's a limitation of the
> >>>export-import mechanism (dma-buf).
> >>Yeah, completely agree. That's a problem we sooner or later need to address
> >>as well.
> >>
> >>But currently I worry mostly about that specific problem at hand and how to
> >>fix it, the larger design problems need to wait.
> >>
> >>Just send five patches out to the list and CCing you as well. You can ignore
> >>the last two which are amdgpu specific, but would be nice if you could take
> >>a look at the first three.
> >Out of curiosity, where do you generate so many foreign dma-buf backes
> >objects? With prime I expect at most a few framebuffers on each side to be
> >shared, and that shouldn't really cause much trouble. Just sharing
> >dma-bufs with your own device should result in the underlying native
> >objects being referenced, so work like flink.
> >
> >Wrt fixing this for real: I think step 1 would be stop pinning dma-bufs on
> >import and instead implementing all the map/unmap vfunc hooks ttm already
> >implements (it's fully intentional that they match really closely
> >between) by virtualizing essentially ttm_bo_api.h.
> >
> >After that we need to allow exporters to evict mappings importers have
> >created (using the ww_mutex locking and all that like ttm does for purely
> >native eviction). That likely needs a callback to dma_buf_attachment. With
> >that we should be 90% there, since most likely when the importer tries to
> >bind more stuff exported from somewhere we'll run into limits of the
> >exporter. It's not perfect since it's not (yet) a truly global evict, but
> >should get us a large step towards that goal.
> >
> >If we really need global eviction across all drivers in a system (for e.g.
> >system memory, nothing else should be shared like that really I think)
> >then I guess we could look into either extending ttm_global (or adding
> >something similar at the dma-buf level).
> >
> >Cheers, Daniel
> >
> >>Thanks in advance,
> >>Christian.
> >>
> >>Am 11.01.2016 um 14:00 schrieb Thomas Hellstrom:
> >>>On 01/11/2016 01:39 PM, Christian König wrote:
> >>>>Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
> >>>>>On 01/09/2016 11:46 AM, Christian König wrote:
> >>>>>>It's correct that exported buffers can't be moved to another domain or
> >>>>>>swapped to disk while another device is using it.
> >>>>>>
> >>>>>>But for imports that's a different story:
> >>>>>>>an imported object should never end up on a LRU list in TTM because
> >>>>>>>TTM wouldn't know how to evict it.
> >>>>>>Even currently the imported objects are actually added to the LRU. The
> >>>>>>problem is just that they are not added immediately during creation,
> >>>>>>but only with the first command submission.
> >>>>>Hmm. The fact that they are added to LRU at all sounds weird. What
> >>>>>happens when TTM tries to evict or even swap out an imported buffer?
> >>>>Adding them to the domain LRU is perfectly normal for the imported
> >>>>buffers and also works fine as far as I know.
> >>>>
> >>>>When TTM evicts them it just gets unmapped from GTT to make room in
> >>>>the address space, which at least for Radeon and Amdgpu is exactly
> >>>>what we want.
> >>>>
> >>>>That imported buffers get added to the swap LRU is indeed nonsense,
> >>>>but not harmful as far as I can see. Going to fix that in a follow up
> >>>>patch.
> >>>>
> >>>So the way TTM was designed when it was written was to be used to place
> >>>data in memory types where it could later be mapped by CPU and GPU in a
> >>>coherent fashion.
> >>>
> >>>The actual GPU mapping was primarily thought to be done by the driver as
> >>>part of the validation process *after* that placement if needed.
> >>>
> >>>Now many (most, all) drivers don't need that second step since the
> >>>memory type is directly mappable by the GPU, but in a situation where
> >>>the core TTM functionality (placement and swapping) is completely
> >>>replaced by another mechanism (such as imported buffers), not
> >>>implementing the GPU binding and eviction in the driver as a separate
> >>>step naturally generates a conflict.
> >>>
> >>>Now I'm not at all against TTM being used for GPU mapping only if that
> >>>simplifies things for imported buffers but in that case IMO one should
> >>>be aware of the limitations and we should find a way to mark those
> >>>buffers GPU_MAP_ONLY to avoid having TTM competing with the exporter
> >>>about the placement ad avoid putting it on the swap LRU.
> >>>
> >>>While this would be one way to work around the GTT space DOS, it
> >>>wouldn't work around the DOS problem of the exporter pinning system
> >>>pages that probably never get accounted. That's a limitation of the
> >>>export-import mechanism (dma-buf).
> >>>
> >>>/Thomas
> >>>
> >>>
> >>>
> >>>>Thanks for the comment,
> >>>>Christian.
> >>>>
> >>>>>/Thomas
> >>>>>
> >>>>>
> >>>>>>Regards,
> >>>>>>Christian.
> >>>>>>
> >>>>>>Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
> >>>>>>>Hi!
> >>>>>>>
> >>>>>>>I might be misunderderstanding the use-case here, but IIRC the
> >>>>>>>discussion with TTM vs imported / exported buffer objects is that a
> >>>>>>>buffer object needs to be marked NO_EVICT in TTM before it's exported
> >>>>>>>and an imported object should never end up on a LRU list in TTM
> >>>>>>>because
> >>>>>>>TTM wouldn't know how to evict it.
> >>>>>>>
> >>>>>>>/Thomas
> >>>>>>>   On 01/08/2016 02:41 PM, Christian König wrote:
> >>>>>>>>From: Christian König <christian.koenig at amd.com>
> >>>>>>>>
> >>>>>>>>If we import a BO with an eternal reservation object we don't
> >>>>>>>>reserve/unreserve it. So we never add it to the LRU causing a
> >>>>>>>>possible
> >>>>>>>>deny of service.
> >>>>>>>>
> >>>>>>>>Signed-off-by: Christian König <christian.koenig at amd.com>
> >>>>>>>>---
> >>>>>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
> >>>>>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>>diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>index 745e996..a98a5d5 100644
> >>>>>>>>--- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>+++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>@@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
> >>>>>>>>        if (likely(!ret))
> >>>>>>>>            ret = ttm_bo_validate(bo, placement, interruptible,
> >>>>>>>>false);
> >>>>>>>>    -    if (!resv)
> >>>>>>>>+    if (!resv) {
> >>>>>>>>            ttm_bo_unreserve(bo);
> >>>>>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> >>>>>>>>+        spin_lock(&bo->glob->lru_lock);
> >>>>>>>>+        ttm_bo_add_to_lru(bo);
> >>>>>>>>+        spin_unlock(&bo->glob->lru_lock);
> >>>>>>>>+    }
> >>>>>>>>+
> >>>>>>>>        if (unlikely(ret))
> >>>>>>>>            ttm_bo_unref(&bo);
> >>>
> >>_______________________________________________
> >>dri-devel mailing list
> >>dri-devel at lists.freedesktop.org
> >>http://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the dri-devel mailing list