[Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete
thomas at shipmail.org
Wed Jan 20 13:04:53 PST 2010
Luca Barbieri wrote:
>> Also note that the delayed delete list is not in fence order but in
>> deletion-time order, which perhaps gives room for more optimizations.
> You are right.
> I think then that ttm_bo_delayed_delete may still need to be changed,
> because it stops when ttm_bo_cleanup_refs returns -EBUSY, which
> happens when a fence has not been reached.
> This means that a buffer will need to wait for all previously deleted
> buffers to become unused, even if it is unused itself.
> Is this acceptable?
Yes, I think it's acceptable if you view it in the context that the most
important buffer resources (GPU memory space and physical system memory)
are immediately reclaimable through the eviction and swapping mechanisms.
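
To make the behaviour under discussion concrete, here is a small userspace sketch of a delayed-delete scan that walks the list in deletion-time order and stops at the first busy entry. It is not the TTM code: `sim_bo`, `sim_cleanup_refs` and `sim_delayed_delete` are hypothetical names, fences are reduced to plain sequence numbers, and wraparound is ignored. The only thing borrowed from the thread is the convention that the cleanup step returns -EBUSY when a fence has not been reached.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object on the delayed-delete list. */
struct sim_bo {
    unsigned fence_seq;  /* sequence number the GPU must reach first */
    bool     destroyed;  /* set once the buffer has been released */
};

/*
 * Models only the return convention described in the thread:
 * 0 if the buffer was idle and got destroyed, -EBUSY if its fence
 * has not been reached yet. Sequence wraparound is ignored.
 */
static int sim_cleanup_refs(struct sim_bo *bo, unsigned gpu_seq)
{
    if (bo->fence_seq > gpu_seq)
        return -EBUSY;
    bo->destroyed = true;
    return 0;
}

/*
 * Walk the list in deletion-time order and stop at the first busy
 * entry. Returns the number of buffers destroyed in this pass.
 */
static size_t sim_delayed_delete(struct sim_bo *list, size_t n,
                                 unsigned gpu_seq)
{
    size_t freed = 0;

    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i].destroyed)
            continue;               /* already handled by an earlier pass */
        if (sim_cleanup_refs(&list[i], gpu_seq) == -EBUSY)
            break;                  /* later idle entries must wait */
        freed++;
    }
    return freed;
}
```

With buffers deleted in the order fence 5, fence 9, fence 3 and the GPU at sequence 6, a pass frees only the first buffer: the third is idle but sits behind the busy second one, which is exactly the case asked about above.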
> What if we get rid of the delayed destroy list, and instead append
> buffers to be deleted to their fence object, and delete them when the
> fence is signaled?
> This also allows doing it more naturally, since the fence object can
> just keep a normal reference to the buffers it fences, and unreference
> them on expiration.
> Then there needs to be no special "delayed destruction" logic, and it
> would work as if the GPU were keeping a reference to the buffer
> itself, using fences as a proxy to have the CPU do that work for the
> GPU.
> Then the delayed work is no longer "periodically destroy buffers" but
> rather "periodically check if fences are expired", naturally stopping
> at the first unexpired one.
> Drivers that support IRQs on fences could also do the work in the
> interrupt handler/tasklet instead, avoiding the delay jiffies magic
> number. This may need a NAPI-like interrupt mitigation middle layer
> for optimal results though.
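
The proposal above could be sketched roughly as follows, in plain userspace C. All names (`sim_buf`, `sim_fence`, `sim_fence_attach`, `sim_retire_fences`) are hypothetical and not part of TTM; the fixed-size array, the plain integer refcount and the wraparound-free sequence comparison are simplifying assumptions. Each fence takes a normal reference on the buffers it fences, and a periodic pass retires fences in submission order, stopping at the first unexpired one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_MAX_BOS 4   /* arbitrary limit for the sketch */

struct sim_buf {
    int  refcount;
    bool destroyed;     /* set when the last reference is dropped */
};

struct sim_fence {
    unsigned        seq;               /* GPU sequence that signals it */
    struct sim_buf *bos[SIM_MAX_BOS];  /* buffers this fence references */
    size_t          nbos;
    bool            retired;
};

/* Drop one reference; the buffer is destroyed with the last one. */
static void sim_buf_unref(struct sim_buf *b)
{
    assert(b->refcount > 0);
    if (--b->refcount == 0)
        b->destroyed = true;
}

/* The fence takes its own reference on the buffer it fences. */
static bool sim_fence_attach(struct sim_fence *f, struct sim_buf *b)
{
    if (f->nbos == SIM_MAX_BOS)
        return false;
    b->refcount++;
    f->bos[f->nbos++] = b;
    return true;
}

/*
 * Periodic check: retire fences in submission order and stop at the
 * first one the GPU has not reached. Returns the number retired.
 */
static size_t sim_retire_fences(struct sim_fence *fences, size_t n,
                                unsigned gpu_seq)
{
    size_t retired = 0;

    for (size_t i = 0; i < n; i++) {
        struct sim_fence *f = &fences[i];

        if (f->retired)
            continue;
        if (f->seq > gpu_seq)
            break;                      /* first unexpired fence */
        for (size_t j = 0; j < f->nbos; j++)
            sim_buf_unref(f->bos[j]);   /* release the fence's references */
        f->nbos = 0;
        f->retired = true;
        retired++;
    }
    return retired;
}
```

In this model a buffer whose user handle has already been dropped stays alive only through the fence's reference and goes away when that fence is retired, so no separate delayed-destroy list is needed.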
Yes, I think that this way, it should definitely be possible to find a
more optimal solution. One should keep in mind, however, that we'll
probably not be able to destroy buffers from within an atomic context,
which means we have to schedule a workqueue to do that task. We had to
do a similar thing in the Poulsbo driver and it turned out that we could
save a significant amount of CPU by using a delayed workqueue,
collecting objects and destroying them periodically.
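
As a rough illustration of the collect-then-destroy pattern, here is a userspace sketch. It does not reproduce the Poulsbo driver code: `sim_obj`, `sim_reaper`, `sim_reaper_defer` and `sim_reaper_run` are made-up names, locking is omitted, and the periodic worker is just a function the caller invokes. In a kernel driver the second function would presumably run from a delayed workqueue and the list would need protection against concurrent access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object that cannot be destroyed where it is released. */
struct sim_obj {
    struct sim_obj *next;
};

struct sim_reaper {
    struct sim_obj *pending;   /* objects collected since the last run */
    size_t          npending;
    size_t          runs;      /* number of worker invocations */
};

/*
 * "Atomic context" side: only queue the object. This is O(1) and
 * performs no destruction work itself.
 */
static void sim_reaper_defer(struct sim_reaper *r, struct sim_obj *o)
{
    assert(r != NULL && o != NULL);
    o->next = r->pending;
    r->pending = o;
    r->npending++;
}

/*
 * Worker side: destroy everything collected so far in one batch.
 * Returns the number of objects destroyed.
 */
static size_t sim_reaper_run(struct sim_reaper *r)
{
    struct sim_obj *o = r->pending;
    size_t freed = 0;

    r->pending = NULL;
    r->npending = 0;
    r->runs++;
    while (o != NULL) {
        struct sim_obj *next = o->next;

        free(o);
        freed++;
        o = next;
    }
    return freed;
}
```

The point of the split is that one worker invocation handles however many objects piled up during the delay, instead of scheduling work once per object.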