[Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

Wed Jan 20 13:04:53 PST 2010

Luca Barbieri wrote:
>> Also note that the delayed delete list is not in fence order but in
>> deletion-time order, which perhaps gives room for more optimizations.
>>     
> You are right.
> I think then that ttm_bo_delayed_delete may still need to be changed,
> because it stops when ttm_bo_cleanup_refs returns -EBUSY, which
> happens when a fence has not been reached.
> This means that a buffer will need to wait for all previously deleted
> buffers to become unused, even if it is unused itself.
> Is this acceptable?
>   

Yes, I think it's acceptable if you view it in the context that the most 
important buffer resources (GPU memory space and physical system memory) 
are immediately reclaimable through the eviction and swapping mechanisms.
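
To make the behaviour under discussion concrete, here is a minimal
userspace sketch of a scan that walks the list in deletion-time order and
stops at the first busy entry. It is not the actual TTM code: the dd_*
names and the struct are hypothetical stand-ins, and only the "-EBUSY
means the fence has not been reached" convention is taken from the
discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer on the delayed-delete list. */
struct dd_bo {
    bool fence_signaled; /* has the GPU finished with this buffer? */
    bool destroyed;      /* set once the buffer has been released */
};

/* Stand-in for ttm_bo_cleanup_refs: -EBUSY while the fence is pending. */
static int dd_cleanup_refs(struct dd_bo *bo)
{
    if (!bo->fence_signaled)
        return -EBUSY;
    bo->destroyed = true;
    return 0;
}

/*
 * Walk the list in deletion-time order and stop at the first busy entry.
 * Returns the number of buffers destroyed during this pass.
 */
static size_t dd_delayed_delete(struct dd_bo *list, size_t n)
{
    size_t freed = 0;

    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i].destroyed)
            continue; /* already released by an earlier pass */
        if (dd_cleanup_refs(&list[i]) == -EBUSY)
            break;    /* later entries wait, even if they are idle */
        freed++;
    }
    return freed;
}
```

With three entries where only the middle one is still fenced, the first
pass releases just the first entry; the third one is idle but has to wait
until the second one's fence signals.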

> What if we get rid of the delayed destroy list, and instead append
> buffers to be deleted to their fence object, and delete them when the
> fence is signaled?
>
> This also allows doing it more naturally, since the fence object can
> just keep a normal reference to the buffers it fences, and unreference
> them on expiration.
>
> Then there needs to be no special "delayed destruction" logic, and it
> would work as if the GPU were keeping a reference to the buffer
> itself, using fences as a proxy to have the CPU do that work for the
> GPU.
>
> Then the delayed work is no longer "periodically destroy buffers" but
> rather "periodically check if fences are expired", naturally stopping
> at the first unexpired one.
> Drivers that support IRQs on fences could also do the work in the
> interrupt handler/tasklet instead, avoiding the delay jiffies magic
> number. This may need a NAPI-like interrupt mitigation middle layer
> for optimal results though.
>
>   

Yes, I think that this way, it should definitely be possible to find a 
more optimal solution. One should keep in mind, however, that we'll 
probably not be able to destroy buffers from within an atomic context, 
which means we have to schedule a workqueue to do that task. We had to 
do a similar thing in the Poulsbo driver and it turned out that we could 
save a significant amount of CPU by using a delayed workqueue, 
collecting objects and destroying them periodically.
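
As a rough illustration of how the two ideas could combine, the userspace
sketch below lets a fence hold an ordinary reference to each buffer it
fences. Signaling the fence only drops references and queues buffers whose
count reached zero, and a separate worker destroys the queued buffers in a
batch. All fb_* names, the fixed-size arrays and the plain integer
refcount are assumptions made for illustration. Real kernel code would
need locking, proper reference counting and a delayed workqueue.

```c
#include <assert.h>
#include <stddef.h>

#define FB_MAX_BOS 8

/* Hypothetical buffer object with a plain (non-atomic) refcount. */
struct fb_bo {
    int refcount;
    int destroyed;
};

/* Hypothetical fence that references the buffers it protects. */
struct fb_fence {
    struct fb_bo *bos[FB_MAX_BOS];
    size_t count;
};

/* Buffers whose last reference is gone, awaiting the worker. */
struct fb_destroy_queue {
    struct fb_bo *pending[FB_MAX_BOS];
    size_t count;
};

/* The fence takes its own reference, as if the GPU were holding one. */
static int fb_fence_attach(struct fb_fence *f, struct fb_bo *bo)
{
    if (f->count == FB_MAX_BOS)
        return -1;
    bo->refcount++;
    f->bos[f->count++] = bo;
    return 0;
}

/* Drop one reference; on the last one, only queue the buffer. */
static void fb_bo_unref(struct fb_bo *bo, struct fb_destroy_queue *q)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0) {
        assert(q->count < FB_MAX_BOS);
        q->pending[q->count++] = bo;
    }
}

/* May run in atomic context: drops references, never destroys. */
static void fb_fence_signal(struct fb_fence *f, struct fb_destroy_queue *q)
{
    for (size_t i = 0; i < f->count; i++)
        fb_bo_unref(f->bos[i], q);
    f->count = 0;
}

/* Periodic worker in process context: destroys the whole batch. */
static size_t fb_destroy_worker(struct fb_destroy_queue *q)
{
    size_t n = q->count;

    for (size_t i = 0; i < n; i++)
        q->pending[i]->destroyed = 1;
    q->count = 0;
    return n;
}
```

The point of the split is that fb_fence_signal does a bounded amount of
cheap work, while the expensive teardown is collected and done in one go
by fb_destroy_worker.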

/Thomas