[PATCH 1/2] drm/ttm: add a way to bo_wait for either the last read or last write

Sat Oct 8 01:14:43 PDT 2011

On 10/07/2011 11:30 PM, Marek Olšák wrote:
> On Fri, Oct 7, 2011 at 3:38 PM, Jerome Glisse<j.glisse at gmail.com>  wrote:
>    
>> I should have looked at the patch long ago ... anyway, I think a better
>> approach would be to expose the fence id as a 64-bit unsigned value to each
>> userspace client. I was thinking of mapping a page read-only (the same page
>> the GPU writes back to) at some specific offset in the drm file (a bit
>> like the sarea, but read-only so no lock). Each time userspace submits a
>> command stream it would get the fence id associated with the command
>> stream. It would then be up to userspace to track between read and write
>> and do the appropriate things. The kernel code would be simple (the biggest
>> issue is finding an offset we can use for that), and it would be fast as
>> there is no round trip to the kernel to learn the last fence.
>>
>> Each fence seq would be valid only for a specific ring (only future
>> GPUs are impacted here, maybe cayman).
>>
>> So no change to ttm, just a change to radeon to return the fence seq and to
>> move to an unsigned 64-bit value. The issue would be when GPU writeback is
>> disabled; then we would either need userspace to call something like
>> bo wait or some other ioctl to get the kernel to update the copy. The copy
>> would be updated in the irq handler too, so at least it gets updated
>> any time something enables the irq.
>>      
> I am having a hard time understanding what you are saying.
>
> Anyway, I had some read and write usage tracking in the radeon winsys.
> That worked well for driver-private resources, but it was a total fail
> for the ones shared with the DDX. I needed this bo_wait optimization
> to work with multiple processes. That's the whole point why I am doing
> this in the kernel.
>
> Marek
>    
At one XDS meeting in Cambridge an IMHO questionable decision was taken to
try to keep synchronization operations like this in user-space,
communicating necessary info among involved components. In this case you'd
then need to send fence sequences down the DRI2 protocol to the winsys.

However, if you at one point want to do user-space suballocation of
kernel buffers, that's something you need to do anyway, because the kernel
is not aware that user-space fences the suballocations separately.

/Thomas
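
For readers following the thread, the user-space tracking discussed above
might be sketched roughly as follows. This is only an illustration under
stated assumptions, not radeon or ttm code: every name here (struct bo_usage,
bo_usage_note_submit, bo_idle_for_cpu, the BO_ACCESS_* flags) is hypothetical,
a single ring is assumed so that sequence numbers are directly comparable,
and signaled_seq stands in for the 64-bit value user-space would read from
the read-only page. It only covers buffers private to one process; for
buffers shared with the DDX the sequences would still have to be passed
between processes, as noted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access flags for a buffer in a command stream. */
enum bo_access {
    BO_ACCESS_READ  = 1,
    BO_ACCESS_WRITE = 2
};

/* Hypothetical per-buffer bookkeeping kept entirely in user-space. */
struct bo_usage {
    uint64_t last_read_seq;   /* fence seq of the last submission reading it */
    uint64_t last_write_seq;  /* fence seq of the last submission writing it */
};

/* Record the fence seq returned for a submission that uses the buffer. */
void bo_usage_note_submit(struct bo_usage *u, unsigned access, uint64_t seq)
{
    if (access & BO_ACCESS_READ)
        u->last_read_seq = seq;
    if (access & BO_ACCESS_WRITE)
        u->last_write_seq = seq;
}

/*
 * Decide whether the CPU may touch the buffer without waiting.
 * A CPU read only has to wait for the last GPU write; a CPU write
 * has to wait for both the last GPU read and the last GPU write.
 * signaled_seq is the most recent seq the GPU has completed
 * (assumed to come from the read-only page).
 */
bool bo_idle_for_cpu(const struct bo_usage *u, unsigned cpu_access,
                     uint64_t signaled_seq)
{
    uint64_t needed = u->last_write_seq;

    if ((cpu_access & BO_ACCESS_WRITE) && u->last_read_seq > needed)
        needed = u->last_read_seq;

    return signaled_seq >= needed;
}
```

With a write at seq 5 and a later read at seq 7, a CPU read can proceed once
seq 5 has signaled, while a CPU write has to wait for seq 7. That is the same
distinction the bo_wait patch makes in the kernel, which is why the two
approaches differ mainly in who holds the bookkeeping.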