[PATCH 1/2] drm/ttm: add a way to bo_wait for either the last read or last write
thomas at shipmail.org
Tue Oct 4 04:48:06 PDT 2011
On 08/07/2011 10:39 PM, Marek Olšák wrote:
> Sometimes we want to know whether a buffer is busy and wait for it (bo_wait).
> However, sometimes it would be more useful to be able to query whether
> a buffer is busy and being either read or written, and wait until it's stopped
> being either read or written. The point of this is to be able to avoid
> unnecessary waiting, e.g. if a GPU has written something to a buffer and is now
> reading that buffer, and a CPU wants to map that buffer for read, it needs to
> only wait for the last write. If there were no write, there wouldn't be any
> waiting needed.
> This, of course, requires user space drivers to send read/write flags
> with each relocation (like we have read/write domains in radeon, so we can
> actually use those for something useful now).
> Now how this patch works:
> The read/write flags should be passed to ttm_validate_buffer. TTM maintains
> separate sync objects of the last read and write for each buffer, in addition
> to the sync object of the last use of a buffer. ttm_bo_wait then operates
> with one of the sync objects.
> Signed-off-by: Marek Olšák<maraeo at gmail.com>
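
For readers following along, the scheme described in the patch text above can be sketched roughly as below. This is a user-space illustration only, not the actual TTM code: struct sketch_fence, struct sketch_bo, enum sketch_usage and both functions are hypothetical names made up for this sketch, and a real sync object is driver-specific rather than a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver sync object (fence). */
struct sketch_fence {
    bool signaled;
};

/* Which kind of earlier GPU access the caller wants to wait for. */
enum sketch_usage {
    SKETCH_USAGE_READ,      /* the last GPU read */
    SKETCH_USAGE_WRITE,     /* the last GPU write */
    SKETCH_USAGE_READWRITE  /* the last use of any kind */
};

/* Hypothetical buffer object tracking the three sync objects. */
struct sketch_bo {
    struct sketch_fence *sync_obj;        /* last use */
    struct sketch_fence *sync_obj_read;   /* last read */
    struct sketch_fence *sync_obj_write;  /* last write */
};

/* Select the sync object that a wait with the given usage operates on. */
struct sketch_fence *sketch_bo_pick_fence(const struct sketch_bo *bo,
                                          enum sketch_usage usage)
{
    assert(bo != NULL);
    switch (usage) {
    case SKETCH_USAGE_READ:
        return bo->sync_obj_read;
    case SKETCH_USAGE_WRITE:
        return bo->sync_obj_write;
    default:
        return bo->sync_obj;
    }
}

/* True if the buffer is still busy with respect to the given usage. */
bool sketch_bo_is_busy(const struct sketch_bo *bo, enum sketch_usage usage)
{
    const struct sketch_fence *fence = sketch_bo_pick_fence(bo, usage);

    /* No fence attached means nothing to wait for. */
    return fence != NULL && !fence->signaled;
}
```

In the case from the description (the GPU has finished writing and is still reading), a CPU map for read would ask about SKETCH_USAGE_WRITE and find the buffer idle, even though the last-use fence is still pending.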
Bah, I totally missed this patch and thus never reviewed it :( Probably
There are a couple of things I'd like to point out.
1) The bo subsystem may never assume that fence objects are ordered, so
when we unref bo::sync_obj, we may never assume that previously attached
fence objects are signaled and can be unref'd.
Think, for example, of fence objects submitted to different command
streams.
This is a bug and must be fixed.
We can detach fence objects from buffers in the driver validation code,
because that code knows whether fences are implicitly ordered, or can
order them either by inserting a barrier (semaphore in NV language) or
waiting for the fence to expire. (For example if the new validation is
READ and the fence currently attached is WRITE, we might need to
schedule a gpu cache flush before detaching the write fence).
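
A rough sketch of the kind of check the driver validation code could make before detaching an old fence follows. It is only an illustration under stated assumptions: struct stream_fence, enum sketch_detach_action and sketch_detach_check are hypothetical names, and the rule that fences on the same command stream are implicitly ordered by sequence number is an assumption of this sketch, not something every driver guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence: identified by command stream and sequence number. */
struct stream_fence {
    unsigned int stream;  /* command stream the fence was submitted to */
    unsigned int seq;     /* position within that stream */
    bool signaled;        /* has the GPU passed this fence yet? */
};

enum sketch_detach_action {
    SKETCH_DETACH_NOW,   /* old fence can be detached right away */
    SKETCH_ORDER_FIRST   /* insert a barrier or wait before detaching */
};

/*
 * Decide whether old_fence may be detached from a buffer when new_fence
 * is about to be attached. Assumption: same stream implies implicit
 * ordering by sequence number.
 */
enum sketch_detach_action
sketch_detach_check(const struct stream_fence *old_fence,
                    const struct stream_fence *new_fence)
{
    assert(new_fence != NULL);

    /* Nothing attached, or already signaled: always safe to detach. */
    if (old_fence == NULL || old_fence->signaled)
        return SKETCH_DETACH_NOW;

    /* Same stream and not later than the new fence: implicitly ordered. */
    if (old_fence->stream == new_fence->stream &&
        old_fence->seq <= new_fence->seq)
        return SKETCH_DETACH_NOW;

    /* Different streams (or out of order): the driver must order them. */
    return SKETCH_ORDER_FIRST;
}
```

On SKETCH_ORDER_FIRST the driver would pick one of the options mentioned above, a barrier or a wait, plus any cache flush it needs, before dropping the old fence. The generic bo code cannot make that choice because it does not know about streams.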
2) Can't we say that a write_sync_obj is simply a sync_obj? What's the
difference between those two? I think we should remove the
write_sync_obj bo member.
3) Ideally we should have a linked list of read sync objects, with their
own sync_obj_arg, but since there apparently aren't any consumers yet,
we could wait with that.