[PATCH 1/2] drm/ttm: add a way to bo_wait for either the last read or last write
thomas at shipmail.org
Tue Oct 4 22:54:57 PDT 2011
On 10/05/2011 04:08 AM, Marek Olšák wrote:
> On Tue, Oct 4, 2011 at 1:48 PM, Thomas Hellstrom <thomas at shipmail.org> wrote:
>> Bah, I totally missed this patch and thus never reviewed it :( Probably on
>> There are a couple of things I'd like to point out.
>> 1) The bo subsystem may never assume that fence objects are ordered, so that when we unref bo::sync_obj, we may never assume that previously attached fence objects are signaled and can be unref'd.
>> Think for example of fence objects submitted to different command streams. This is a bug and must be fixed.
> If what you say is true, then even the original sync_obj can't be
> trusted. What if I overwrite sync_obj with a new one and the new one
> is signalled sooner than the old one?
The driver validation code will in effect overwrite the old with a new
one, because the driver validation code knows what sync objects are
ordered. If, during validation of a buffer object, the driver validation
code detects that the buffer is already fenced with a sync object that
will signal out-of-order, the driver validation code needs to *wait* for
that sync object to signal before proceeding, or insert a sync object
barrier in the command stream.
The TTM bo code doesn't know how to order fences, and never assumes that
they are ordered.
>> We can detach fence objects from buffers in the driver validation code,
>> because that code knows whether fences are implicitly ordered, or can order
>> them either by inserting a barrier (semaphore in NV language) or waiting
> I am not sure I follow you here. ttm_bo_wait needs the fences...
> unless we want to move the fences out of TTM into drivers.
Please see the above explanation.
>> for the fence to expire. (For example if the new validation is READ and the
>> fence currently attached is WRITE, we might need to schedule a gpu cache
>> flush before detaching the write fence).
> I am not sure what fences have to do with flushing. Write caches
> should be flushed automatically when resources are unbound. When a
> resource is used for write and read at the same time, it's not our
> problem: the user is responsible for flushing (e.g. through memory and
> texture barriers in OpenGL), not the driver.
How flushing is done is up to the driver writer (fences are an excellent
tool for doing it efficiently), but barriers like the write-read
barrier example above may need to be inserted for various reasons. Let's
say you use render-to-texture, unbind the texture from the fbo, and then
want to texture from it. At some point you *need* to flush if you have a
write cache, and that flush needs to happen when you remove the write
fence from the buffer, in order to replace it with a read fence, since
after that the information that the buffer has been written to is gone.
IIRC nouveau uses barriers like this to order fences from different
command streams, Unichrome used them to order fences from different
In any case, I'm not saying fences are the best way to flush, but since
the bo code assumes that signaling a sync object means "make the buffer
contents available for CPU read / write", it's usually a good way to do
it; there's even a sync_obj_flush() method that gets called when a
potential flush needs to happen.
>> 2) Can't we say that a write_sync_obj is simply a sync_obj? What's the
>> difference between those two? I think we should remove the write_sync_obj bo
> Okay, but I think we should remove sync_obj instead, and keep write
> and read sync objs. In the case of READWRITE usage, read_sync_obj
> would be equal to write_sync_obj.
Sure, I'm fine with that.
One other thing, though, that makes me a little puzzled:
Let's assume you don't allow readers and writers at the same time, which
is my perception of how read- and write fences should work; you either
have a list of read fences or a single write fence (in the same way a
read-write lock works).
Now, if you only allow a single read fence, as in this patch, that
implies that you can only have either a single read fence or a single
write fence at any one time. We'd then only need a single fence pointer
on the bo, and sync_obj_arg would tell us whether to signal the fence
for read or for write (assuming that sync_obj_arg was set up to indicate
read / write at validation time); in that case the patch really isn't
necessary at all, as it only allows a single read fence anyway?
Or is it that you want to allow read- and write fences co-existing? In
that case, what's the use case?