[PATCH 1/2] drm/ttm: add a way to bo_wait for either the last read or last write

Tue Oct 4 22:54:57 PDT 2011

On 10/05/2011 04:08 AM, Marek Olšák wrote:
> On Tue, Oct 4, 2011 at 1:48 PM, Thomas Hellstrom<thomas at shipmail.org>  wrote:
>    
>> Bah, I totally missed this patch and thus never reviewed it :( Probably on
>> vacation.
>>
>> There are a couple of things I'd like to point out.
>>
>> 1) The bo subsystem may never assume that fence objects are ordered, so that
>> when we unref
>> bo::sync_obj, we may never assume that previously attached fence objects are
>> signaled and can be unref'd
>> Think for example fence objects submitted to different command streams. This
>> is a bug and must be fixed.
>>      
> If what you say is true, then even the original sync_obj can't be
> trusted. What if I overwrite sync_obj with a new one and the new one
> is signalled sooner than the old one?
>    

The driver validation code will in effect overwrite the old with a new 
one, because the driver validation code knows what sync objects are 
ordered. If, during validation of a buffer object, the driver validation 
code detects that the buffer is already fenced with a sync object that 
will signal out-of-order, the driver validation code needs to *wait* for 
that sync object to signal before proceeding, or insert a sync object 
barrier in the command stream.

The TTM bo code doesn't know how to order fences, and never assumes that 
they are ordered.

>
>    
>> We can detach fence objects from buffers in the driver validation code,
>> because that code knows whether fences are implicitly ordered, or can order
>> them either by inserting a barrier (semaphore in NV languange) or waiting
>>      
> I am not sure I follow you here. ttm_bo_wait needs the fences...
> unless we want to move the fences out of TTM into drivers.
>    

Please see the above explanation.

>
>    
>> for the fence to expire. (For example if the new validation is READ and the
>> fence currently attached is WRITE, we might need to schedule a gpu cache
>> flush before detaching the write fence).
>>      
> I am not sure what fences have to do with flushing. Write caches
> should be flushed automatically when resources are unbound. When a
> resource is used for write and read at the same time, it's not our
> problem: the user is responsible for flushing (e.g. through memory and
> texture barriers in OpenGL), not the driver.
>    

How flushing is done is up to the driver writer, (fences is an excellent 
tool to do it in an efficient way), but barriers like the write-read 
barrier example above may need to be inserted for various reasons. Let's 
say you use render-to-texture, unbind the texture from the fbo, and then 
want to texture from it. At some point you *need* to flush if you have a 
write cache, and that flush needs to happen when you remove the write 
fence from the buffer, in order to replace it with a read fence, since 
after that the information that the buffer has been written to is gone. 
IIRC nouveau uses barriers like this to order fences from different 
command streams, Unichrome used it to order fences from different 
hardware engines.

In any case, I'm not saying fences is the best way to flush but since 
the bo code assumes that signaling a sync object means "make the buffer 
contents available for CPU read / write", it's usually a good way to do 
it; there's even a sync_obj_flush() method that gets called when a 
potential flush needs to happen.

>
>    
>> 2) Can't we say that a write_sync_obj is simply a sync_obj? What's the
>> difference between those two? I think we should remove the write_sync_obj bo
>> member.
>>      
> Okay, but I think we should remove sync_obj instead, and keep write
> and read sync objs. In the case of READWRITE usage, read_sync_obj
> would be equal to write_sync_obj.
>
>    

Sure, I'm fine with that.

One other thing, though, that makes me a little puzzled:

Let's assume you don't allow readers and writers at the same time, which 
is my perception of how read- and write fences should work; you either 
have a list of read fences or a single write fence (in the same way a 
read-write lock works).

Now, if you only allow a single read fence, like in this patch. That 
implies that you can only have either a single read fence or a single 
write fence at any one time. We'd only need a single fence pointer on 
the bo, and sync_obj_arg would tell us whether to signal the fence for 
read or for write (assuming that sync_obj_arg was set up to indicate 
read / write at validation time), then the patch really isn't necessary 
at all, as it only allows a single read fence?

Or is it that you want to allow read- and write fences co-existing? In 
that case, what's the use case?

Thanks,
Thomas

> Marek
>