[PATCH] drm/ttm: pass buffer object for bind/unbind callback

Thomas Hellstrom thellstrom at vmware.com
Sat Nov 19 13:01:05 PST 2011


On 11/19/2011 09:37 PM, Jerome Glisse wrote:
> On Sat, Nov 19, 2011 at 2:46 PM, Thomas Hellstrom <thellstrom at vmware.com> wrote:
>    
>> On 11/19/2011 07:11 PM, Jerome Glisse wrote:
>>
>> On Sat, Nov 19, 2011 at 12:00 PM, Thomas Hellstrom
>> <thellstrom at vmware.com>  wrote:
>>
>> On 11/19/2011 03:53 PM, Ben Skeggs wrote:
>>
>> On Sat, 2011-11-19 at 11:07 +0100, Thomas Hellstrom wrote:
>>
>> On 11/19/2011 01:26 AM, Ben Skeggs wrote:
>>
>> On Fri, 2011-11-18 at 23:48 +0100, Thomas Hellstrom wrote:
>>
>> On 11/18/2011 06:26 PM, Ben Skeggs wrote:
>>
>> On Fri, 2011-11-18 at 15:30 +0100, Thomas Hellstrom wrote:
>>
>> On 11/18/2011 02:15 PM, Ben Skeggs wrote:
>>
>> On Fri, 2011-11-18 at 08:57 +0100, Thomas Hellstrom wrote:
>>
>> Jerome,
>>
>> I don't like this change for the following reasons:
>>
>> -snip-
>>
>> One can take advantage of the move_notify callback, but there are
>> corner cases where bind/unbind might be called without move_notify
>> being called (mostly in error paths). So to make sure that each
>> virtual address space has a consistent view of whether a buffer
>> object is backed by system pages, it's better to pass the buffer
>> object to bind/unbind.
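>>
>> In sketch form, the signature change would be something like this
>> (illustration only, not the literal patch):
>>
>>     /* before (roughly): the backend callback only sees the ttm */
>>     int (*bind)(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem);
>>
>>     /* after: also pass the buffer object, so the driver can keep
>>      * each GPU virtual address space consistent even on error
>>      * paths where move_notify() never ran */
>>     int (*bind)(struct ttm_buffer_object *bo, struct ttm_tt *ttm,
>>                 struct ttm_mem_reg *bo_mem);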
>>
>> As I discussed previously with Jerome on IRC, I believe the
>> move_notify hook is sufficient.  I fixed a couple of issues with it
>> back when I implemented support for this in nouveau.
>>
>> Jerome did point out two issues, however, which I believe can be
>> solved easily enough.
>>
>> The first was that the error path after move_notify() has been called
>> doesn't reverse the move_notify() too, leaving incorrect page table
>> entries.  This, I think, could easily be solved by swapping bo->mem
>> and mem, and re-calling move_notify() to have the driver reverse
>> whatever it did.
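>>
>> Something like this in the error path (a sketch; bo, mem, bdev and
>> ret stand for whatever the locals are actually called there):
>>
>>     if (unlikely(ret != 0)) {
>>         /* swap bo->mem and the target mem back, then re-call
>>          * move_notify() so the driver reverses whatever it did */
>>         struct ttm_mem_reg tmp_mem = *mem;
>>
>>         *mem = bo->mem;
>>         bo->mem = tmp_mem;
>>         bdev->driver->move_notify(bo, mem);
>>     }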
>>
>> The second issue is that apparently move_notify() doesn't get called
>> in the destroy path.  Nouveau's GEM layer takes care of this for our
>> userspace bos, and we don't use any kernel bos that are mapped into a
>> channel's address space, so we don't hit it.  This can probably be
>> solved easily enough too, I expect.
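>>
>> For instance (a sketch of one possible fix, abbreviated, not actual
>> TTM code), the destroy path could notify with a NULL mem reg meaning
>> "the bo is going away, kill any mappings":
>>
>>     static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
>>     {
>>         /* a NULL new_mem tells the driver to tear everything down */
>>         if (bo->bdev->driver->move_notify)
>>             bo->bdev->driver->move_notify(bo, NULL);
>>         ttm_bo_mem_put(bo, &bo->mem);
>>     }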
>>
>> Ben.
>>
>> OK. I understand. Surely if a move_notify is missing somewhere, that
>> can easily be fixed.
>>
>> It might be good if we can outline how the virtual tables are set up.
>> In my world, the following would happen:
>>
>> 1) Pre command submission:
>>
>> a) Reserve the bo.
>> b) Call ttm_bo_validate to set placement. move_notify will tear down
>> any existing GPU page tables for the bo.
>> c) Set up GPU page tables.
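>>
>> Roughly like this (drv_* and vm_mapped are made-up driver-side
>> names, and I'm quoting the TTM calls from memory):
>>
>>     /* a) + b): our move_notify() tears down any existing GPU page
>>      * tables for the bo and clears a driver-side flag */
>>     ret = ttm_bo_reserve(bo, true, false, false, 0);
>>     if (ret)
>>         return ret;
>>     ret = ttm_bo_validate(bo, &placement, true, false, false);
>>     if (ret)
>>         goto out_unreserve;
>>
>>     /* c): rebuild the mapping only if it was torn down; a single
>>      * flag test, so next to free on the common path */
>>     if (!dbo->vm_mapped)
>>         ret = drv_vm_map(vm, dbo);    /* sets dbo->vm_mapped */
>> out_unreserve:
>>     ttm_bo_unreserve(bo);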
>>
>> Nouveau doesn't do this exactly.  I wanted to avoid having to check
>> if a bo was bound to a particular address space on every command
>> submission.
>>
>> Perhaps it might be a good idea to revisit this?
>> I think using move_notify to set up a new placement before the data
>> has actually been moved is very fragile and not the intended use.
>> That would also save you from having to handle error paths. Also,
>> how do you handle swapouts?
>>
>> I don't see how it's fragile; there's only the one error path after
>> that point that needs to undo it.  And, what *is* the expected use
>> then?  I see it as "tell the driver the buffer is moving so it can do
>> whatever is necessary as a result".
>>
>> Swapouts are a missed case though, indeed.
>>
>> A quick check in c) that the virtual map hasn't been torn down by a
>> move_notify, and a rebuild in that case, would probably be next to
>> free. The virtual GPU mapping would then be valid only after
>> validation and while the bo is fenced or pinned.
>>
>> This alone doesn't solve the swapout issue either, unless you also
>> want us to unmap once a buffer becomes unfenced, which would lead to
>> loads of completely unnecessary map/unmaps.
>>
>> Ben.
>>
>> I think you misunderstand me a little.
>> The swapout issue should be handled by calling a move_notify() to kill
>> the virtual GPU mappings.
>>
>> However, when moving data that has page tables pointing to it, one
>> should (IMHO) do:
>>
>> 1) Kill the old page tables
>> 2) Move the object
>> 3) Set up new page tables on demand.
>>
>> This is done by TTM for CPU page tables, and I think it should be
>> done for GPU page tables too. move_notify() should be responsible
>> for 1), the driver execbuf for 3), and 3) only needs to be done for
>> the particular ring / fifo that is executing commands touching the
>> buffer.
>>
>> This leaves a clear and well-defined requirement on TTM:
>> Notify the driver when it needs to kill its GPU page tables.
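>>
>> On the driver side, 1) would then be just something like (made-up
>> drv_* names, nothing nouveau- or radeon-specific):
>>
>>     static void drv_bo_move_notify(struct ttm_buffer_object *bo,
>>                                    struct ttm_mem_reg *new_mem)
>>     {
>>         struct drv_bo *dbo = container_of(bo, struct drv_bo, bo);
>>
>>         /* 1) kill the old GPU page tables; execbuf redoes 3) on
>>          * demand for the ring that actually touches the bo */
>>         drv_vm_unmap(dbo);
>>         dbo->vm_mapped = false;
>>     }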
>>
>> Ok.  This I don't really object to.  I read the previous mail as
>> saying that GPU mappings should only exist as long as the buffer is
>> fenced, which would be a ridiculous amount of overhead.
>>
>> I agree. What I meant was that the page tables wouldn't be considered
>> valid once the fence had signaled. However, that was actually
>> incorrect, because they would remain valid until the next
>> move_notify(). The intention was never to tear down the mappings once
>> the fence had signaled; that would have been pretty stupid.
>>
>> With the latest patch from Jerome:
>> Notify the driver when it needs to rebuild its page tables. Also on
>> error paths, but not for swapins, because probably no driver will set
>> up GPU page tables to SYSTEM_MEMORY.
>>
>> This is what I mean by fragile, and I'd much rather see the other
>> approach.
>>
>> Ben, I didn't fully understand why you didn't want to use that
>> approach. Did you see a significant overhead with it?
>>
>> Now I'm clearer on what you meant; no, it wouldn't be a lot of
>> overhead.  However, move_notify() was never intended for the use
>> you're proposing now, or the new ttm_mem_reg would never have been
>> passed in as a parameter...
>>
>> I suppose you're right. Still, I think this is the right way to go,
>> since it has the following advantages:
>>
>> 1) TTM doesn't need to care about the driver re-populating its GPU
>> page tables. Since swapin is handled from the tt layer, not the bo
>> layer, this makes it a bit easier on us.
>> 2) The transition to page-faulted GPU virtual maps is straightforward
>> and consistent. A non-page-faulting driver sets up the maps at CS
>> time; a page-faulting driver can set them up directly from an irq
>> handler without reserving, since the bo is properly fenced or pinned
>> when the pagefault happens.
>> 3) A non-page-faulting driver knows at CS time exactly which page
>> table entries really do need populating, and can do this more
>> efficiently.
>>
>> So unless there are strong objections, I suggest we go this way.
>>
>> Even if this changes the semantics of the TTM <-> driver interface
>> compared to how Nouveau currently does things, it means that Jerome's
>> current patch is basically correct and no longer makes any
>> assumptions about SYSTEM memory never being used for virtual GPU
>> maps.
>>
>> Thanks,
>> Thomas.
>>
>> I think it's better to let the driver choose how it will handle its
>> virtual page tables. For radeon I update in move_notify in order to
>> minimize the number of updates. I don't want to track whether an
>> object has been updated or not against each page table. Of course
>> this adds possibly useless overhead to move_notify, as we might
>> update the page tables too many times (bo going vram->gtt->system).
>>
>> I think if move_notify is properly called once for each effective
>> move, drivers can do their own dance behind the curtain.
>>
>> Cheers,
>> Jerome
>>
>> Jerome,
>>
>> It's not really what the driver does that is the problem, it's what
>> the driver expects from the driver <-> TTM interface.
>>
>> That's why I'd really like a maintainable interface design before
>> coding patches that make the interface a set of various workarounds.
>>
>> Enforcing this will also force driver writers to think twice before
>> implementing things in their own way, adding yet another load of ad
>> hoc callbacks to TTM without a clear specification, whose only
>> purpose is to fit a specific driver implementation. So in a way we
>> owe it to future driver writers to try to agree on a simple interface
>> that works well, allows drivers to do an efficient implementation,
>> and adds little maintenance burden to TTM.
>>
>> If a future driver writer looks at the Radeon code and replicates it
>> because he thinks it's state of the art, then finds out his code
>> breaks because he can't use SYSTEM memory for GPU page tables or his
>> own private LRU, or that the code breaks with partially populated
>> TTMs, and finally finds it's quite inefficient too because it
>> unnecessarily populates page tables, then we've failed.
>>
>> This is really the point I'd like to make, but as a side note, I'm
>> not asking you to track each bo against each page table. What I'm
>> asking is that you populate the page tables in execbuf / pushbuf
>> rather than in bo_move(). The possibility to track a bo against each
>> page table comes as a nice benefit.
>>
>> /Thomas
>>
> I don't see the issue with updating page tables in move_notify.
> That's what I do for radeon virtual memory; doing it in the command
> buffer ioctl is kind of too costly.

Could you describe why it is too costly? I previously asked Ben the same 
thing and he said it didn't come with a substantial performance penalty. 
Why would it be more than checking a single bool saying whether page 
tables had previously been killed or not? This would really help me 
understand why you want to do it from move_notify.

> Note that nouveau also updates the page tables in move_notify, so I
> think it's the right solution. The TTM interface for move_notify
> should just state that it will be called whenever a buffer changes
> placement. Then we can add a comment on system buffers, which can see
> their pages disappear into thin air at any time. Note that when the
> placement is system we mark all page table entries as invalid and
> point them at a default entry (which is what nouveau does too).
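>
> Roughly (drv_* names made up here, not the actual radeon code):
>
>     /* on a move to SYSTEM placement, don't chase the pages:
>      * invalidate the range and point it at a default entry */
>     if (new_mem->mem_type == TTM_PL_SYSTEM)
>         drv_vm_map_default(vm, dbo->vm_offset, dbo->vm_npages);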
>    

Well, I've previously listed a number of disadvantages with this, which
I think are still valid, but I fail to see any advantages, just
proposed workarounds for the disadvantages. (FWIW, vmwgfx used to have
GPU page tables pointing to system memory, and I'd say that would be a
pretty common use case for drivers that don't have a mappable GART.)

/Thomas


