[Mesa-dev] [PATCH 3/3] radeon/winsys: add async dma infrastructure

Mon Jan 7 06:45:02 PST 2013

On 07.01.2013 01:24, Marek Olšák wrote:
> On Sun, Jan 6, 2013 at 11:58 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>> On Sun, Jan 6, 2013 at 4:00 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>> I agree with Christian. You can use a separate instance of
>>> radeon_winsys_cs for the DMA CS. The winsys exposes all the functions
>>> you need (except one) for you to coordinate work between 2 command
>>> streams in the pipe driver. You may only need to expose one additional
>>> winsys function to the driver for synchronization, it's called
>>> "radeon_drm_cs_sync_flush". I'm confident that this can be implemented
>>> and layered on top of the winsys, presumably with fewer lines of code
>>> and cleaner.
>> The relocation add function needs to access both the dma ring and the
>> cs ring no matter on which ring the relocation is added. Doing the
>> sync in the pipe driver would increase the code, each call site of
>> add_reloc would need to check if the bo is referenced by the other
>> ring and flush the other ring if so. Which also means that there is a
>> higher likelihood that someone adding an add_reloc call forgets about
>> the flushing.
> Well, in that case, you can define a new set of functions in the pipe
> driver, which are layered on top of radeon_winsys_cs and the existing
> interface radeon_winsys::cs_*.
>
> If you want to be super clean, you can add a new module that defines
> this command stream pair:
>
> struct r600_cs_with_dma {
>     struct radeon_winsys_cs *cs_main, *cs_dma;
> };
>
> And define a set of functions which work with that, reimplementing all
> the cs_* functions by calling the existing functions of radeon_winsys.
> The pipe driver would then use the new CS functions everywhere instead
> of radeon_winsys.
>
> To me, the best design decision here is not to try to *hack* the
> existing winsys code to make it do what you want without giving it
> another thought. Adding another layer is preferable, because it keeps
> both parts simple and separated.

Well, the more I think about it, the less I think add_reloc is the
right place to do the sync anyway.

Imagine a loop that handles a bunch of buffers: first they are zero
cleared and then rendered to. Those buffers are unique, so we can zero
clear them all at once. Ideally, all the clears should end up in the
same DMA command stream.

Now comes a buffer that is first rendered to and then copied around (for
example). At this moment the DMA command stream needs to be flushed,
because a new DMA command stream starts that actually needs to run
after the rendering command stream.

So instead of flushing when we see that a buffer gets added to a command
stream, we need to remember in which order the command streams need to
get submitted and only flush when this order is going to change.
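
A rough sketch of that idea, as a toy model in plain C. None of these
names exist in the winsys: cs_pair, cs_pair_use and cs_pair_flush are
made up for illustration, and the "submit" is only a comment. It
assumes exactly two rings and tracks only the last unsubmitted user of
each buffer.

```c
#include <assert.h>

/* Hypothetical toy model, not the Mesa winsys API. */
enum { RING_GFX = 0, RING_DMA = 1, RING_COUNT = 2 };
enum { MAX_BUFS = 16, NO_RING = -1 };

struct cs_pair {
    int pending[RING_COUNT];  /* commands queued per ring, not yet submitted */
    int last_ring[MAX_BUFS];  /* ring of the last unsubmitted use, or NO_RING */
    int first;                /* ring that has to be submitted first, or NO_RING */
    int flushes;              /* number of flushes so far */
};

/* Forget everything that was queued; used at init and after a flush. */
static void cs_pair_reset_pending(struct cs_pair *p)
{
    p->pending[RING_GFX] = 0;
    p->pending[RING_DMA] = 0;
    p->first = NO_RING;
    for (int i = 0; i < MAX_BUFS; i++)
        p->last_ring[i] = NO_RING;
}

static void cs_pair_init(struct cs_pair *p)
{
    cs_pair_reset_pending(p);
    p->flushes = 0;
}

static void cs_pair_flush(struct cs_pair *p)
{
    /* Real code would submit ring p->first, then the other ring, here. */
    cs_pair_reset_pending(p);
    p->flushes++;
}

/* Record that a command on `ring` references buffer `buf`. */
static void cs_pair_use(struct cs_pair *p, int ring, int buf)
{
    int other = 1 - ring;

    assert(ring == RING_GFX || ring == RING_DMA);
    assert(buf >= 0 && buf < MAX_BUFS);

    if (p->last_ring[buf] == other) {
        /* The other ring has to run before this command. */
        if (p->first == ring)
            cs_pair_flush(p);  /* submission order would flip: flush now */
        else
            p->first = other;  /* order unchanged or newly fixed: no flush */
    }
    p->last_ring[buf] = ring;
    p->pending[ring]++;
}

/* The scenario from above: clear and render four buffers, then render
 * a fifth buffer and copy it with DMA. Returns the number of flushes. */
int run_example(void)
{
    struct cs_pair p;

    cs_pair_init(&p);
    for (int b = 0; b < 4; b++) {
        cs_pair_use(&p, RING_DMA, b);  /* zero clear */
        cs_pair_use(&p, RING_GFX, b);  /* render */
    }
    cs_pair_use(&p, RING_GFX, 4);      /* render */
    cs_pair_use(&p, RING_DMA, 4);      /* copy: needs GFX before DMA */
    return p.flushes;
}
```

In this model the four clears share one DMA stream and the only flush
happens at the copy, where the order would flip from "DMA before GFX"
to "GFX before DMA". Flushing on every cross-ring reference would flush
once per buffer instead.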

Christian.

>
> Marek