[Mesa-dev] EXTERNAL: Re: Mixing Pixel Shaders and Compute Shaders
Dorrington, Albert
albert.dorrington at lmco.com
Wed Apr 23 07:38:00 PDT 2014
> -----Original Message-----
> From: Tom Stellard>
> On Wed, Apr 23, 2014 at 01:27:11PM +0000, Dorrington, Albert wrote:
<snip>
> > When I try to call clEnqueueReadImage(), after a
> clEnqueueNDRangeKernel(); the clover/aop/transfer.cpp again generates a
> Pixel Shader, which gets integrated into the command stream after the
> Compute Shader entries (so I send up with:
Command Sequence
1 [Initial Configuration]
2 [PixelShader]
Need PS_PARTIAL_FLUSH here <-- this one I think I have with what I added to evergren_init_atom_start_compute_cs()
3 [ComputeShader]
Need CS_PARTIAL_FLUSH here
4 [PixelShader]
Need PS_PARTIAL_FLUSH here
5 [Final Configuration/Cleanup/Wait]
The other two, I've been experimenting trying to add them in various places, but I haevn't seen a change in behavior yet (probably not putting them in the right place yet...)
>
> You also may need to flush the various caches after the pixel shader and
> compute shaders have completed. See r600_flush_emit in
> r600_hw_context.c
That could be why I'm not seeing any changes yet... I have been looking at r600_flush_emit() this morning and experimenting with its use.
> Does the documentation say that setting the RAT bit means it can only be
> used by compute shaders, or have you discovered this from your testing?
> If this is the case, you may have to use a different CB_COLOR# for the image
> when coying it with a pixel shader.
The documentation I'm referring to is "Radeon Evergreen/Northern Islands Acceleration" Rev 1.0 Dated May 24, 2011
Section 8 CB Programming
Section 8.6 Compute Shader
Compute shaders can perform atomic writes ("device reduction operations") to memory via the CB. The order of
execution of the operations is not guaranteed, only that they are atomic. These writes can include simple operations
(min, max, add, and, or, exchange, compare-exchange) and can optionally return a value (pre-op) back to the shader.
The CF_export adds two new opcodes for RAT exports: EXPORT_RAT and EXPORT_RAT_CACHELESS.
If CB_COLOR<mrt>_INFO.RAT is programmed, the surface is treated as a Random Access Target and can only be
drawn by Compute Shader operations. A set of MRTs can be configured for RATs and normal rendering. The only
stipulation is that all RAT MRTs must be assigned to higher number MRTs than normal rendering MRTs.
I take the statement in the 3rd paragraph to mean that if the RAT bit is set, that a CB setup within a Compute Shader
will not work in a Pixel Shader. However, looking at the command stream executed, the 'r600_draw_vbo()' function
that gets called appears to reconfigure all of the Color buffers, so I don't think this is the issue causing the conflict.
I'm getting more convinced that the issue is that the Compute Shader needs to run completely, before the Pixel Shader runs.
(presumably the need for the CS_PARTIAL_FLUSH and PS_PARTIAL_FLUSH directives)
Frustrating thing is, sometimes DRM can recover from these stalls, other times my box locks up.
> >
>
> Looking forward to your contribution. As always the sooner you can post the
> code the better as that will make it easier to review and may help uncover
> some of your issues.
>
I'd love to contribute some of what I've done back soon, unfortunately with our teams choice to use the AMD SDK OpenCL compiler rather than LLVM, I can't easily migrate my changes back to the Mesa baseline. We have quite a few work-arounds (hacks? haha) to setup the RATs and Vertex Buffers the way the AMD compiler needs them. Once the crazy hours at work slowdown, I'll be able to have some more free time at home where i can contribute. :)
After I reboot the hung box (yet again) I'll be taking another look to verify if the partial flushes are in the command stream or not.
Thanks!
More information about the mesa-dev
mailing list