[Nouveau] Questions on Maxwell 2nd Gen Compute Kernels/Shaders

Ilia Mirkin imirkin at alum.mit.edu
Mon Jul 15 20:55:37 UTC 2019


On Mon, Jul 15, 2019 at 2:34 PM Fernando Sahmkow <fsahmkow27 at gmail.com> wrote:
>
> So we have been busy implementing the compute engine lately but we have discovered a few issues with Compute Shaders. I hope you guys can answer some questions.
>
> 1st How do I determine the size of a Compute Shader/Kernel's local memory? In pipeline shaders the size is included in the header, but compute kernels don't have a header, so how do I determine how much local memory one uses? If I can't, is there a limit?

From the header :)

https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nve4_compute.h
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/nvc0/nve4_compute.c#n775

You may also find this interesting:

https://nvidia.github.io/open-gpu-doc/classes/compute/

These docs appeared well after we had already RE'd this, and I don't think we
ever went back to check if we'd missed anything substantial.

>
> 2nd I backtrack the addresses used by LDG to the constbuffer that stores them. I then use these addresses to compute the address in my emulated SSBO. For fragment, geometry and vertex shaders I have no problems with these addresses. For compute shaders the addresses seem to be invalid; I imagine there's a base address that's added to them. Where can I obtain that base address?

I don't think so. Can you show me an instruction stream that suggests
this? I suspect you're misreading the code. Should work the same way
as everywhere, except there are only 8 constbufs total, and so
sometimes the actual constbuf data is also retrieved with LDG.

>
> 3rd The SUATOM instruction's CAS operation is similar to CompareAndSwap, except it may add 1 or 2 to the data register on store. How do I know when it adds 1 or 2?

Uhm... huh? CAS = compare and swap. The argument order is different
than the one in the API, as I recall, but there's no funny addition
that I'm aware of.
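
For reference, a minimal sketch of plain compare-and-swap semantics as an
emulator might model them. The function name and parameter names are
hypothetical, and the operands are named rather than positional because the
hardware operand order is not spelled out here. It is not atomic; a real
emulator would need its own synchronization.

```c
#include <stdint.h>

/* Hypothetical helper: plain (non-atomic) model of compare-and-swap.
 * Stores `swap` only if the current value equals `compare`, and always
 * returns the value that was in memory before the operation.
 * No addition of any kind is applied to the stored value. */
static uint32_t cas_u32(uint32_t *addr, uint32_t compare, uint32_t swap)
{
    uint32_t old = *addr;

    if (old == compare)
        *addr = swap;

    return old;
}
```

If an emulated result is off by one or two from this model, the operands are
probably being decoded in the wrong order or from the wrong registers.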

Now, there is an IADD.PO mode (PO = plus one), which corresponds to
both arguments' neg bits being set, but that's the only such weirdness
I'm aware of.
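
To make the .PO case concrete, here is a rough sketch of how an emulator could
evaluate an integer add once the two neg flags have been decoded. The function
name and the boolean parameters are hypothetical, and the bit positions of the
neg flags in the instruction word are deliberately left out since they are not
given here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: evaluate a 32-bit integer add from decoded flags.
 * Assumption taken from the description above: when BOTH neg flags are
 * set, the instruction means "plus one" (a + b + 1) instead of negating
 * both operands. Unsigned arithmetic is used so wraparound is well
 * defined in C. */
static uint32_t iadd_emulate(uint32_t a, uint32_t b, bool neg_a, bool neg_b)
{
    if (neg_a && neg_b)
        return a + b + 1u;      /* .PO: plus one */

    if (neg_a)
        a = 0u - a;             /* two's complement negate */
    if (neg_b)
        b = 0u - b;

    return a + b;
}
```

The point of the special case is that a decoder treating both flags as
ordinary negations would compute -a - b instead of a + b + 1.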

Cheers,

  -ilia

