[Mesa-dev] [PATCH 1/4] gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa

Roland Scheidegger sroland at vmware.com
Thu Oct 16 23:19:38 PDT 2014


On 10/16/2014 12:55 PM, Marek Olšák wrote:
> On Thu, Oct 16, 2014 at 8:40 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 10/16/2014 08:33 AM, Marek Olšák wrote:
>>>
>>> From: Marek Olšák <marek.olsak at amd.com>
>>>
>>> With 5 shader stages and various combinations of enabled and disabled
>>> shaders, the maximum number of outputs in one shader doesn't have to be
>>> equal to the maximum number of inputs in the following shader.
>>> ---
>>>    src/gallium/auxiliary/gallivm/lp_bld_limits.h    | 2 ++
>>>    src/gallium/auxiliary/tgsi/tgsi_exec.h           | 2 ++
>>>    src/gallium/docs/source/screen.rst               | 2 ++
>>>    src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>>>    src/gallium/drivers/i915/i915_screen.c           | 2 ++
>>>    src/gallium/drivers/ilo/ilo_screen.c             | 1 +
>>>    src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 3 +++
>>>    src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 2 ++
>>>    src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 2 ++
>>>    src/gallium/drivers/r300/r300_screen.c           | 4 ++++
>>>    src/gallium/drivers/r600/r600_pipe.c             | 2 ++
>>>    src/gallium/drivers/radeonsi/si_pipe.c           | 2 ++
>>>    src/gallium/drivers/svga/svga_screen.c           | 4 ++++
>>>    src/gallium/drivers/vc4/vc4_screen.c             | 2 ++
>>>    src/gallium/include/pipe/p_defines.h             | 1 +
>>>    src/mesa/state_tracker/st_extensions.c           | 8 ++++----
>>>    16 files changed, 36 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_limits.h
>>> b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
>>> index a96ab29..02a645a 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_limits.h
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
>>> @@ -97,6 +97,8 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
>>>          return LP_MAX_TGSI_NESTING;
>>>       case PIPE_SHADER_CAP_MAX_INPUTS:
>>>          return PIPE_MAX_SHADER_INPUTS;
>>> +   case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +      return PIPE_MAX_SHADER_OUTPUTS;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>          return sizeof(float[4]) * 4096;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.h
>>> b/src/gallium/auxiliary/tgsi/tgsi_exec.h
>>> index 4720ec6..81a69e7 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.h
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.h
>>> @@ -426,6 +426,8 @@ tgsi_exec_get_shader_param(enum pipe_shader_cap param)
>>>          return TGSI_EXEC_MAX_NESTING;
>>>       case PIPE_SHADER_CAP_MAX_INPUTS:
>>>          return TGSI_EXEC_MAX_INPUT_ATTRIBS;
>>> +   case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +      return PIPE_MAX_SHADER_OUTPUTS;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>          return TGSI_EXEC_MAX_CONST_BUFFER_SIZE;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/docs/source/screen.rst
>>> b/src/gallium/docs/source/screen.rst
>>> index f4e9204..a43f5dc 100644
>>> --- a/src/gallium/docs/source/screen.rst
>>> +++ b/src/gallium/docs/source/screen.rst
>>> @@ -267,6 +267,8 @@ support different features.
>>>    * ``PIPE_SHADER_CAP_MAX_TEX_INDIRECTIONS``: The maximum number of
>>> texture indirections.
>>>    * ``PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH``: The maximum nested control
>>> flow depth.
>>>    * ``PIPE_SHADER_CAP_MAX_INPUTS``: The maximum number of input registers.
>>> +* ``PIPE_SHADER_CAP_MAX_OUTPUTS``: The maximum number of output
>>> registers.
>>> +  This is valid for all shaders except the fragment shader.
>>
>> This doesn't quite seem to match the code, as you've got code which
>> explicitly returns different numbers for fragment shaders, though only for
>> some drivers?
>
> While it's not really useful for fragment shaders, I always wanted to
> return something sensible. Not that it matters.
>
>>
>>
>>>    * ``PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE``: The maximum size per
>>> constant buffer in bytes.
>>>    * ``PIPE_SHADER_CAP_MAX_CONST_BUFFERS``: Maximum number of constant
>>> buffers that can be bound
>>>      to any shader stage using ``set_constant_buffer``. If 0 or 1, the pipe
>>> will
>>> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c
>>> b/src/gallium/drivers/freedreno/freedreno_screen.c
>>> index 24f360b..40c6c19 100644
>>> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
>>> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
>>> @@ -350,6 +350,7 @@ fd_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>          case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
>>>                  return 8; /* XXX */
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>> +        case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>>                  return 16;
>>>          case PIPE_SHADER_CAP_MAX_TEMPS:
>>>                  return 64; /* Max native temporaries. */
>>> diff --git a/src/gallium/drivers/i915/i915_screen.c
>>> b/src/gallium/drivers/i915/i915_screen.c
>>> index 9006734..2a6e751 100644
>>> --- a/src/gallium/drivers/i915/i915_screen.c
>>> +++ b/src/gallium/drivers/i915/i915_screen.c
>>> @@ -130,6 +130,8 @@ i915_get_shader_param(struct pipe_screen *screen,
>>> unsigned shader, enum pipe_sha
>>>             return 0;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>             return 10;
>>> +      case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +         return 1;
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>             return 32 * sizeof(float[4]);
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/ilo/ilo_screen.c
>>> b/src/gallium/drivers/ilo/ilo_screen.c
>>> index bfd67da..48c3dea 100644
>>> --- a/src/gallium/drivers/ilo/ilo_screen.c
>>> +++ b/src/gallium/drivers/ilo/ilo_screen.c
>>> @@ -121,6 +121,7 @@ ilo_get_shader_param(struct pipe_screen *screen,
>>> unsigned shader,
>>>       case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
>>>          return UINT_MAX;
>>>       case PIPE_SHADER_CAP_MAX_INPUTS:
>>> +   case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>>          /* this is limited by how many attributes SF can remap */
>>>          return 16;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>> index a1373fd..700b9bb 100644
>>> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
>>> @@ -222,6 +222,7 @@ nv30_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>          case PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH:
>>>             return 0;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>> +      case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>>             return 16;
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>             return ((eng3d->oclass >= NV40_3D_CLASS) ? (468 - 6): (256 -
>>> 6)) * sizeof(float[4]);
>>> @@ -258,6 +259,8 @@ nv30_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>             return 0;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>             return 8; /* should be possible to do 10 with nv4x */
>>> +      case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +         return 4;
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>             return ((eng3d->oclass >= NV40_3D_CLASS) ? 224 : 32) *
>>> sizeof(float[4]);
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>> b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>> index 3a46e72..d26a438 100644
>>> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>> @@ -253,6 +253,8 @@ nv50_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>          if (shader == PIPE_SHADER_VERTEX)
>>>             return 32;
>>>          return 15;
>>> +   case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +      return 16;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>          return 65536;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> index 3858981..a673eb9 100644
>>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> @@ -259,6 +259,8 @@ nvc0_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>           * and excludes 0x60 per-patch inputs.
>>>           */
>>>          return 0x200 / 16;
>>> +   case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +      return 32;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>          return 65536;
>>>       case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/r300/r300_screen.c
>>> b/src/gallium/drivers/r300/r300_screen.c
>>> index c35559f..db9ad15 100644
>>> --- a/src/gallium/drivers/r300/r300_screen.c
>>> +++ b/src/gallium/drivers/r300/r300_screen.c
>>> @@ -258,6 +258,8 @@ static int r300_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader, e
>>>                 * additional texcoords but there is no two-sided color
>>>                 * selection then. However the facing bit can be used
>>> instead. */
>>>                return 10;
>>> +        case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +            return 4;
>>>            case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>                return (is_r500 ? 256 : 32) * sizeof(float[4]);
>>>            case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> @@ -306,6 +308,8 @@ static int r300_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader, e
>>>                return is_r500 ? 4 : 0; /* For loops; not sure about
>>> conditionals. */
>>>            case PIPE_SHADER_CAP_MAX_INPUTS:
>>>                return 16;
>>> +        case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +            return 10;
>>>            case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>                return 256 * sizeof(float[4]);
>>>            case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/r600/r600_pipe.c
>>> b/src/gallium/drivers/r600/r600_pipe.c
>>> index 3962fee..c794530 100644
>>> --- a/src/gallium/drivers/r600/r600_pipe.c
>>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>>> @@ -434,6 +434,8 @@ static int r600_get_shader_param(struct pipe_screen*
>>> pscreen, unsigned shader, e
>>>                  return 32;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>                  return shader == PIPE_SHADER_VERTEX ? 16 : 32;
>>> +       case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +               return shader == PIPE_SHADER_FRAGMENT ? 8 : 32;
>>>          case PIPE_SHADER_CAP_MAX_TEMPS:
>>>                  return 256; /* Max native temporaries. */
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>> diff --git a/src/gallium/drivers/radeonsi/si_pipe.c
>>> b/src/gallium/drivers/radeonsi/si_pipe.c
>>> index cba6d98..8397115 100644
>>> --- a/src/gallium/drivers/radeonsi/si_pipe.c
>>> +++ b/src/gallium/drivers/radeonsi/si_pipe.c
>>> @@ -364,6 +364,8 @@ static int si_get_shader_param(struct pipe_screen*
>>> pscreen, unsigned shader, enu
>>>                  return 32;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>                  return shader == PIPE_SHADER_VERTEX ?
>>> SI_NUM_VERTEX_BUFFERS : 32;
>>> +       case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +               return shader == PIPE_SHADER_FRAGMENT ? 8 : 32;
>>>          case PIPE_SHADER_CAP_MAX_TEMPS:
>>>                  return 256; /* Max native temporaries. */
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>> diff --git a/src/gallium/drivers/svga/svga_screen.c
>>> b/src/gallium/drivers/svga/svga_screen.c
>>> index 004b4b4..587eaad 100644
>>> --- a/src/gallium/drivers/svga/svga_screen.c
>>> +++ b/src/gallium/drivers/svga/svga_screen.c
>>> @@ -330,6 +330,8 @@ static int svga_get_shader_param(struct pipe_screen
>>> *screen, unsigned shader, en
>>>             return SVGA3D_MAX_NESTING_LEVEL;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>             return 10;
>>> +      case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +         return svgascreen->max_color_buffers;
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>             return 224 * sizeof(float[4]);
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> @@ -387,6 +389,8 @@ static int svga_get_shader_param(struct pipe_screen
>>> *screen, unsigned shader, en
>>>             return SVGA3D_MAX_NESTING_LEVEL;
>>>          case PIPE_SHADER_CAP_MAX_INPUTS:
>>>             return 16;
>>> +      case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +         return 10;
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>>             return 256 * sizeof(float[4]);
>>>          case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
>>> diff --git a/src/gallium/drivers/vc4/vc4_screen.c
>>> b/src/gallium/drivers/vc4/vc4_screen.c
>>> index a327c7f..4c0455d 100644
>>> --- a/src/gallium/drivers/vc4/vc4_screen.c
>>> +++ b/src/gallium/drivers/vc4/vc4_screen.c
>>> @@ -283,6 +283,8 @@ vc4_screen_get_shader_param(struct pipe_screen
>>> *pscreen, unsigned shader,
>>>                            return 8;
>>>                    else
>>>                            return 16;
>>> +       case PIPE_SHADER_CAP_MAX_OUTPUTS:
>>> +               return shader == PIPE_SHADER_FRAGMENT ? 1 : 8;
>>>            case PIPE_SHADER_CAP_MAX_TEMPS:
>>>                    return 256; /* GL_MAX_PROGRAM_TEMPORARIES_ARB */
>>>            case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
>>> diff --git a/src/gallium/include/pipe/p_defines.h
>>> b/src/gallium/include/pipe/p_defines.h
>>> index 93156b9..7bec55c 100644
>>> --- a/src/gallium/include/pipe/p_defines.h
>>> +++ b/src/gallium/include/pipe/p_defines.h
>>> @@ -613,6 +613,7 @@ enum pipe_shader_cap
>>>       PIPE_SHADER_CAP_MAX_TEX_INDIRECTIONS,
>>>       PIPE_SHADER_CAP_MAX_CONTROL_FLOW_DEPTH,
>>>       PIPE_SHADER_CAP_MAX_INPUTS,
>>> +   PIPE_SHADER_CAP_MAX_OUTPUTS,
>>>       PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE,
>>>       PIPE_SHADER_CAP_MAX_CONST_BUFFERS,
>>>       PIPE_SHADER_CAP_MAX_TEMPS,
>>> diff --git a/src/mesa/state_tracker/st_extensions.c
>>> b/src/mesa/state_tracker/st_extensions.c
>>> index 5dd8278..78bfe30 100644
>>> --- a/src/mesa/state_tracker/st_extensions.c
>>> +++ b/src/mesa/state_tracker/st_extensions.c
>>> @@ -192,6 +192,10 @@ void st_init_limits(struct pipe_screen *screen,
>>>          pc->MaxParameters      = pc->MaxNativeParameters      =
>>>             screen->get_shader_param(screen, sh,
>>>                       PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE) /
>>> sizeof(float[4]);
>>> +      pc->MaxInputComponents =
>>> +         screen->get_shader_param(screen, sh, PIPE_SHADER_CAP_MAX_INPUTS)
>>> * 4;
>>> +      pc->MaxOutputComponents =
>>> +         screen->get_shader_param(screen, sh,
>>> PIPE_SHADER_CAP_MAX_OUTPUTS) * 4;
>>>
>>>          pc->MaxUniformComponents = 4 * MIN2(pc->MaxNativeParameters,
>>> MAX_UNIFORMS);
>>>
>>> @@ -261,10 +265,6 @@ void st_init_limits(struct pipe_screen *screen,
>>>       c->MaxVarying = screen->get_shader_param(screen,
>>> PIPE_SHADER_FRAGMENT,
>>>                                                PIPE_SHADER_CAP_MAX_INPUTS);
>>>       c->MaxVarying = MIN2(c->MaxVarying, MAX_VARYING);
>>> -   c->Program[MESA_SHADER_FRAGMENT].MaxInputComponents = c->MaxVarying *
>>> 4;
>>> -   c->Program[MESA_SHADER_VERTEX].MaxOutputComponents = c->MaxVarying *
>>> 4;
>>> -   c->Program[MESA_SHADER_GEOMETRY].MaxInputComponents = c->MaxVarying *
>>> 4;
>>> -   c->Program[MESA_SHADER_GEOMETRY].MaxOutputComponents = c->MaxVarying *
>>> 4;
>>>       c->MaxGeometryOutputVertices = screen->get_param(screen,
>>> PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES);
>>>       c->MaxGeometryTotalOutputComponents = screen->get_param(screen,
>>> PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS);
>>>
>>>
>>
>> Otherwise looks good to me (I guess the max number includes everything, so
>> things like layer output from the gs which in some apis is considered a
>> system value?).
>
> I'd like not to include system values in this limit, only GENERIC,
> TEXCOORD, COLOR and FOG semantics. If you have any suggestion what
> some drivers should return for this limit instead, please let me know.

I think that's fine. llvmpipe/softpipe don't really distinguish system
values from other outputs (in most places); for outputs, both are handled
in much the same way, so those drivers would probably want to include them
in the limit (at least right now). But since, as you said, the cap is
mostly useful for returning something reasonable for GL queries, I guess
it doesn't matter all that much (I don't know right now whether system
values would actually count against any GL limits).
I suspect, however, that you'd hit some asserts if you tried to use
PIPE_MAX_SHADER_OUTPUTS plus some system values with llvmpipe/softpipe.
I'm pretty sure PIPE_MAX_SHADER_OUTPUTS is 48 and not 32, so it could fit
32 generics without having to worry about any additional special outputs.
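
For reference, the per-stage arithmetic in the st_init_limits hunk can be
sketched roughly as below. This is only an illustration: the sketch_*
names are hypothetical stand-ins rather than the real gallium headers, and
the mock query simply reuses the r600 values quoted in the patch.

```c
/* Hypothetical stand-ins for the gallium enums; not the real headers. */
enum sketch_stage {
   SKETCH_VERTEX,
   SKETCH_GEOMETRY,
   SKETCH_FRAGMENT
};

enum sketch_cap {
   SKETCH_CAP_MAX_INPUTS,
   SKETCH_CAP_MAX_OUTPUTS
};

/* Mock driver query, using the r600 numbers quoted in the patch. */
static int
sketch_get_shader_param(enum sketch_stage stage, enum sketch_cap cap)
{
   switch (cap) {
   case SKETCH_CAP_MAX_INPUTS:
      return stage == SKETCH_VERTEX ? 16 : 32;
   case SKETCH_CAP_MAX_OUTPUTS:
      return stage == SKETCH_FRAGMENT ? 8 : 32;
   }
   return 0;
}

struct sketch_limits {
   int max_input_components;
   int max_output_components;
};

/* Each register is a vec4, so component limits are the register
 * counts multiplied by 4, queried separately for every stage. */
static struct sketch_limits
sketch_init_limits(enum sketch_stage stage)
{
   struct sketch_limits limits;

   limits.max_input_components =
      sketch_get_shader_param(stage, SKETCH_CAP_MAX_INPUTS) * 4;
   limits.max_output_components =
      sketch_get_shader_param(stage, SKETCH_CAP_MAX_OUTPUTS) * 4;
   return limits;
}
```

With these mock values a vertex shader would report 64 input and 128
output components, independent of what the fragment shader accepts as
inputs.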

Roland


>
> What is and isn't a system value depends on each driver. For r600 and
> later, these are considered system values for a shader that precedes a
> fragment shader:
> - POSITION
> - PSIZE
> - EDGEFLAG
> - LAYER
> - VIEWPORT_INDEX
> - CLIPDIST
> - CLIPVERTEX (lowered to CLIPDIST, so doesn't take any space)
> - CULLDIST
>
> For a shader that is before a shader that precedes a fragment shader,
> the semantics above have no meaning and outputs are just passed to the
> next shader like GENERIC, so it's kinda confusing. However, r600 and
> later chips use on-chip or off-chip memory for passing outputs between
> VS-GS, VS-TCS, TCS-TES, VS-TES, and TES-GS, so we can always allocate
> more memory when we need more outputs. I think the internal driver
> limit is close to 64 at the moment.
>
> The main purpose of this cap is to return reasonable values for
> certain glGet queries.
>
> Marek
>
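
The counting rule proposed in the quoted text (only GENERIC, TEXCOORD,
COLOR and FOG count against the limit, system values do not) could be
sketched as follows. The enum and function names are hypothetical and are
not the TGSI semantic definitions; the set of system values is the
r600-style list quoted above and, as noted there, differs per driver.

```c
#include <stddef.h>

/* Hypothetical semantic names for illustration only. */
enum sketch_semantic {
   SKETCH_SEM_POSITION,
   SKETCH_SEM_PSIZE,
   SKETCH_SEM_LAYER,
   SKETCH_SEM_CLIPDIST,
   SKETCH_SEM_GENERIC,
   SKETCH_SEM_TEXCOORD,
   SKETCH_SEM_COLOR,
   SKETCH_SEM_FOG
};

/* Returns 1 if an output with this semantic counts against the
 * MAX_OUTPUTS limit under the proposed rule, 0 otherwise. */
static int
sketch_counts_against_limit(enum sketch_semantic sem)
{
   switch (sem) {
   case SKETCH_SEM_GENERIC:
   case SKETCH_SEM_TEXCOORD:
   case SKETCH_SEM_COLOR:
   case SKETCH_SEM_FOG:
      return 1;
   default:
      return 0; /* treated as a system value in this sketch */
   }
}

/* Counts the outputs of one shader that consume the limit. */
static size_t
sketch_count_limited_outputs(const enum sketch_semantic *outputs, size_t n)
{
   size_t count = 0;

   for (size_t i = 0; i < n; i++)
      count += (size_t)sketch_counts_against_limit(outputs[i]);
   return count;
}

/* Example output list: three of the six entries count against the limit. */
static const enum sketch_semantic sketch_example_outputs[] = {
   SKETCH_SEM_POSITION, SKETCH_SEM_GENERIC, SKETCH_SEM_GENERIC,
   SKETCH_SEM_COLOR, SKETCH_SEM_PSIZE, SKETCH_SEM_LAYER
};
```

A driver like llvmpipe/softpipe, which treats both kinds of outputs alike,
would instead count every entry, which is why the headroom between 32 and
PIPE_MAX_SHADER_OUTPUTS matters there.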


