[Mesa-dev] [PATCH 1/2] i965/fs: Allow spilling for SIMD16 compute shaders
Matt Turner
mattst88 at gmail.com
Fri Mar 4 21:39:44 UTC 2016
On Thu, Feb 25, 2016 at 11:50 PM, Jordan Justen
<jordan.l.justen at intel.com> wrote:
> For fragment shaders, we can always use a SIMD8 program. Therefore, if
> we detect spilling with a SIMD16 program, then it is better to skip
> generating a SIMD16 program to only rely on a SIMD8 program.
>
> Unfortunately, this doesn't work for compute shaders. For a compute
> shader, we may be required to use SIMD16 if the local workgroup size
> is bigger than a certain size. For example, on gen7, if the local
> workgroup size is larger than 512, then a SIMD16 program is required.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93840
> Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> Cc: "11.2" <mesa-stable at lists.freedesktop.org>
> ---
> src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
> src/mesa/drivers/dri/i965/brw_fs.h | 1 +
> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++++++++++
> 3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index b506040..3f063a9 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -5228,7 +5228,7 @@ fs_visitor::allocate_registers()
> * SIMD8. There's probably actually some intermediate point where
> * SIMD16 with a couple of spills is still better.
> */
> - if (dispatch_width == 16) {
> + if (dispatch_width == 16 && min_dispatch_width <= 8) {
> fail("Failure to register allocate. Reduce number of "
> "live scalar values to avoid this.");
> } else {
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 7446ca1..43d8a9d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -407,6 +407,7 @@ public:
> bool spilled_any_registers;
>
> const unsigned dispatch_width; /**< 8 or 16 */
> + unsigned min_dispatch_width;
>
> int shader_time_index;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 88b1896..753d97f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1021,6 +1021,17 @@ fs_visitor::init()
> unreachable("unhandled shader stage");
> }
>
> + if (stage == MESA_SHADER_COMPUTE) {
> + const brw_cs_prog_data *cs_prog_data =
> + (const brw_cs_prog_data*) prog_data;
Space before *
> + unsigned size = cs_prog_data->local_size[0] *
> + cs_prog_data->local_size[1] * cs_prog_data->local_size[2];
I'd probably write this as
unsigned size = cs_prog_data->local_size[0] *
cs_prog_data->local_size[1] *
cs_prog_data->local_size[2];
where cs_prog_data is aligned. I don't care too much.
> + size = DIV_ROUND_UP(size, devinfo->max_cs_threads);
> + min_dispatch_width = size > 16 ? 32 : (size > 8 ? 16 : 8);
O_o
> + } else {
> + min_dispatch_width = 8;
> + }
> +
Reviewed-by: Matt Turner <mattst88 at gmail.com>
More information about the mesa-dev
mailing list