[Mesa-dev] [PATCH v3] i965/fs: force pull model for 64-bit GS inputs
Kenneth Graunke
kenneth at whitecape.org
Fri Sep 29 04:19:17 UTC 2017
On Thursday, September 28, 2017 1:24:21 AM PDT Iago Toral Quiroga wrote:
> Triggering the push model when 64-bit inputs are involved is not easy due to
> the constrains on the maximum number of registers that we allow for this mode,
> however, for GS with 'points' primitive type and just a couple of double
> varyings we can trigger this and it just doesn't work because the
> implementation is not 64-bit aware at all. For now, let's make sure that we
> don't attempt this model whith 64-bit inputs and we always fall back to pull
> model for them.
>
> Also, don't enable the VUE handles in the thread payload on the fly when we
> find an input for which we need the pull model, this is not safe: if we need
> to resort to the pull model we need to account for that when we setup the
> thread payload so we compute the first non-payload register properly. If we
> didn't do that correctly and we enable it on-the-fly here then we will end up
> VUE handles on the first non-payload register which will probably lead to
> GPU hangs. Instead, always enable the VUE handles for the pull model so we
> can safely use them when needed. The GS is going to resort to pull model
> almost in every situation anyway, so this shouldn't make a significant
> difference and it makes things easier and safer.
>
> v2: Always enable the VUE handles for pull model, this is easier and safer
> and the GS is going to fallback to pull model almost always anyway (Ken)
>
> v3: Only clamp the URB read length if we are over the maximum reserved for
> push inputs as we were doing in the original code (Ken).
> ---
> src/intel/compiler/brw_fs.cpp | 16 +++++++++++-----
> src/intel/compiler/brw_fs_nir.cpp | 4 +++-
> 2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index eb9b4c3890..b13b56cfba 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -5602,6 +5602,17 @@ fs_visitor::setup_gs_payload()
> payload.num_regs++;
> }
>
> + /* Always enable VUE handles so we can safely use pull model if needed.
> + *
> + * The push model for a GS uses a ton of register space even for trivial
> + * scenarios with just a few inputs, so just make things easier and a bit
> + * safer by always having pull model available.
> + */
> + gs_prog_data->base.include_vue_handles = true;
> +
> + /* R3..RN: ICP Handles for each incoming vertex (when using pull model) */
> + payload.num_regs += nir->info.gs.vertices_in;
> +
> /* Use a maximum of 24 registers for push-model inputs. */
> const unsigned max_push_components = 24;
>
> @@ -5613,11 +5624,6 @@ fs_visitor::setup_gs_payload()
> */
> if (8 * vue_prog_data->urb_read_length * nir->info.gs.vertices_in >
> max_push_components || gs_prog_data->invocations > 1) {
One last change...sorry! It doesn't make much sense to clamp to the
maximum number of push components just because invocations > 1. We only
needed to consider that in order to include the VUE handles, which we
now do unconditionally.
Let's change this to:
if (8 * vue_prog_data->urb_read_length * nir->info.gs.vertices_in >
max_push_components) {
With that change,
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
Looking at this more closely has really shown all the ways I made a mess
of this...so, thanks again for cleaning this all up!
> - gs_prog_data->base.include_vue_handles = true;
> -
> - /* R3..RN: ICP Handles for each incoming vertex (when using pull model) */
> - payload.num_regs += nir->info.gs.vertices_in;
> -
> vue_prog_data->urb_read_length =
> ROUND_DOWN_TO(max_push_components / nir->info.gs.vertices_in, 8) / 8;
> }
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index d760946e62..aa57bb9d78 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -1915,7 +1915,9 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
> const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8;
>
> /* TODO: figure out push input layout for invocations == 1 */
> + /* TODO: make this work with 64-bit inputs */
> if (gs_prog_data->invocations == 1 &&
> + type_sz(dst.type) <= 4 &&
> offset_const != NULL && vertex_const != NULL &&
> 4 * (base_offset + offset_const->u32[0]) < push_reg_count) {
> int imm_offset = (base_offset + offset_const->u32[0]) * 4 +
> @@ -1928,7 +1930,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
> }
>
> /* Resort to the pull model. Ensure the VUE handles are provided. */
> - gs_prog_data->base.include_vue_handles = true;
> + assert(gs_prog_data->base.include_vue_handles);
>
> unsigned first_icp_handle = gs_prog_data->include_primitive_id ? 3 : 2;
> fs_reg icp_handle = bld.vgrf(BRW_REGISTER_TYPE_UD, 1);
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170928/6f6d144f/attachment-0001.sig>
More information about the mesa-dev
mailing list