[Mesa-dev] [PATCH v2 00/30] Finishing arb_gpu_shader_fp64 support to the i965 scalar backend
Iago Toral
itoral at igalia.com
Fri May 13 09:56:28 UTC 2016
On Thu, 2016-05-12 at 13:35 +0200, Samuel Iglesias Gonsálvez wrote:
> Hi,
>
> this version includes all the feedback received on v1 plus a few new
> patches (22-27) that deal with 64-bit URB reads/writes, which were
> missing in v1. Below is a list of patches that still need to get the Rb:
>
> [PATCH v2 02/30] i965/fs: Fix propagation of copies with strided source.
> [PATCH v2 05/30] i965/fs: Simplify and fix register offset calculation
> [PATCH v2 06/30] i965/fs: Reindent register offset calculation of
> [PATCH v2 07/30] i965/fs: fix copy propagation of partially invalidated
> [PATCH v2 11/30] i965/fs: add shuffle_32bit_load_result_to_64bit_data
> [PATCH v2 14/30] i965/fs: fix pull constant load component selection for
> [PATCH v2 18/30] i965/fs: support doubles with SSBO loads
> [PATCH v2 19/30] i965/fs: add shuffle_64bit_data_for_32bit_write helper
> [PATCH v2 20/30] i965/fs: support doubles with ssbo stores
> [PATCH v2 21/30] i965/fs: support doubles with shared variable stores
> [PATCH v2 22/30] i965/vec4: handle doubles in type_size_vec4()
> [PATCH v2 23/30] i965/fs: fix number of output components for doubles
> [PATCH v2 24/30] i965/fs: fix nir_intrinsic_store_output for doubles
> [PATCH v2 25/30] i965/tcs/scalar: fix load input for doubles
> [PATCH v2 26/30] i965/tcs/scalar: fix store output for doubles
> [PATCH v2 27/30] i965/tes/scalar: Fix load input for doubles
I've just sent a v3 for patches 19 and 21. The former gets rid of the
temporary, as Curro suggested, since in this case we really don't want
to do the shuffling in-place. The latter fixes a related bug where we
were doing in-place shuffling before a write, which we shouldn't do.
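To illustrate why shuffling into a separate destination matters, here is a minimal standalone sketch. It is not the Mesa code: the SIMD width, the `Dwords` type and both function names are made up for illustration, and it only assumes that 64-bit data is stored per channel as interleaved low/high dwords while a 32-bit message wants all low dwords followed by all high dwords. Because the result goes to a new array, the 64-bit source stays intact for any later use.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical SIMD width, chosen only for this illustration.
constexpr std::size_t kWidth = 8;

// One 64-bit component for kWidth channels, viewed as 32-bit dwords.
using Dwords = std::array<uint32_t, 2 * kWidth>;

// Interleaved layout (lo0, hi0, lo1, hi1, ...) to planar layout
// (lo0..lo7, hi0..hi7). The source is taken by const reference and the
// result is written to a separate array, so the source is never modified.
Dwords shuffle_for_32bit_write(const Dwords &src)
{
    Dwords dst{};
    for (std::size_t ch = 0; ch < kWidth; ++ch) {
        dst[ch] = src[2 * ch];              // low dword of channel ch
        dst[kWidth + ch] = src[2 * ch + 1]; // high dword of channel ch
    }
    return dst;
}

// Inverse operation: planar layout back to the interleaved layout.
Dwords shuffle_from_32bit_load(const Dwords &src)
{
    Dwords dst{};
    for (std::size_t ch = 0; ch < kWidth; ++ch) {
        dst[2 * ch] = src[ch];
        dst[2 * ch + 1] = src[kWidth + ch];
    }
    return dst;
}
```

If the first function overwrote its argument instead, any later read of the original 64-bit value would see the planar layout, which is the kind of problem an in-place shuffle before a write can cause.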
I think we have addressed all the other comments too, including moving
the shuffling functions to brw_fs_nir.cpp. I also went ahead and made
the do_untyped_vector_read helper static to brw_fs_nir.cpp (instead of a
fs_visitor method) since Curro's reasoning for the shuffling functions
applies to this helper just as much.

All these changes have been merged into our
i965-fp64-scalar-backend-part2-to-push branch for review / testing.

I think that at this point we only need the thumbs-up for those two v3
patches, and to see whether Curro has more feedback, since I believe he
did not have time to go through all the patches yet. If Curro does not
find anything major we should be able to land this tomorrow.
> There is still some ongoing discussion about where to put the
> shuffling functions but it does not make sense to postpone review of v2
> because of that, so for now we kept them in brw_fs.cpp and if we
> finally agree to move them to brw_fs_nir.cpp we will do that before
> pushing.
>
> We have not observed any piglit regressions in ILK, SNB, IVB, HSW, BDW
> or SKL compared against master's ba3f0b6.
>
> This series enables fp64 for gen8+ only and requires scalar GS, TCS and
> TES so these gens can do fp64 in these stages via the scalar backend,
> as the vec4 backend is not ready yet. Support to enable the scalar
> backend by default for all 3 stages has already landed in master so we
> should be all set in this regard.
>
> As usual, a branch with the series is available for testing here:
> $ git clone -b i965-fp64-scalar-backend-part2-to-push https://github.com/Igalia/mesa.git
>
> All the new fp64 tests we wrote have also landed in piglit, except for
> patch [0]. We have a branch available with that test included here:
>
> $ git clone -b arb_gpu_shader_fp64 https://github.com/Igalia/piglit.git
>
> Thanks,
>
> Sam
>
> [0] https://lists.freedesktop.org/archives/piglit/2016-May/019761.html
>
> Francisco Jerez (5):
> i965/fs: Fix propagation of copies with strided source.
> i965/fs: Simplify and fix register offset calculation of
> try_copy_propagate().
> i965/fs: Reindent register offset calculation of try_copy_propagate().
> i965/fs: Stop using the LOAD_PAYLOAD instruction in lower_simd_width.
> i965/fs: Fix and document component().
>
> Iago Toral Quiroga (25):
> i965/fs: fix subreg_offset overflow in byte_offset()
> i965/fs: Fix copy propagation of load payload for double operands
> i965/fs: disallow type change in copy-propagation if types have
> different sizes
> i965/fs: fix copy propagation of partially invalidated entries
> i965/fs: fix copy propagation from load payload
> i965/fs: fix copy/constant propagation regioning checks
> i965/fs: add shuffle_32bit_load_result_to_64bit_data helper
> i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles
> i965/fs: fix pull constant load component selection for doubles
> i965/fs: support doubles with UBO loads
> i965/fs: Add do_untyped_vector_read helper
> i965/fs: support double with shared variable loads
> i965/fs: support doubles with SSBO loads
> i965/fs: add shuffle_64bit_data_for_32bit_write helper
> i965/fs: support doubles with ssbo stores
> i965/fs: support doubles with shared variable stores
> i965/vec4: handle doubles in type_size_vec4()
> i965/fs: fix number of output components for doubles
> i965/fs: fix nir_intrinsic_store_output for doubles
> i965/tcs/scalar: fix load input for doubles
> i965/tcs/scalar: fix store output for doubles
> i965/tes/scalar: Fix load input for doubles
> i965: Enable ARB_gpu_shader_fp64 for gen8+
> docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+
> i965: Expose OpenGL 4.0 for gen8+
>
> docs/GL3.txt | 2 +-
> src/mesa/drivers/dri/i965/brw_fs.cpp | 173 ++++++--
> src/mesa/drivers/dri/i965/brw_fs.h | 16 +
> .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 136 +++---
> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 459 +++++++++++++++++----
> src/mesa/drivers/dri/i965/brw_ir_fs.h | 17 +-
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 9 +-
> src/mesa/drivers/dri/i965/intel_extensions.c | 5 +-
> src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
> 9 files changed, 621 insertions(+), 198 deletions(-)
>