[Mesa-dev] [PATCH v2 00/30] Finishing arb_gpu_shader_fp64 support to the i965 scalar backend

Mon May 16 09:17:39 UTC 2016

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 16/05/16 11:04, Samuel Iglesias Gonsálvez wrote:
> 
> 
> On 13/05/16 11:56, Iago Toral wrote:
>> On Thu, 2016-05-12 at 13:35 +0200, Samuel Iglesias Gonsálvez 
>> wrote:
>>> Hi,
>>> 
>>> this version includes all the feedback received on v1 plus a
>>> few new patches (22-27) that deal with 64-bit URB reads/writes,
>>> which were missing in v1. Below is a list of patches that still
>>> need to get the Rb:
>>> 
>>> [PATCH v2 02/30] i965/fs: Fix propagation of copies with strided source.
>>> [PATCH v2 05/30] i965/fs: Simplify and fix register offset calculation
>>> [PATCH v2 06/30] i965/fs: Reindent register offset calculation of
>>> [PATCH v2 07/30] i965/fs: fix copy propagation of partially invalidated
>>> [PATCH v2 11/30] i965/fs: add shuffle_32bit_load_result_to_64bit_data
>>> [PATCH v2 14/30] i965/fs: fix pull constant load component selection for
>>> [PATCH v2 18/30] i965/fs: support doubles with SSBO loads
>>> [PATCH v2 19/30] i965/fs: add shuffle_64bit_data_for_32bit_write helper
>>> [PATCH v2 20/30] i965/fs: support doubles with ssbo stores
>>> [PATCH v2 21/30] i965/fs: support doubles with shared variable stores
>>> [PATCH v2 22/30] i965/vec4: handle doubles in type_size_vec4()
>>> [PATCH v2 23/30] i965/fs: fix number of output components for doubles
>>> [PATCH v2 24/30] i965/fs: fix nir_intrinsic_store_output for doubles
>>> [PATCH v2 25/30] i965/tcs/scalar: fix load input for doubles
>>> [PATCH v2 26/30] i965/tcs/scalar: fix store output for doubles
>>> [PATCH v2 27/30] i965/tes/scalar: Fix load input for doubles
> 
>> I've just sent a v3 for patches 19 and 21. The former gets rid
>> of the temporary like Curro suggested since in this case we
>> really don't want to do the shuffling in-place. The latter fixes
>> a related bug where we were doing in-place shuffling before a
>> write, which we shouldn't do.
> 
>> I think we have addressed all the other comments too, including 
>> moving the shuffling functions to brw_fs_nir.cpp. I also went
>> ahead and made the do_untyped_vector_read helper static to
>> brw_fs_nir.cpp (instead of a fs_visitor method) since Curro's
>> reasoning for the shuffling functions applies to this helper just
>> as much.
> 
>> All these changes have been merged in our 
>> i965-fp64-scalar-backend-part2-to-push branch for review / 
>> testing.
> 
>> I think that at this point we only need the thumbs-up for those
>> two v3 patches and see if Curro has more feedback since I believe
>> he did not have time to go through all the patches yet. If Curro
>> does not find anything major we should be able to land this
>> tomorrow.
> 
> 
> After addressing Curro's feedback (thanks Curro!), I pushed these 
> patches to master! \o/
> 
> We plan to work on a follow-up patch series to improve some of the
> code but, at least, we have arb_gpu_shader_fp64 support for i965's
> scalar backend in master, ready for the next Mesa release :-)
> 
> I would like to say thank you to Connor, Jason, Curro, Kenneth, 
> Jordan, Topi, Eric, Rob... and many others (sorry if I missed you)
> for your help, reviews, suggestions, advice and patience in
> getting this extension ready on time.
> 

Especially to Connor, who started the work.

Thanks,

Sam, on behalf of the Igalia team

> Thanks again,
> 
> Sam
> 
>>> There is still some ongoing discussion about where to put the
>>> shuffling functions, but it does not make sense to postpone
>>> review of v2 because of that, so for now we kept them in
>>> brw_fs.cpp and if we finally agree to move them to
>>> brw_fs_nir.cpp we will do that before pushing.
>>> 
>>> We have not observed any piglit regressions in ILK, SNB, IVB, 
>>> HSW, BDW or SKL compared against master's ba3f0b6.
>>> 
>>> This series enables fp64 for gen8+ only and requires scalar
>>> GS, TCS and TES so these gens can do fp64 in these stages via
>>> the scalar backend, as the vec4 backend is not ready yet.
>>> Support to enable the scalar backend by default for all 3
>>> stages has already landed in master so we should be all set in
>>> this regard.
>>> 
>>> As usual, a branch with the series is available for testing
>>> here:
>>> 
>>> $ git clone -b i965-fp64-scalar-backend-part2-to-push https://github.com/Igalia/mesa.git
>>> 
>>> All the new fp64 tests we wrote have also landed in piglit, 
>>> except for patch [0]. We have a branch available with that
>>> test included here:
>>> 
>>> $ git clone -b arb_gpu_shader_fp64 https://github.com/Igalia/piglit.git
>>> 
>>> Thanks,
>>> 
>>> Sam
>>> 
>>> [0] 
>>> https://lists.freedesktop.org/archives/piglit/2016-May/019761.html
>>> 
>>> Francisco Jerez (5):
>>>   i965/fs: Fix propagation of copies with strided source.
>>>   i965/fs: Simplify and fix register offset calculation of try_copy_propagate().
>>>   i965/fs: Reindent register offset calculation of try_copy_propagate().
>>>   i965/fs: Stop using the LOAD_PAYLOAD instruction in lower_simd_width.
>>>   i965/fs: Fix and document component().
>>> 
>>> Iago Toral Quiroga (25):
>>>   i965/fs: fix subreg_offset overflow in byte_offset()
>>>   i965/fs: Fix copy propagation of load payload for double operands
>>>   i965/fs: disallow type change in copy-propagation if types have different sizes
>>>   i965/fs: fix copy propagation of partially invalidated entries
>>>   i965/fs: fix copy propagation from load payload
>>>   i965/fs: fix copy/constant propagation regioning checks
>>>   i965/fs: add shuffle_32bit_load_result_to_64bit_data helper
>>>   i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles
>>>   i965/fs: fix pull constant load component selection for doubles
>>>   i965/fs: support doubles with UBO loads
>>>   i965/fs: Add do_untyped_vector_read helper
>>>   i965/fs: support double with shared variable loads
>>>   i965/fs: support doubles with SSBO loads
>>>   i965/fs: add shuffle_64bit_data_for_32bit_write helper
>>>   i965/fs: support doubles with ssbo stores
>>>   i965/fs: support doubles with shared variable stores
>>>   i965/vec4: handle doubles in type_size_vec4()
>>>   i965/fs: fix number of output components for doubles
>>>   i965/fs: fix nir_intrinsic_store_output for doubles
>>>   i965/tcs/scalar: fix load input for doubles
>>>   i965/tcs/scalar: fix store output for doubles
>>>   i965/tes/scalar: Fix load input for doubles
>>>   i965: Enable ARB_gpu_shader_fp64 for gen8+
>>>   docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+
>>>   i965: Expose OpenGL 4.0 for gen8+
>>> 
>>>  docs/GL3.txt                                       |   2 +-
>>>  src/mesa/drivers/dri/i965/brw_fs.cpp               | 173 ++++++--
>>>  src/mesa/drivers/dri/i965/brw_fs.h                 |  16 +
>>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 136 +++---
>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 459 +++++++++++++++++----
>>>  src/mesa/drivers/dri/i965/brw_ir_fs.h              |  17 +-
>>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp     |   9 +-
>>>  src/mesa/drivers/dri/i965/intel_extensions.c       |   5 +-
>>>  src/mesa/drivers/dri/i965/intel_screen.c           |   2 +-
>>>  9 files changed, 621 insertions(+), 198 deletions(-)
>>> 
> 
> 
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJXOZAzAAoJEH/0ujLxfcNDuWQQAJ6MPjpo9eisT47M1lVKDcql
fsNwPhfFHFarOjali43AvrRY7loDDoUcqwFWS0BjHY+LBlXPx+oAup+bfki/mVsb
90AiHuaHhQGQxzM6N7/qwvEc78kp8P2v/tB5qDxkrU+5FFYt3+FR8NVqu27hpH9n
i/MMgn+zjl8GN9/+r1qCeUPi4dnW+oem2pNc61LTTdglHcl6OCcbrzRMESPRqmeO
3pXiGP5lkHty5Mf+g+9hEPzwR3p/8x7qjVSUlMf57GycZdc9+lOkT+gXASBmFEpD
AQ9IeGQEsRAye7rWFw2PTCZJgDzNBNTu0hnNvO2wirYQHD+eXbirsHAfsIGy7M3H
AzfObBGgIpzVgXAfdJUtJDv6Yd6y3oL4eY4jB3ykT1si3l42CNsC63NTkBHZQZLv
yId/LeLrQamG9tUA/Hpbe6u6phJ5MHzHcJSOCCVBH9U6ilRsIijVI67WPc6tHMQN
KX4PdoIGua2xY+HvQlY0SNDSODRWE4zOn6XQsSxvEnfrrDeoUAifPZVnpQYdMNH8
XBOyf3pZDiqF/enPL9+RlwxjhfA5TkiKwfqJMwpM4val55hIs0P7vme7TKKBmLyy
i9mXlE/EMYmqrfrG1yHjzlqrXdvp/WpcGkBHIGX3H/TcTKUBVXRhvTUrwHq71F0d
bwJLmYXANaL0bJU3xWOE
=/y7n
-----END PGP SIGNATURE-----