[Mesa-dev] [PATCH 00/23] Finishing arb_gpu_shader_fp64 support to the i965 scalar backend

Wed May 4 04:16:26 UTC 2016

Jordan Justen <jordan.l.justen at intel.com> writes:

> On 2016-05-03 05:21:49, Samuel Iglesias Gonsálvez wrote:
>> Hello,
>> 
>> This series adds the final bits to support arb_gpu_shader_fp64 in the
>> i965 scalar backend for BDW+ hardware. It sits on top of the previous
>> series we sent last week [0] and which is going through review at the
>> moment. Specifically, this series adds:
>> 
>> 1. Fixes to copy propagation required for the pass to do the right
>>    thing in various cases specific to fp64 code as well as fixes to
>>    other more generic bugs in the pass that we detected.
>> 
>> 2. Support for double ubo/ssbo/shared-variable load/store.
>> 
>> The series also includes a fix to the execmasking issue in the SIMD
>> lowering pass that we discussed in the previous series (patch 10). As
>> we explained in that series, Curro is working on a better solution but
>> we decided to include it here so the problem is made explicit to
>> reviewers and also for testing, since that is necessary to fix some
>> problems. The plan is to go with Curro's patch when it is available and
>> we can test it.
>> 
>
> Is your proposal to upstream this version, and then convert to using
> curro's once it is upstream?
>
> Or, do you plan to try to pull curro's patch into this series before
> pushing this series?
>
My plan was to send my version of the fix as part of a larger and
somewhat intrusive series which would have solved the main drawback of
both workarounds: They introduce a bunch of partial writes prior
optimization.  The series (you can find it here [1]) seems to have a
substantial impact in the compilation throughput and shader-db results
for SIMD32 programs but I'm not done analyzing the performance issues of
SIMD32 compilation so I don't feel confident enough of the approach yet
to propose the whole series for inclusion.

That said, because my version of the fix is very similar to Iago's from
the functional point of view but it's a lot simpler [diffstat balance
-22 instead of +30, love when you can fix stuff by removing code :)], I
suggest we just go with my fix from the start, it should serve as
drop-in replacement for PATCH 10, I'll send a rebased version
independent from the rest of the series as reply to PATCH 10.

[1] https://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-late-simd-lowering

> -Jordan
>
>> The series does not introduce any regressions in piglit on ILK, SNB,
>> HSW, BDW or SKL compared with master's HEAD 5541e11b9.
>> 
>> Piglit's fp64 test totals with this series, as is, look like this:
>> 
>>        pass:                 2548
>>        fail:                   43
>>       crash:                  425
>>        skip:                   16
>>       total:                 3032
>> 
>> The crashes come from the lack of vec4 support required by the GS and
>> TESS stages at the moment. The plan for this is to use scalar GS and
>> TESS in gen8+ so these stages run through the scalar backend as well.
>> This means that we need to make a decision to either use these by
>> default in gen8+ or detect if we are using fp64 to enable them
>> selectively. If we enable scalar GS, TES backends (INTEL_SCALAR_TES=1
>> INTEL_SCALAR_GS=1), we obtain the following results:
>> 
>>        pass:                 2971
>>        fail:                   43
>>       crash:                    2
>>        skip:                   16
>>       total:                 3032
>> 
>> The fails and crashes observed are related to the register spilling
>> issues we mentioned in the previous series (we still need to check if
>> Curro's branch helps with that), so basically the tests fail to
>> compile. The 2 crashes happen when we try to compile a couple of GS
>> tests with the scalar backend and upon failure we attempt a vec4
>> compilation. Since we don't support vec4 fp64 yet, that ends up hitting
>> an assert. We think that we probably want to detect if GS/TESS programs
>> use fp64 and if so avoid falling back to vec4 if the scalar compilation
>> failed rather than attempting a backend that just does not implement
>> the feature.
>> 
>> Notice that scalar TCS support is not in master yet but the patches
>> have already been sent for review [2] and reviewed [3].
>> 
>> A branch with this series is available for testing here:
>> 
>> $ git clone -b i965-fp64-scalar-backend-part-2 https://github.com/Igalia/mesa.git
>> 
>> Thanks,
>> 
>> Sam
>> 
>> [0] https://lists.freedesktop.org/archives/mesa-dev/2016-April/115014.html
>> [1] https://lists.freedesktop.org/archives/mesa-dev/2016-April/115216.html
>> [2] https://lists.freedesktop.org/archives/mesa-dev/2016-April/114191.html
>> [3] https://lists.freedesktop.org/archives/mesa-dev/2016-April/114968.html
>> 
>> 
>> Iago Toral Quiroga (23):
>>   i965/fs: fix subreg_offset overflow in byte_offset()
>>   i965/fs: fix copy propagation from sources with stride 0
>>   i965/fs: Fix copy propagation of load payload for double operands
>>   i965/fs: fix requirements to allow type change in copy-propagation
>>   i965/fs: fix copy-propagation with suboffset from constants
>>   i965/fs: fix copy/constant propagation regioning checks
>>   i965/fs: fix copy propagation of partially invalidated entries
>>   i965/fs: fix copy propagation from load payload
>>   i965/fs: don't copy propagate from a larger type if the stride is not
>>     1
>>   i965/fs/lower_simd_width: fix result transposition
>>   i965/fs: Fix fs_visitor::VARYING_PULL_CONSTANT_LOAD for doubles
>>   i965/fs: fix pull constant load component selection for doubles
>>   i965/fs: support doubles with UBO loads
>>   i965/fs: support doubles with SSBO loads
>>   i965/fs: add SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA helper
>>   i965/fs: Add do_untyped_vector_read helper
>>   i965/fs: support doubles with shared variable loads
>>   i965/fs: add SHUFFLE_32BIT_DATA_FOR_64BIT_WRITE helper
>>   i965/fs: support doubles with ssbo stores
>>   i965/fs: support doubles with shared variable stores
>>   i965: Enable ARB_gpu_shader_fp64 for gen8+
>>   docs: Mark ARB_gpu_shader_fp64 as done for i965/gen8+
>>   i965: Expose OpenGL 4.0 for gen8+
>> 
>>  docs/GL3.txt                                       |   2 +-
>>  src/mesa/drivers/dri/i965/brw_fs.cpp               | 156 ++++++++++++++-
>>  src/mesa/drivers/dri/i965/brw_fs.h                 |  16 ++
>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 113 ++++++++---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp           | 217 +++++++++++++++++----
>>  src/mesa/drivers/dri/i965/brw_ir_fs.h              |   4 +
>>  src/mesa/drivers/dri/i965/intel_extensions.c       |   5 +-
>>  src/mesa/drivers/dri/i965/intel_screen.c           |   2 +-
>>  8 files changed, 445 insertions(+), 70 deletions(-)
>> 
>> -- 
>> 2.5.0
>> 
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160503/833db4cf/attachment.sig>