Mesa (master): 35 new commits

Thu Apr 18 14:15:58 UTC 2019

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c2b8fb9a810003791bfa65b3173ccc28bfe14484
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Jun 22 11:41:28 2018 +0200

    anv/device: expose VK_KHR_shader_float16_int8 in gen8+
    
    v2 (Jason):
     - Merge shaderFloat16 and shaderInt8 enablement into a single patch.
     - Merge extension enable.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=5a5d44b71307085f56cb8511fe154ebfdfd42831
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Jan 22 11:26:03 2019 +0100

    anv/pipeline: support Float16 and Int8 SPIR-V capabilities in gen8+
    
    v2:
      - Merge Float16 and Int8 capabilities into a single patch (Jason)
      - Merged patch that enabled SPIR-V front-end checks for these caps
        (except for Int8, which was already merged)
    
    v3:
     - Keep capabilities sorted (Jason)
    
    v4:
    - SpvCapabilityFloat16 support already added in master (Juan)
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6ee07a664951c11a27d0cfb38e6af218a7ef0a4
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Jan 22 11:27:09 2019 +0100

    compiler/spirv: move the check for Int8 capability
    
    So it is right after the checks for the other various Int* capabilities.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8ed6d74c922e967732bb6f8b4e39bdcedd46e544
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Wed Feb 6 09:13:22 2019 +0100

    intel/compiler: validate region restrictions for mixed float mode
    
    v2:
     - Adapted unit tests to make them consistent with the changes done
       to the validation of half-float conversions.
    
    v3 (Curro):
    - Check all the accumulators
    - Constify declarations
    - Do not check src1 type in single-source instructions.
    - Check for all instructions that read accumulator (either implicitly or
      explicitly)
    - Check restrictions in src1 too.
    - Merge conditional block
    - Add invalid test case.
    
    v4 (Curro):
    - Assert on 3-src instructions, as they are not validated.
    - Get rid of types_are_mixed_float(), as we know instruction is mixed
      float at that point.
    - Remove conditions from not verified case.
    - Fix brackets on conditional.
    
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=58d6417e591db3f440c4a1c06c9cfdfae2a06dfb
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Feb 8 09:20:56 2019 +0100

    intel/compiler: validate conversions between 64-bit and 8-bit types
    
    v2:
     - Add some tests with UB type too (Jason)
    
    v3:
     - consider implicit conversions from 2src instructions too (Curro).
    
    v4:
     - Do not check src1 type in single-source instructions (Curro).
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v2)

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7376d57a9c6ae69bc47bbbfe5d3b1a0ed0639227
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Feb 1 11:41:33 2019 +0100

    intel/compiler: validate region restrictions for half-float conversions
    
    v2:
     - Consider implicit conversions in 2-src instructions too (Curro)
     - For restrictions that involve destination stride requirements
       only validate them for Align1, since Align16 always requires
       packed data.
     - Skip general rule for the dst/execution type size ratio for
       mixed float instructions on CHV and SKL+, these have their own
       set of rules that will be validated separately.
    
    v3 (Curro):
     - Do not check src1 type in single-source instructions.
     - Check restriction on src1.
     - Remove invalid test.
    
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6ff52f0628a1d3401a3a18eb576158e4de66d044
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Feb 5 13:50:09 2019 +0100

    intel/compiler: also set F execution type for mixed float mode in BDW
    
    The section 'Execution Data Types' of 3D Media GPGPU volume, which
    describes execution types, is exactly the same in BDW and SKL+.
    
    Also, this section states that there is a single execution type, so it
    makes sense that this is the wider of the two floating point types
    involved in mixed float mode, which is what we do for SKL+ and CHV.
    
    v2:
     - Make sure we also account for the destination type in mixed mode (Curro).
    
    Acked-by: Francisco Jerez <currojerez at riseup.net>
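
A minimal sketch of the rule described in this commit: the single execution
type is the widest floating-point type among the sources and the destination.
The enum and both function names are hypothetical stand-ins, not the
driver's own helpers.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the driver's register types. */
enum reg_type { TYPE_HF, TYPE_F };

/* Size in bytes of each float type. */
static unsigned type_size(enum reg_type t)
{
   assert(t == TYPE_HF || t == TYPE_F);
   return t == TYPE_HF ? 2 : 4;
}

/* Pick the single execution type of a two-source float instruction:
 * the widest type among both sources and the destination. */
static enum reg_type
exec_type_mixed_float(enum reg_type src0, enum reg_type src1,
                      enum reg_type dst)
{
   enum reg_type widest = src0;
   if (type_size(src1) > type_size(widest))
      widest = src1;
   if (type_size(dst) > type_size(widest))
      widest = dst;
   return widest;
}
```

With this sketch, two HF sources writing an F destination execute as F, which
is the destination case the v2 note refers to.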

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=100debc3c9ec6b9b61ed2891faed66d94136e547
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Thu Mar 14 10:35:58 2019 +0100

    intel/compiler: implement SIMD16 restrictions for mixed-float instructions
    
    v2: f32to16/f16to32 can use a :W destination (Curro)
    v3: check destination is packed (Curro).
    
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6d87c651c90c24d5b670e1d9ba91754846e3b34d
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Feb 12 09:34:10 2019 +0100

    intel/compiler: skip MAD algebraic optimization for half-float or mixed mode
    
    It is very likely that this optimization is never useful and we'll probably
    just end up removing it, so let's not bother adding more cases to it for
    now.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=64b93292ac19f9a74005108575e25fe7e47eee82
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Feb 12 12:43:30 2019 +0100

    intel/compiler: remove inexact algebraic optimizations from the backend
    
    NIR already has these and correctly considers exact/inexact qualification,
    whereas the backend doesn't and can apply the optimizations where it
    shouldn't. This happened to be the case in a handful of Tomb Raider shaders,
    where NIR would skip the optimizations because of a precise qualification
    but the backend would then (incorrectly) apply them anyway.
    
    Besides this, considering that we are not emitting much math in the backend
    these days it is unlikely that these optimizations are useful in general. A
    shader-db run confirms that MAD and LRP optimizations, for example, were only
    being triggered in cases where NIR would skip them due to precise
    requirements, so in the near future we might want to remove more of these,
    but for now we just remove the ones that are not completely correct.
    
    Suggested-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ddd1706ab3710f25f51bbd6b14bb3af72ca046f6
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon Nov 19 13:08:07 2018 +0100

    intel/compiler: fix cmod propagation for non 32-bit types
    
    v2:
     - Do not propagate if the bit-size changes
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=66002eeebe838ce491467e13d2b545dd3eff2e09
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Nov 20 14:04:26 2018 +0100

    intel/compiler: add a brw_reg_type_is_integer helper
    
    v2:
     - Fixed typo: meant BRW_REGISTER_TYPE_UB instead of BRW_REGISTER_TYPE_UV
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=44e1affaec3477d52c56cd2a10b20af48ae39854
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Oct 26 13:40:27 2018 +0200

    intel/compiler: implement is_zero, is_one, is_negative_one for 8-bit/16-bit
    
    There are no 8-bit immediates, so assert in that case.
    16-bit immediates are replicated in each word of a 32-bit immediate, so
    we only need to check the lower 16-bits.
    
    v2:
     - Fix is_zero with half-float to consider -0 as well (Jason).
     - Fix is_negative_one for word type.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
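
The checks described in this commit could look roughly like the sketch below.
It assumes the immediate is held in a 32-bit value with the 16-bit payload
replicated in both words; the enum and the imm_* function names are
hypothetical, not the driver's API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical immediate types: byte, word and half-float. */
enum imm_type { IMM_B, IMM_W, IMM_HF };

/* 16-bit immediates are replicated in both words of the 32-bit value,
 * so only the low word needs to be inspected. */
static bool imm_is_zero(enum imm_type type, uint32_t imm)
{
   const uint16_t lo = (uint16_t)(imm & 0xffff);
   switch (type) {
   case IMM_W:  return lo == 0;
   case IMM_HF: return (lo & 0x7fff) == 0;   /* matches +0.0 and -0.0 */
   default:     assert(!"there are no 8-bit immediates"); return false;
   }
}

static bool imm_is_one(enum imm_type type, uint32_t imm)
{
   const uint16_t lo = (uint16_t)(imm & 0xffff);
   switch (type) {
   case IMM_W:  return lo == 1;
   case IMM_HF: return lo == 0x3c00;          /* 1.0 in binary16 */
   default:     assert(!"there are no 8-bit immediates"); return false;
   }
}

static bool imm_is_negative_one(enum imm_type type, uint32_t imm)
{
   const uint16_t lo = (uint16_t)(imm & 0xffff);
   switch (type) {
   case IMM_W:  return lo == 0xffff;          /* -1 as a 16-bit word */
   case IMM_HF: return lo == 0xbc00;          /* -1.0 in binary16 */
   default:     assert(!"there are no 8-bit immediates"); return false;
   }
}
```

Masking off the sign bit in the half-float zero check is what makes -0 count
as zero, as requested in v2.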

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e64be391dd065b6a0eabee17ada038db7a28c112
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Jan 29 10:58:49 2019 +0100

    intel/compiler: generalize the combine constants pass
    
    At the very least we need it to handle HF too, since we are doing
    constant propagation for MAD and LRP, which relies on this pass
    to promote the immediates to GRF in the end, but ideally
    we want it to support even more types so we can take advantage
    of it to improve register pressure in some scenarios.
    
    v2 (Jason):
     - Support 64-bit types too.
     - Check if we need to set the half-float flag if the immediate already
       existed.
     - Multiply the size of the immediate by the width of the copy
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fb990bd76eb02425d1982d682716ebe766b536b8
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Wed Nov 7 12:08:02 2018 +0100

    intel/eu: force stride of 2 on NULL register for Byte instructions
    
    The hardware only allows a stride of 1 on a Byte destination for raw
    byte MOV instructions. This is required even when the destination
    is the NULL register.
    
    Rather than making sure that we emit a proper NULL:B destination
    every time we need one, just fix it at emission time.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
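
A rough illustration of fixing the destination at emission time, as this
commit describes. The struct and function below are hypothetical and model
only the fields relevant here; strides are given in elements.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of a destination register. */
struct dst_reg {
   bool is_null;        /* destination is the NULL register */
   unsigned type_size;  /* element size in bytes */
   unsigned hstride;    /* horizontal stride in elements */
};

/* Force a stride of 2 on a byte-typed NULL destination that was
 * emitted with a stride of 1; every other destination is unchanged. */
static struct dst_reg fix_null_byte_dst(struct dst_reg dst)
{
   assert(dst.type_size > 0);
   if (dst.is_null && dst.type_size == 1 && dst.hstride == 1)
      dst.hstride = 2;
   return dst;
}
```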

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ce68a061de746eaa42410db0890f4378d9f4872e
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Jan 4 10:15:39 2019 +0100

    intel/compiler: ask for an integer type if requesting an 8-bit type
    
    v2:
      - Assign BRW_REGISTER_TYPE_B directly for 8-bit (Jason)
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=092b14777433bbcd6735b45379dbdbd403500340
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Jul 17 09:02:27 2018 +0200

    intel/compiler: rework conversion opcodes
    
    Now that we have the regioning lowering pass we can just put all of these
    opcodes together in a single block and we can just assert on the few cases
    of conversion instructions that are not supported in hardware and that should
    be lowered in brw_nir_lower_conversions.
    
    The only cases that we still handle separately are the conversions from float
    to half-float since the rounding variants would need to fallthrough and we
    are already doing this for boolean opcodes (since they need to negate), plus
    there is also a large comment about these opcodes that we probably want to
    keep so it is just easier to keep these separate.
    
    Suggested-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=472244b374a953fc3a8953a722fdab746aef0676
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Jul 13 10:03:14 2018 +0200

    intel/compiler: activate 16-bit bit-size lowerings also for 8-bit
    
    Particularly, we need the same lowerings we use for 16-bit
    integers.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=40b3abb4d16af4cef0307e1b4904c2ec0924299e
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Jul 10 09:52:46 2018 +0200

    intel/compiler: split is_partial_write() into two variants
    
    This function is used in two different scenarios that for 32-bit
    instructions are the same, but for 16-bit instructions are not.
    
    One scenario is that in which we are working at a SIMD8 register
    level and we need to know if a register is fully defined or written.
    This is useful, for example, in the context of liveness analysis or
    register allocation, where we work with units of registers.
    
    The other scenario is that in which we want to know if an instruction
    is writing a full scalar component or just some subset of it. This is
    useful, for example, in the context of some optimization passes
    like copy propagation.
    
    For 32-bit instructions (or larger), a SIMD8 dispatch will always write
    at least a full SIMD8 register (32B) if the write is not partial. The
    function is_partial_write() checks this to determine if we have a partial
    write. However, when we deal with 16-bit instructions, that logic disables
    some optimizations that should be safe. For example, a SIMD8 16-bit MOV will
    only update half of a SIMD register, but it is still a complete write of the
    variable for a SIMD8 dispatch, so we should not prevent copy propagation in
    this scenario because we don't write all 32 bytes in the SIMD register
    or because the write starts at offset 16B (where we pack components Y or
    W of 16-bit vectors).
    
    This is a problem for SIMD8 executions (VS, TCS, TES, GS) of 16-bit
    instructions, which lose a number of optimizations because of this, most
    important of which is copy-propagation.
    
    This patch splits is_partial_write() into is_partial_reg_write(), which
    represents the current is_partial_write(), useful for things like
    liveness analysis, and is_partial_var_write(), which considers
    the dispatch size to check if we are writing a full variable (rather
    than a full register) to decide if the write is partial or not, which
    is what we really want in many optimization passes.
    
    Then the patch goes on and rewrites all uses of is_partial_write() to use
    one or the other version. Specifically, we use is_partial_var_write()
    in the following places: copy propagation, cmod propagation, common
    subexpression elimination, saturate propagation and sel peephole.
    
    Notice that the semantics of is_partial_var_write() exactly match the
    current implementation of is_partial_write() for anything that is
    32-bit or larger, so no changes are expected for 32-bit instructions.
    
    Testing against ~5000 CTS tests involving 16-bit instructions produced
    the following changes in instruction counts:
    
                Patched  |     Master    |    %    |
    ================================================
    SIMD8  |    621,900  |    706,721    | -12.00% |
    ================================================
    SIMD16 |     93,252  |     93,252    |   0.00% |
    ================================================
    
    As expected, the change only affects SIMD8 dispatches.
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
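
The difference between the two variants can be sketched as follows. This is a
simplified model under stated assumptions, not the patch itself: the struct,
its fields and the 32-byte REG_SIZE constant are illustrative, and the
`predicated` flag stands for any predicated write that makes the result
partial.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32u   /* bytes in one SIMD8 register, as stated above */

/* Hypothetical, reduced view of an instruction's destination write. */
struct inst {
   bool predicated;         /* write is predicated */
   bool dst_contiguous;     /* destination elements are packed */
   unsigned exec_size;      /* channels written */
   unsigned dst_type_size;  /* bytes per element */
   unsigned dst_offset;     /* byte offset of the write */
};

/* Register view: partial unless a whole register is written. */
static bool is_partial_reg_write(const struct inst *i)
{
   return i->predicated || !i->dst_contiguous ||
          i->exec_size * i->dst_type_size < REG_SIZE ||
          i->dst_offset % REG_SIZE != 0;
}

/* Variable view: partial unless a whole variable is written, where a
 * variable spans dispatch_width elements (capped at one register). */
static bool is_partial_var_write(const struct inst *i,
                                 unsigned dispatch_width)
{
   assert(i->dst_type_size > 0 && dispatch_width > 0);
   unsigned var_size = dispatch_width * i->dst_type_size;
   if (var_size > REG_SIZE)
      var_size = REG_SIZE;
   return i->predicated || !i->dst_contiguous ||
          i->exec_size * i->dst_type_size < var_size ||
          i->dst_offset % var_size != 0;
}
```

In this model a SIMD8 16-bit MOV at byte offset 16 is a partial register
write but a complete variable write, while for 32-bit types both functions
agree.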

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0986199b31ab2a6086131887e474bc8f79fbc28d
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon Jan 21 12:11:44 2019 +0100

    intel/compiler: workaround for SIMD8 half-float MAD in gen8
    
    Empirical testing shows that gen8 has a bug where MAD instructions with
    a half-float source starting at a non-zero offset fail to execute
    properly.
    
    This scenario usually happened in SIMD8 executions, where we used to
    pack vector components Y and W in the second half of SIMD registers
    (therefore, with a 16B offset). It looks like we are not currently doing
    this any more but this would handle the situation properly if we ever
    happen to produce code like this again.
    
    v2 (Jason):
     - Move this workaround to the lower_regioning pass as an additional case
       to has_invalid_src_region()
     - Do not apply the workaround if the stride of the source operand is 0,
       testing suggests the problem doesn't exist in that case.
    
    v3 (Jason):
     - We want offset % REG_SIZE > 0, not just offset > 0
     - Use a helper to compute the offset
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
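
The condition that triggers the workaround, as listed in the v2 and v3 notes,
can be written as a single predicate. The function and its parameters are
hypothetical; the real check lives in has_invalid_src_region() and operates
on the driver's own instruction types.

```c
#include <assert.h>
#include <stdbool.h>

#define REG_SIZE 32u   /* bytes per register (assumed, matching SIMD8 x 32-bit) */

/* True when a MAD source hits the gen8 issue described above:
 * half-float source, non-zero stride, and an offset that does not
 * start at a register boundary. */
static bool needs_gen8_hf_mad_workaround(int gen, bool is_mad,
                                         bool src_is_half_float,
                                         unsigned src_stride,
                                         unsigned src_byte_offset)
{
   assert(gen > 0);
   return gen == 8 && is_mad && src_is_half_float &&
          src_stride != 0 && src_byte_offset % REG_SIZE > 0;
}
```

Using the offset modulo REG_SIZE rather than the raw offset means a source
that starts exactly at the second register of a larger allocation is left
alone.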

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=aaae24179ff1007776d2f3a5a813f2c52dc83eba
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Wed May 30 12:14:14 2018 +0200

    intel/compiler: fix ddy for half-float in Broadwell
    
    Broadwell has restrictions that apply to Align16 half-float that
    make the Align16 implementation of this invalid for this platform.
    Use the gen11 path for this instead, which uses Align1 mode.
    
    The restriction is not present in cherryview, gen9 or gen10, where
    the Align16 implementation seems to work just fine.
    
    v2:
     - Rework the comment in the code, move the PRM citation from the
       commit message to the comment in the code (Matt)
     - Cherryview isn't affected, only Broadwell (Matt)
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=60c7c6d3ba4ab41eec7f48d6266321e10e2e50df
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon May 28 12:32:08 2018 +0200

    intel/compiler: fix ddx and ddy for 16-bit float
    
    We were assuming 32-bit elements. Also, in SIMD8 we pack 2 vector components
    in a single SIMD register, so for example, component Y of a 16-bit vec2
    starts at byte offset 16B. This means that when we compute the offset of
    the elements to be differentiated we should not stomp whatever base offset we
    have, but instead add to it.
    
    v2
     - Use byte_offset() helper (Jason)
     - Merge the fix for SIMD8: using byte_offset() fixes that too.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)
    Reviewed-by: Matt Turner <mattst88 at gmail.com>
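
The offset arithmetic behind this fix, in a minimal form. The function name
is hypothetical and only illustrates the idea of adding to the existing base
offset and scaling by the real element size; the patch itself uses the
byte_offset() helper.

```c
#include <assert.h>

/* Byte offset of element `index` in an operand that already starts
 * `base_offset` bytes into its register. The element offset is added
 * to the base instead of replacing it, and scales with the type size
 * instead of assuming 4-byte elements. */
static unsigned element_byte_offset(unsigned base_offset, unsigned index,
                                    unsigned type_size)
{
   assert(type_size == 2 || type_size == 4);
   return base_offset + index * type_size;
}
```

For component Y of a 16-bit vec2 in SIMD8 (base offset 16B), element 1 lands
at byte 18 rather than at byte 4.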

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8f40d392b9f67235064b0fb4d894097e361f1d7c
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue May 22 08:17:38 2018 +0200

    intel/compiler: set correct precision fields for 3-source float instructions
    
    Source0 and Destination extract the floating-point precision automatically
    from the SrcType and DstType instruction fields respectively when they are
    set to types :F or :HF. For Source1 and Source2 operands, we use the new
    1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1
    means half-precision. Since we always use the type of the destination for
    all operands when we emit 3-source instructions, we only need to set Src1Type
    and Src2Type to 1 when we are emitting a half-precision instruction.
    
    v2:
     - Set the bit separately for each source based on its type so we can
       do mixed floating-point mode in the future (Topi).
    
    v3:
     - Use regular citation style for the comment referencing the PRM (Matt).
     - Decided not to add asserts in the emission code to check that only
       mixed HF/F types are used since such checks would break negative tests
       for brw_eu_validate.c (Matt)
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e6b7410187dcd21de907c69273cc3d9a0b04dad5
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue May 22 08:17:17 2018 +0200

    intel/compiler: allow half-float on 3-source instructions since gen8
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ee049f6b717ea6e20cef38f16a8024276b181d17
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue May 22 10:21:29 2018 +0200

    intel/compiler: don't compact 3-src instructions with Src1Type or Src2Type bits
    
    We are now using these bits, so don't assert that they are not set. In gen8,
    if these bits are set compaction is not possible. On gen9 and CHV platforms
    set_3src_control_index() checks these bits (and others) against a table to
    validate if the particular bit combination is eligible for compaction or not.
    
    v2
     - Add more detail in the commit message explaining the situation for SKL+
       and CHV (Jason)
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=120c970619cd876a256f788afe2a79a92f8cd7ab
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon May 21 14:42:42 2018 +0200

    intel/compiler: add new half-float register type for 3-src instructions
    
    This is available since gen8.
    
    v2: restore previously existing assertion.
    
    v3: don't use separate tables for gen7 and gen8, just assert that we
        don't use half-float before gen8 (Matt)
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=4ab2b97a8fbc8fb07534ec92c9c5326889af290f
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon May 21 14:34:01 2018 +0200

    intel/compiler: add instruction setters for Src1Type and Src2Type.
    
    The original SrcType is a 3-bit field that takes a subset of the types
    supported by the hardware for 3-source instructions. Since gen8,
    when the half-float type was added, 3-source floating point operations
    can use mixed precision mode, where not all the operands have the
    same floating-point precision. While the precision for the first operand
    is taken from the type in SrcType, the bits in Src1Type (bit 36) and
    Src2Type (bit 35) define the precision for the other operands
    (0: normal precision, 1: half precision).
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
    Reviewed-by: Matt Turner <mattst88 at gmail.com>
    Acked-by: Jason Ekstrand <jason at jlekstrand.net>
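
A sketch of what setters for these two bits might look like, assuming the
128-bit instruction is stored as two 64-bit words with bit 0 in the low word.
The struct layout and all function names are hypothetical; only the bit
positions (36 and 35) and their meaning come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical storage for one 128-bit instruction. */
struct inst128 {
   uint64_t data[2];   /* data[0] holds bits 0..63 */
};

/* Set or clear a single bit of the instruction word. */
static void inst_set_bit(struct inst128 *inst, unsigned bit, bool value)
{
   assert(bit < 128);
   const uint64_t mask = UINT64_C(1) << (bit % 64);
   uint64_t *word = &inst->data[bit / 64];
   *word = value ? (*word | mask) : (*word & ~mask);
}

/* Src1Type lives in bit 36: 0 = normal precision, 1 = half precision. */
static void set_3src_src1_type(struct inst128 *inst, bool half_precision)
{
   inst_set_bit(inst, 36, half_precision);
}

/* Src2Type lives in bit 35: 0 = normal precision, 1 = half precision. */
static void set_3src_src2_type(struct inst128 *inst, bool half_precision)
{
   inst_set_bit(inst, 35, half_precision);
}
```

Keeping one setter per source allows each operand's precision to be chosen
independently, which is what mixed precision mode requires.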

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a8d8b1a1391b207d4b19b1ba864612837f1fd543
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon Jan 21 09:47:59 2019 +0100

    intel/compiler: drop unnecessary temporary from 32-bit fsign implementation
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=19cd2f5debd2ce10b269b0b015c30ecc3065497b
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon Jan 21 09:47:15 2019 +0100

    intel/compiler: implement 16-bit fsign
    
    v2:
     - make 16-bit be its own separate case (Jason)
    
    v3:
     - Drop the result_int temporary (Jason)
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=4588f4a6048af2ae1b3a2eb33fd23227c1edf593
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Thu Apr 26 10:26:22 2018 +0200

    intel/compiler: handle extended math restrictions for half-float
    
    Extended math with half-float operands is only supported since gen9,
    but it is limited to SIMD8. In gen8 we lower it to 32-bit.
    
    v2: squashed together the following patches (Jason):
      - intel/compiler: allow extended math functions with HF operands
      - intel/compiler: lower 16-bit extended math to 32-bit prior to gen9
      - intel/compiler: extended Math is limited to SIMD8 on half-float
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
      (allow extended math functions with HF operands,
       extended Math is limited to SIMD8 on half-float)
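
The decision described in this commit can be summarised as a small function.
The enum and function names are hypothetical, and "split to SIMD8" is an
assumption about how the SIMD8 limit would be honoured for wider dispatches.

```c
#include <assert.h>

/* Hypothetical outcomes for a half-float extended math instruction. */
enum math_lowering {
   MATH_NATIVE,          /* emit as-is */
   MATH_SPLIT_TO_SIMD8,  /* gen9+: execution width above 8 is not allowed */
   MATH_LOWER_TO_32BIT   /* gen8: no half-float extended math */
};

static enum math_lowering half_float_math_lowering(int gen,
                                                   unsigned exec_size)
{
   assert(gen >= 8);     /* half-float support starts at gen8 in this series */
   if (gen < 9)
      return MATH_LOWER_TO_32BIT;
   return exec_size > 8 ? MATH_SPLIT_TO_SIMD8 : MATH_NATIVE;
}
```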

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=114f4e6c29315286d362f339138c2c33d28b7878
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Thu Apr 26 10:12:12 2018 +0200

    intel/compiler: lower some 16-bit float operations to 32-bit
    
    The hardware doesn't support half-float for these.
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b6a454791b45b60b9518b4b8fb41fd443b3ceab1
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Tue Dec 18 09:27:21 2018 +0100

    intel/compiler: assert restrictions on conversions to half-float
    
    There are some hardware restrictions that brw_nir_lower_conversions should
    have taken care of before we get here.
    
    v2:
     - rebased on top of regioning lowering pass
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=66806405afa02024813869d8cc972f293041fa50
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Thu Nov 22 10:59:59 2018 +0100

    intel/compiler: handle b2i/b2f with other integer conversion opcodes
    
    Since we handle booleans as integers this makes more sense.
    
    v2:
     - rebased to incorporate new boolean conversion opcodes
    
    v3:
     - rebased on top of regioning lowering pass
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net> (v1)
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v2)

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=92f4761198d24fb73cbe5bcd12b0ebf5bb766b4d
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Mar 2 13:37:59 2018 +0100

    intel/compiler: split float to 64-bit opcodes from int to 64-bit
    
    Going forward having these split is a bit more convenient since these two
    groups have different restrictions.
    
    v2:
     - Rebased on top of new regioning lowering pass.
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3e377c68f879be05059c3c8871ffc4ea752523f2
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Mon Dec 17 09:17:06 2018 +0100

    intel/compiler: add a NIR pass to lower conversions
    
    Some conversions are not directly supported in hardware and need to be
    split in two conversion instructions going through an intermediary type.
    Doing this at the NIR level reduces the complexity in the backend a bit.
    
    v2:
     - Consider fp16 rounding conversion opcodes
     - Properly handle swizzles on conversion sources.
    
    v3
     - Run the pass earlier, right after nir_opt_algebraic_late (Jason)
     - NIR alu output types already have the bit-size (Jason)
     - Use 'is_conversion' to identify conversion operations (Jason)
    
    v4:
     - Be careful about the intermediate types we use so we don't lose
       range and avoid incorrect rounding semantics (Jason)
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com> (v1)
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
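
To illustrate the idea of splitting a conversion through an intermediary
type, the sketch below picks the intermediate type for two families of
conversions. Which conversions need splitting, and the 32-bit intermediates
chosen here, are assumptions made for illustration (loosely based on the
64-bit/8-bit and half-float entries in this digest), not a transcription of
the pass. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum base_type { BASE_INT, BASE_UINT, BASE_FLOAT };

/* Hypothetical ALU type: base type plus bit size. */
struct alu_type {
   enum base_type base;
   unsigned bits;
};

/* Returns true and fills *mid when src -> dst should be emitted as
 * src -> mid -> dst. Assumed rules:
 *  - half-float <-> any 64-bit type goes through 32-bit float, which
 *    keeps the range of 64-bit integer sources;
 *  - 8-bit <-> 64-bit goes through a 32-bit type with the base type
 *    of the destination. */
static bool conversion_needs_split(struct alu_type src, struct alu_type dst,
                                   struct alu_type *mid)
{
   assert(mid != 0);
   const bool src_f16 = src.base == BASE_FLOAT && src.bits == 16;
   const bool dst_f16 = dst.base == BASE_FLOAT && dst.bits == 16;

   if ((src_f16 && dst.bits == 64) || (src.bits == 64 && dst_f16)) {
      mid->base = BASE_FLOAT;
      mid->bits = 32;
      return true;
   }
   if ((src.bits == 8 && dst.bits == 64) ||
       (src.bits == 64 && dst.bits == 8)) {
      mid->base = dst.base;
      mid->bits = 32;
      return true;
   }
   return false;
}
```

The choice of intermediate matters for correctness, as the v4 note points
out: a poorly chosen one can lose range or round twice.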