Mesa (master): 30 new commits

Francisco Jerez currojerez at kemper.freedesktop.org
Fri Apr 14 22:03:57 UTC 2017


URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8973ae3162aec112b22cdf58f47d0ee12c4a09cd
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Wed Apr 5 06:23:43 2017 +0200

    docs/relnotes: add GL_ARB_gpu_shader_fp64 support on i965/ivybridge
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Acked-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ef49dda2df94c8060047b845a3a027460c45ba7c
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Tue Oct 11 10:59:52 2016 +0200

    docs: mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as supported by i965/gen7+
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Acked-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a494afdb8e09640956743649354fbb7147231d1d
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Aug 26 07:39:04 2016 +0200

    i965: enable OpenGL 4.0 to Ivybridge/Baytrail
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=cd0a6b2fc2ef6e04ffb262072821113cb49cd530
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Aug 26 07:37:42 2016 +0200

    i965: enable ARB_gpu_shader_fp64 for Ivybridge/Baytrail
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=2eeb1b0ad9453ba135b72aaeec6c0d4dbf9ac87c
Author: Matt Turner <mattst88 at gmail.com>
Date:   Fri Jan 20 13:35:33 2017 -0800

    i965: Use correct VertStride on align16 instructions.
    
    In commit c35fa7a, we changed the "width" of DF source registers to 2,
    which is conceptually fine. Unfortunately a VertStride of 2 is not
    allowed by align16 instructions on IVB/BYT, and the regular VertStride
    of 4 works fine in any case.
    
    See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
    for example:
    
    cmp.ge.f0(8)    g18<1>DF        g1<0>.xyxyDF    -g8<2>DF        { align16 1Q };
            ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
    cmp.ge.f0(8)    g19<1>DF        g1<0>.xyxyDF    -g9<2>DF        { align16 2N };
            ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
    
    v2:
    - Add spec quote (Curro).
    - Change the condition to only BRW_VERTICAL_STRIDE_2 (Curro)
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d8441e2276912d353d4fc6c0cf6b781ab5153ee7
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Mar 17 11:57:25 2017 +0100

    i965/vec4/dce: improve track of partial flag register writes
    
    This is required for correctness in presence of multiple 4-wide flag
    writes (e.g. 4-wide instructions with a conditional mod set) which
    update a different portion of the same 8-bit flag subregister.
    
    Right now we keep track of flag dataflow with 8-bit granularity and
    consider flag writes to have killed any previous definition of the
    same subregister even if the write was less than 8 channels wide,
    which can cause live flag register updates to be dead
    code-eliminated incorrectly.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>
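
The finer-grained tracking described in this commit can be sketched in C. This is a simplified model rather than the Mesa implementation: `flag_write_mask()` and `flag_write_is_dead()` are hypothetical names, and the flag subregister is modeled as one liveness bit per channel so that a 4-wide write only kills the channels it actually covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bitmask of the flag channels written by an
 * instruction that starts at channel `group` and is `exec_size` wide.
 * Only the 8 channels of one flag subregister are modeled here.
 */
static uint8_t
flag_write_mask(unsigned group, unsigned exec_size)
{
   assert(exec_size > 0 && group + exec_size <= 8);
   return (uint8_t)(((1u << exec_size) - 1u) << group);
}

/* Hypothetical backward dataflow step for dead code elimination.
 * `live` holds the flag channels that are read later in the program.
 * Returns true if none of the written channels are live (the write is
 * dead), and removes only the written channels from the live set.
 */
static bool
flag_write_is_dead(uint8_t *live, uint8_t written)
{
   bool dead = (*live & written) == 0;
   *live &= (uint8_t)~written;
   return dead;
}
```

Walking backwards from an 8-wide flag read, a 4-wide write to channels 4..7 leaves channels 0..3 live, so an earlier 4-wide write to channels 0..3 is kept. With whole-subregister granularity the first write would have cleared the entire live set and the second would have been eliminated incorrectly.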

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c1fc8fad47f60bda857fc45c4052c5f4effe0d84
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Mar 17 11:55:49 2017 +0100

    i965/vec4: don't do horizontal stride on some register file types
    
    horiz_offset() shouldn't be doing anything for scalar registers,
    because all channels of any SIMD instructions will end up reading or
    writing the same component of the register, so shifting the register
    offset would be wrong.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Re-implement in terms of is_uniform() for
      simplicity.  Pass argument by const reference.  Clarify commit
      message. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=21e8e3a8484241508ac2c250fc4367234fa337df
Author: Matt Turner <mattst88 at gmail.com>
Date:   Fri Jan 20 13:35:32 2017 -0800

    i965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT.
    
    Otherwise for a pack_double_2x32_split opcode, we emit:
    
       vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134
    mov(8)          g5<1>UD         g5<4>.xUD                       { align16 1Q compacted };
    mov(8)          g7<2>UD         g5<4,4,1>UD                     { align1 1Q };
            ERROR: When the destination spans two registers, the source must span two registers
                   (exceptions for scalar source and packed-word to packed-dword expansion)
    mov(8)          g8<2>UD         g5.4<4,4,1>UD                   { align1 2N };
            ERROR: The offset from the two source registers must be the same
    mov(8)          g5<1>UD         g6<4>.xUD                       { align16 1Q compacted };
    mov(8)          g7.1<2>UD       g5<4,4,1>UD                     { align1 1Q };
            ERROR: When the destination spans two registers, the source must span two registers
                   (exceptions for scalar source and packed-word to packed-dword expansion)
    mov(8)          g8.1<2>UD       g5.4<4,4,1>UD                   { align1 2N };
            ERROR: The offset from the two source registers must be the same
    
    The intention was to emit mov(4)s for the instructions that have ERROR
    annotations.
    
    See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test
    for example.
    
    v2 (Samuel):
    - Instead of setting the exec size to a fixed value, don't double it
    (Curro).
    - Add PICK_{HIGH,LOW}_32BIT to the condition.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Trivial rebase changes. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=f030aaf2fb558219a43f286e2ea71c928e49b598
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Tue Mar 7 10:29:53 2017 +0100

    i965/vec4: use vec4_builder to emit instructions in setup_imm_df()
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Drop useless vec4_visitor dependencies.  Demote to
      static stand-alone function.  Don't write unused components in the
      result.  Use vec4_builder interface for register allocation. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a907c91e93cce88ee1929263c455fab541b8c4a3
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Fri Sep 23 15:57:39 2016 +0000

    i965/vec4: consider subregister offset in live variables
    
    Take into account offset values less than a full register (32 bytes)
    when getting the var from register.
    
    This is required when dealing with an operation that writes half of the
    register (like one d2x in IVB/BYT, which uses exec_size == 4).
    
    v2:
    - Take into account this offset < 32 in liveness analysis too (Curro)
    
    v3:
    - Change formula in var_from_reg() (Curro)
    - Remove useless changes (Curro)
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=92649a3e6756465b3961cf05910cda93a69c7790
Author: Francisco Jerez <currojerez at riseup.net>
Date:   Wed Apr 12 16:54:49 2017 -0700

    i965/vec4: fix assert to detect SIMD lowered DF instructions in IVB
    
    On IVB, DF instructions have lowered the SIMD width to 4 but the
    exec_size will be later doubled. Fix the assert to avoid crashing in
    this case.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Simplify assert.  Except for the 'inst->group % 4
      == 0' part the assertion was redundant with the previous assertion. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6e3265eae533a1bff4f23a4508c5d8e9ab23164d
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Mar 24 08:46:13 2017 +0100

    i965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's type
    
    This way we can set the destination type as double to all these new opcodes,
    avoiding any optimizer's confusion that was happening before.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Drop no_spill workaround originally needed due to
      the bogus destination type of VEC4_OPCODE_FROM_DOUBLE. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=50a5217637636f066feabefd7fe46d0ff7778a64
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Wed Mar 8 09:27:49 2017 +0100

    i965/vec4: split d2x conversion and data gathering from one opcode to two explicit ones
    
    When doing a 64-bit to a smaller data type size conversion, the destination should
    be aligned to 64-bits. Because of that, we need to gather the data after the
    actual conversion.
    
    Until now, these two operations were done by VEC4_OPCODE_FROM_DOUBLE but
    now we split them explicitly in two different instructions:
    VEC4_OPCODE_FROM_DOUBLE just does the conversion and
    VEC4_OPCODE_PICK_LOW_32BIT will gather the data.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=cfaf14a12607a8e9fd3d86a0c0219c428401f68f
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Fri Sep 23 09:57:43 2016 +0000

    i965/vec4: fix VEC4_OPCODE_FROM_DOUBLE for IVB/BYT
    
    In the generator we must generate slightly different code for
    Ivybridge/Baytrail, because of the way the stride works in
    this hardware.
    
    v2:
    - Use stride and don't need to fix dst (Curro)
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=be445d3ea3a7b4575c2dbac3d702e27e9ec3f125
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Mon Sep 12 16:06:22 2016 +0000

    i965/vec4: keep original type when dealing with null registers
    
    Keep the original type when dealing with null registers. Especially
    because we do not want to introduce an implicit conversion between
    types that could affect the conditional flags.
    
    This affects especially when the original type is DF, and we are working
    on Ivybridge/Baytrail.
    
    v2 (Curro)
    - Fix typo.
    - Use retype() instead of applying the type directly.
    - Remove unneeded retype.
    
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a21dc2b500cff6e0aaf31867c5b42651306ddaf1
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Mon Aug 29 10:10:30 2016 +0200

    i965/vec4: split DF instructions and later double its execsize in IVB/BYT
    
    We need to split DF instructions in two on IVB/BYT as it needs an
    execsize 8 to process 4 DF values (one GRF in total).
    
    v2:
    - Rename helper and make it static inline function (Matt).
    - Fix indentation and add braces (Matt).
    
    v3:
    - Don't edit IR instruction when doubling exec_size (Curro)
    - Add comment into the code (Curro).
    - Manage ARF registers like the others (Curro)
    
    v4:
    - Add get_exec_type() function and use it to calculate the execution
      size.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Fix bogus 'type != BAD_FILE' check.  Take
      destination type as execution type where there is no valid source.
      Assert-fail if the deduced execution type is byte.  Clarify comment
      in get_lowered_simd_width().  Move SIMD width workaround outside of
      'if (...inst->size_written > REG_SIZE)' conditional block, since the
      problem should be independent of whether the amount of data written
      by the instruction is greater or lower than a GRF.  Drop redundant
      is_ivb_df definition.  Drop bogus inst->exec_size < 8 check.
      Simplify channel group assertion. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a5399e8b1cc3e2e12b8aa067e8380d1b088c35ca
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Thu Aug 25 16:05:24 2016 +0200

    i965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT
    
    The hardware applies the same channel enable signals to both halves of
    the compressed instruction which will be just wrong under non-uniform
    control flow. Fix this by splitting those instructions to SIMD4.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ebfb703d443a4b22320c3f1eed34e6e1aa54e998
Author: Francisco Jerez <currojerez at riseup.net>
Date:   Thu Feb 9 10:16:58 2017 -0800

    i965/fs: Get 64-bit indirect moves working on IVB.
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=630b84cdc80594d912a64f64aa75ac498e6f1248
Author: Matt Turner <mattst88 at gmail.com>
Date:   Thu Jan 12 18:05:58 2017 -0800

    i965: Use source region <1,2,0> when converting to DF.
    
    Doing so allows us to use a single MOV in VEC4_OPCODE_TO_DOUBLE instead
    of two.
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3198ce3f96848856206e7b2e54a53024bcca7737
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Wed Aug 3 11:51:44 2016 +0000

    i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT
    
    According to the IVB and HSW PRMs:
    
    "2.When the destination requires two registers and the sources are
     indirect, the sources must use 1x1 regioning mode."
    
    So for DF instructions the execution size is not limited by the number
    of address registers that are available, but by the EU decompression
    logic not handling VxH indirect addressing correctly.
    
    This patch limits the SIMD width to 4 in this case.
    
    v2:
    - Fix typo (Matt).
    - Fix condition (Curro)
    
    v3:
    - Add spec quote (Curro)
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Signed-off-by: Juan A. Suarez Romero <jasuarez at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=571cbd05ebfb8bef22277c5758afc82f5dd6a3f2
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Fri Jan 20 08:50:50 2017 +0100

    i965/fs: fix dst stride in IVB/BYT type conversions
    
    When converting from DF to a 32-bit type, we set the dst stride to 2
    to fulfill alignment restrictions, because the upper Dword of every
    Qword will be written with an undefined value.
    
    But in IVB/BYT, this is not necessary, as each DF conversion already
    writes 2, the first one the real value, and the second one a 0.
    That is, IVB/BYT already set stride = 2 implicitly, so we must set it to
    1 explicitly to avoid ending up with stride = 4.
    
    v2:
    - Fix typo (Matt)
    
    v3:
    - Fix stride in the destination's brw_reg, don't modify IR (Curro)
    
    v4:
    - Remove 'is_dst' argument of brw_reg_from_fs_reg() (Curro)
    - Fix comment (Curro).
    - Relax hstride assert (Curro)
    
    Signed-off-by: Juan A. Suarez Romero <jasuarez at igalia.com>
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Minor spelling fixes. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=af6fc3a8ea27368ba70338437e27e3c2b522b27b
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Tue Mar 14 08:17:36 2017 +0100

    i965/fs: rename lower_d2x to lower_conversions
    
    v2:
    - Change the name to lower_conversions.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dee31311eb024a636466e359b43d3a67b0135f32
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Tue Mar 28 06:25:13 2017 +0200

    Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs."
    
    This reverts commit 7dccd38b400d3a65da20ddefe282a7bb0b7ccb58.
    
    d2x pass fixes SEL instructions when there is a type conversion
    by doing a SEL without type conversion and then convert the result.
    This pass also takes into account the non-uniform control flow.
    
    Then, 7dccd38b400d3a65da20ddefe282a7bb0b7ccb58 is not needed anymore.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>
    Reviewed-by: Matt Turner <mattst88 at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=aeecc82d057adf43189d08214b21ca5166ad9682
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Fri Jan 20 08:47:05 2017 +0100

    i965/fs: generalize the legalization d2x pass
    
    Generalize it to lower any unsupported narrower conversion.
    
    v2 (Curro):
    - Add supports_type_conversion()
    - Reuse existing instruction instead of cloning it.
    - Generalize d2x to narrower and equal size conversions.
    
    v3 (Curro):
    - Make supports_type_conversion() const and improve it.
    - Use foreach_block_and_inst to process added instructions.
    - Simplify code.
    - Add assert and improve comments.
    - Remove redundant mov.
    - Remove useless comment.
    - Remove saturate == false assert and add support for saturation
      when fixing the conversion.
    - Add get_exec_type() function.
    
    v4 (Curro):
    - Use get_exec_type() function to get sources' type.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=94ffeb7fa2257af3eb416a1e720e08911835665b
Author: Matt Turner <mattst88 at gmail.com>
Date:   Tue Jan 10 19:33:22 2017 -0800

    i965: Use <0,2,1> region for scalar DF sources on IVB/BYT.
    
    On HSW+, scalar DF sources can be accessed using the normal <0,1,0>
    region, but on IVB and BYT DF regions must be programmed in terms of
    floats. A <0,2,1> region accomplishes this.
    
    v2:
    - Apply region <0,2,1> in brw_reg_from_fs_reg() (Curro).
    
    v3:
    - Added comment explaining the reason (Curro).
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=82d17615f442555b3577be41e24edd341a11d01d
Author: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
Date:   Wed Jan 11 08:17:57 2017 +0100

    i965/fs: clamp exec_size when an instruction has a scalar DF source
    
    Then the SIMD lowering pass will get rid of any compressed instructions with scalar
    source (whether force_writemask_all or not) and we avoid hitting the Gen7 region
    decompression bug.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Suggested-by: Francisco Jerez <currojerez at riseup.net>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0f1316d4dbc19f46acd5a738df25e632d95f4105
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Mon Jul 18 07:27:56 2016 +0000

    i965/fs: double regioning parameters and execsize for DF in IVB/BYT
    
    In IVB and BYT, both regioning parameters and execution sizes are measured as
    32-bits element size.
    
    So when we have something like:
    
    mov(8) g2<1>DF g3<4,4,1>DF
    
    We are not actually moving 8 doubles (our intention), but 4 doubles.
    
    We need to double the parameters to cope with this issue. However,
    horizontal strides don't behave as they're supposed to on IVB
    for DF regions, they will cause each 32-bit half of DF sources to be
    strided individually, and doubling the value won't make any difference.
    
    v2:
    - Use devinfo directly (Matt).
    - Use Baytrail instead of Valleview (Matt).
    - Use IvyBridge instead of Ivy (Matt)
    - Double the exec_size in code emission (Curro)
    
    v3:
    - Change hstride doubling by an assert and fix commit log (Curro).
    - Substitute remaining compiler->devinfo by devinfo (Curro).
    
    v4:
    - Fix comment (Curro).
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>
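
The doubling described in this commit can be sketched in C. This is a simplified model, not the Mesa code: `struct region`, `ivb_df_region()` and `ivb_df_exec_size()` are hypothetical names, regions are stored as plain element counts rather than hardware encodings, and the horizontal stride is assumed to be 1, mirroring the assert mentioned in v3.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a source region <vstride,width,hstride>,
 * with each field stored as a plain element count.
 */
struct region {
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

/* Hypothetical helper: re-express a DF region in the 32-bit element
 * units that IVB/BYT use.  The horizontal stride is left alone because,
 * per the commit message, doubling it makes no difference on this
 * hardware; this sketch assumes it is always 1.
 */
static struct region
ivb_df_region(struct region r)
{
   assert(r.hstride == 1);
   r.vstride *= 2;
   r.width *= 2;
   return r;
}

/* Hypothetical helper: execution size to program for an instruction,
 * doubled only for DF instructions on IVB/BYT.
 */
static unsigned
ivb_df_exec_size(unsigned exec_size, bool is_df, bool is_ivb_or_byt)
{
   return (is_df && is_ivb_or_byt) ? exec_size * 2 : exec_size;
}
```

Under this model, the `mov(8) g2<1>DF g3<4,4,1>DF` example above would be programmed with an execution size of 16 and a <8,8,1> source region.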

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=79af2563889098550de5d4a0955efbeb87a24565
Author: Juan A. Suarez Romero <jasuarez at igalia.com>
Date:   Mon Jul 18 07:17:39 2016 +0000

    i965/fs: add helper to retrieve instruction execution type
    
    The execution data size is the biggest type size of any instruction
    operand.
    
    We will use it to know if the instruction deals with DF, because in Ivy
    we need to double the execution size and regioning parameters.
    
    v2:
    - Fix typo in commit log (Matt)
    - Use static inline function instead of fs_inst's method (Curro).
    - Define the result as a constant (Curro).
    - Fix indentation (Matt).
    - Add braces to nested control flow (Matt).
    
    v3 (Curro):
    - Add get_exec_type() and other auxiliary functions and use them to
      calculate its size.
    
    Signed-off-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    [ Francisco Jerez: Fix bogus 'type != BAD_FILE' check.  Fix deduced
      execution type for integer vector types.  Take destination type as
      execution type where there is no valid source.  Assert-fail if the
      deduced execution type is byte.  Move into brw_ir_fs.h header for
      consistency with the VEC4 back-end. ]
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>
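
A minimal C sketch of the idea behind this helper follows. It is not the actual get_exec_type() implementation: `exec_data_size()` is a hypothetical name, operand types are reduced to their size in bytes, and a size of 0 is assumed to mark a source that is not valid. It takes the largest source type size, falls back to the destination where there is no valid source, and asserts on a byte-sized result, as the notes above describe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: execution data size in bytes.
 * `src_sizes` holds the type size of each source operand, with 0
 * standing for a source that is not valid (an assumption of this
 * sketch).  `dst_size` is the type size of the destination.
 */
static unsigned
exec_data_size(const unsigned *src_sizes, size_t num_srcs, unsigned dst_size)
{
   unsigned size = 0;

   /* The execution size is the biggest type size of any source. */
   for (size_t i = 0; i < num_srcs; i++) {
      if (src_sizes[i] > size)
         size = src_sizes[i];
   }

   /* No valid source: take the destination type instead. */
   if (size == 0)
      size = dst_size;

   /* A byte execution type is treated as an error in this sketch. */
   assert(size != 1);
   return size;
}
```

A result of 8 bytes would then identify an instruction that deals with DF data and needs the IVB/BYT doubling of execution size and regioning parameters.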

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fd349d29e43fa9c2227cbf649282810782ecf555
Author: Matt Turner <mattst88 at gmail.com>
Date:   Fri Jan 20 13:35:31 2017 -0800

    i965: Handle IVB DF differences in the validator.
    
    On IVB/BYT, region parameters and execution size for DF are in terms of
    32-bit elements, so they are doubled. For evaluating the validity of an
    instruction, we halve them.
    
    v2 (Sam):
    - Add comments.
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fbac8b1f9465fd13dd61aa0010049d1b61654a2b
Author: Iago Toral Quiroga <itoral at igalia.com>
Date:   Fri Jul 22 13:36:25 2016 +0200

    i965/disasm: also print nibctrl in IVB for execsize=8
    
    4-wide DF operations where NibCtrl applies require an execsize of 8
    in IvyBridge/BayTrail.
    
    v2:
    - Refactor NibCtrl printing (Matt)
    
    Reviewed-by: Matt Turner <mattst88 at gmail.com>
    Reviewed-by: Francisco Jerez <currojerez at riseup.net>



