Mesa (master): 51 new commits

Tue Nov 7 18:44:03 UTC 2017

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d002950e5491f971cbaa77ac80a698e5d746295a
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Nov 2 15:59:58 2017 -0700

    intel/fs/nir: Return Q types from brw_reg_type_for_bit_size
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dee58ecd2e3b23d1a3d2cdffb99d3dd314421b39
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Nov 2 18:32:39 2017 -0700

    intel/fs/nir: Use Q immediates for load_const on gen8+
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=9bb34892bf99a6f2285f792519f51cefe5c219ee
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Nov 2 18:30:04 2017 -0700

    intel/fs/nir: Setup immediates based on type in i2b and f2b
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1cb210f4bc412a9c1fef12e05ea9d9fe8995f4d5
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Nov 2 18:29:03 2017 -0700

    intel/reg: Add helpers for 64-bit integer immediates
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=df81b81fb91f45e6da0c504ee672d45829c41d06
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Aug 23 17:43:36 2017 -0700

    compiler/nir_types: Handle vectors in glsl_get_array_element
    
    Most of NIR doesn't allow doing array indexing on a vector (though it
    does on a matrix).  However, nir_lower_io handles it just fine and this
    behavior is needed for shared variables in Vulkan.  This commit makes
    glsl_get_array_element do something sensible for vector types and makes
    nir_validate happy with them.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ad77775809555bc215f468424215e8dddc7083bf
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 16:40:28 2017 -0700

    nir: Validate base types on array dereferences
    
    We were already validating that the parent type goes along with the
    child type but we weren't actually validating that the parent type is
    reasonable.  This fixes that.
    
    Acked-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ab9220edd69fcb7016e15d4d96186eac524b45a4
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 18:57:56 2017 -0700

    nir,intel/compiler: Use a fixed subgroup size
    
    The GL_ARB_shader_ballot spec says that gl_SubGroupSizeARB is declared
    as a uniform.  This means that it cannot change across an invocation
    such as a draw call or a compute dispatch.  For compute shaders, we're
    ok because we only ever use one dispatch size.  For fragment, however,
    the hardware dynamically chooses between SIMD8 and SIMD16 which violates
    the spec.  Instead, let's just pick a subgroup size based on the shader
    stage.  The fixed size we choose for compute shaders is a bit higher
    than strictly needed but there's no real harm in that.  The advantage is
    that, if they do anything interesting with the value, NIR will see it as
    an immediate and can optimize better.
    
    Acked-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a026458020e947cc5d864cfb5b19660836b2d613
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 18:44:51 2017 -0700

    nir/lower_subgroups: Lower ballot intrinsics to the specified bit size
    
    Ballot intrinsics return a bitfield with one bit per invocation in the
    subgroup.  In GLSL and some SPIR-V extensions, they return a uint64_t.
    In SPV_KHR_shader_ballot,
    they return a uvec4.  Also, some back-ends would rather pass around
    32-bit values because it's easier than messing with 64-bit all the time.
    To solve this mess, we make nir_lower_subgroups take a new parameter
    called ballot_bit_size and it lowers whichever thing it gets in from the
    source language (uint64_t or uvec4) to a scalar with the specified
    number of bits.  This replaces a chunk of the old lowering code.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
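
As a rough illustration of the idea only (this is plain C, not the NIR pass), the sketch below reduces a ballot that arrives either as a 64-bit scalar or as a four-component vector to a scalar of a requested width. The function names are hypothetical, and the assumption that component 0 of the vector holds invocations 0..31 and component 1 holds 32..63 is mine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low bit_size bits (32 or 64). */
static uint64_t truncate_ballot(uint64_t value, unsigned bit_size)
{
    assert(bit_size == 32 || bit_size == 64);
    return bit_size == 64 ? value : (value & 0xffffffffull);
}

/* GLSL-style ballot: the source value is already a 64-bit scalar. */
static uint64_t ballot_from_uint64(uint64_t ballot, unsigned bit_size)
{
    return truncate_ballot(ballot, bit_size);
}

/* uvec4-style ballot: assumed layout is v[0] = invocations 0..31,
 * v[1] = invocations 32..63; v[2] and v[3] are ignored here. */
static uint64_t ballot_from_uvec4(const uint32_t v[4], unsigned bit_size)
{
    uint64_t packed = (uint64_t)v[0] | ((uint64_t)v[1] << 32);
    return truncate_ballot(packed, bit_size);
}
```

A back-end that prefers 32-bit values would ask for bit_size 32 and get only the low word, whichever shape the source language used.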

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8c2bf020fd649957597d074cf2390d6de029ddd0
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 31 14:42:33 2017 -0700

    nir/builder: Add a nir_imm_intN_t helper
    
    This lets you easily build integer immediates of arbitrary bit size.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=9b35faba426acc111741aa69752cb87185e548aa
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 14:09:37 2017 -0700

    nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
    
    The SUBGROUP_*_MASK system values are uint64_t when coming in from GLSL
    but uvec4 when coming in from SPIR-V.  Lowering based on type allows us
    to nicely handle both.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3ee91ee6ac739f7ad4d5d4b066073efbeb511b41
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 19:58:59 2017 -0700

    nir: Make ballot intrinsics variable-size
    
    This way they can return either a uvec4 or a uint64_t.  At the moment,
    this is a no-op since we still always return a uint64_t.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ad127afcfd5e7d6fb98e7cf2ae02333249d31fb2
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 14:08:32 2017 -0700

    nir: Add a ssa_dest_init_for_type helper
    
    This would be useful in a number of places.
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=28da82f9783091cb3e79586962f98a5bc165cff8
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Aug 22 13:23:59 2017 -0700

    nir: Add a new subgroups lowering pass
    
    This commit pulls nir_lower_read_invocations_to_scalar along with most
    of the guts of nir_opt_intrinsics (which mostly does subgroup lowering)
    into a new nir_lower_subgroups pass.  There are various other bits of
    subgroup lowering that we're going to want to do so it makes a bit more
    sense to keep it all together in one pass.  We also move it in i965 to
    happen after nir_lower_system_values because we want to handle the
    subgroup mask system value intrinsics here.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1ca3a9442760b6f7ffcc624bdc527fc7dbc70825
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Aug 31 09:53:02 2017 -0700

    intel/fs: Don't use automatic exec size inference
    
    The automatic exec size inference can accidentally mess things up if
    we're not careful.  For instance, if we have
    
    add(4)    g38.2<4>D    g38.1<8,2,4>D    g38.2<8,2,4>D
    
    then the destination register will end up having a width of 2 with a
    horizontal stride of 4 and a vertical stride of 8.  The EU emit code
    sees the width of 2 and decides that we really wanted an exec size of 2
    which doesn't do what we wanted.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
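
The failure mode described above can be modelled in a few lines of C. This is a toy model, not the EU emit code: the region struct and both function names are hypothetical, and "inference" is reduced to reading the destination width.

```c
#include <assert.h>

/* Hypothetical model of a register region <vstride;width,hstride>. */
struct region {
    unsigned vstride;
    unsigned width;
    unsigned hstride;
};

/* Toy automatic inference: guess the exec size from the destination
 * width alone, which is the behaviour the commit message describes. */
static unsigned inferred_exec_size(struct region dst)
{
    assert(dst.width > 0);
    return dst.width;
}

/* Explicit variant: the caller states the exec size and the shape of
 * the destination region has no influence on it. */
static unsigned explicit_exec_size(struct region dst, unsigned requested)
{
    (void)dst;
    return requested;
}
```

For the add(4) example, a destination modelled as <8;2,4> yields an inferred size of 2, while the explicit variant keeps the intended 4.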

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=dc4cf11dfc6c0fa7a3e086f16afba0a369fe320a
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Aug 30 12:07:00 2017 -0700

    intel/fs: Explicitly set EXECUTE_1 where needed
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ab378734f5016d00875fc87ec1cfa96d6eccf879
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Aug 30 13:36:58 2017 -0700

    intel/eu: Explicitly set EXECUTE_1 where needed
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=8280560705d95dbbb059c20adddfd220d7efe593
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Aug 31 09:41:22 2017 -0700

    intel/eu: Make automatic exec sizes a configurable option
    
    We have had a feature in codegen for some time that tries to
    automatically infer the execution size of an instruction from the width
    of its destination.  For things such as fixed function GS, clipper, and
    SF programs, this is very useful because they tend to have lots of
    hand-rolled register setup and trying to specify the exec size all the
    time would be prohibitive.  For things that come from a higher-level IR,
    however, it's easier to just set the right size all the time and the
    automatic exec sizes can, in fact, cause problems.  This commit makes it
    optional while enabling it by default.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=7a82ad54bb56cafaeea7f909cd9fc35542c23ba0
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 09:59:34 2017 -0700

    intel/fs: Rework zero-length URB write handling
    
    Originally we tried to handle this case based on slots_valid.  However,
    there are a number of ways that this can go wrong.  For one, we throw
    away any trailing slots which either aren't written or are set to
    VARYING_SLOT_PAD.  Second, even if PSIZ is a valid slot, we may not
    actually write anything there.  Between the lot of these, it was
    possible to end up in a case where we tried to do a regular URB write
    but ended up with a length of 1 which is invalid.  This commit moves it
    to the end and makes it based on a new boolean flag urb_written.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6132992cdb858268af0e985727d80e4140be389c
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Aug 31 21:56:43 2017 -0700

    intel/compiler/fs: Set up subgroup invocation as a system value
    
    Subgroup invocation is computed using a vector immediate and some
    dispatch-aware arithmetic.  Unfortunately, due to the vector arithmetic,
    and the fact that it's frequently read 16-wide, it's not something that
    can easily be CSEd by the back-end compiler.  There are a few different
    possible approaches to this problem:
    
     1) Emit the code to calculate the subgroup invocation on-the-fly and
        trust NIR to do the CSE.  This is what we were doing.
    
     2) Add a back-end instruction for the subgroup ID.  This has the
        advantage of helping the back-end compiler with CSE but has the
        downside of very poor scheduling for the calculation because it has
        to be emitted in the back-end.
    
     3) Emit the calculation at the top of the program and re-use the
        result.  This gets rid of the CSE problem but comes at the cost of
        an extra live register.
    
    This commit switches us from 1) to 3).  We choose to store the subgroup
    invocation values as a W type to reduce the impact of the extra live
    register.  Trusting NIR and using 1) was fine but we're soon going to
    want to use the subgroup invocation value for other things in the
    back-end compiler and this makes it much easier to do without having to
    worry about CSE problems.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
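
A minimal sketch of the arithmetic in plain C, under two assumptions of mine: the vector immediate is a packed constant with one 4-bit value per lane (0x76543210), and each further group of eight lanes adds its base offset. The function name is hypothetical; uint16_t stands in for the W type mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Fill out[0 .. dispatch_width-1] with the per-lane invocation index.
 * Each group of 8 lanes unpacks the nibbles of the assumed packed
 * constant and adds the group's base lane number. */
static void subgroup_invocations(unsigned dispatch_width, uint16_t *out)
{
    const uint32_t packed = 0x76543210u; /* assumed vector immediate */

    assert(dispatch_width > 0 && dispatch_width % 8 == 0);
    for (unsigned base = 0; base < dispatch_width; base += 8) {
        for (unsigned lane = 0; lane < 8; lane++) {
            unsigned nibble = (packed >> (4 * lane)) & 0xfu;
            out[base + lane] = (uint16_t)(nibble + base);
        }
    }
}
```

Computing this once at the top of the program, as in option 3), corresponds to calling the function a single time and reusing the array.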

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=295605c930270a5b90f847b79474507d8b0c9e9c
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Aug 24 11:40:31 2017 -0700

    intel/cs: Push subgroup ID instead of base thread ID
    
    We're going to want subgroup ID for SPIR-V subgroups eventually anyway.
    We really only want to push one and calculate the other from it.  It
    makes a bit more sense to push the subgroup ID because it's simpler to
    calculate and because it's a real API thing.  The only advantage to
    pushing the base thread ID is to avoid a single SHL in the shader.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
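
The "single SHL" can be sketched as follows, assuming the base thread ID is the subgroup ID multiplied by the dispatch width and that the width is 8, 16 or 32. The function name is hypothetical.

```c
#include <assert.h>

/* Derive the base thread ID from the pushed subgroup ID with one
 * left shift by log2(dispatch_width). */
static unsigned base_thread_id(unsigned subgroup_id, unsigned dispatch_width)
{
    unsigned shift;

    assert(dispatch_width == 8 || dispatch_width == 16 ||
           dispatch_width == 32);
    switch (dispatch_width) {
    case 8:  shift = 3; break;
    case 16: shift = 4; break;
    default: shift = 5; break;
    }
    return subgroup_id << shift;
}
```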

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6411defdcd6f560e74eaaaf3266f9efbb6dd81da
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 21:27:19 2017 -0700

    intel/cs: Re-run final NIR optimizations for each SIMD size
    
    With the advent of SPIR-V subgroup operations, compute shaders will have
    to be slightly different depending on the SIMD size at which they
    execute.  In order to allow us to do dispatch-width specific things in
    NIR, we re-run the final NIR stages for each SIMD width.
    
    One side-effect of this change is that we start rallocing fs_visitors
    which means we need DECLARE_RALLOC_CXX_OPERATORS.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=4e79a77cdc65af621f77c685b01cd18ace187965
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Nov 6 17:01:56 2017 -0800

    intel/compiler: Move the destructor from vec4_visitor to backend_shader
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=16ada419d7c13bc96e299d3b17d756ec1af6f38a
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Nov 6 16:29:42 2017 -0800

    i965/fs: Get rid of the early return in brw_compile_cs
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=80ddfab2f54d7cd9dd4b93d2fbfa239f061a1f2b
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 29 17:57:32 2017 -0700

    intel/cs: Rework the way thread local ID is handled
    
    Previously, brw_nir_lower_intrinsics added the param and then emitted a
    load_uniform intrinsic to load it directly.  This commit switches things
    over to use a specific NIR intrinsic for the thread id.  The one thing I
    don't like about this approach is that we have to copy thread_local_id
    over to the new visitor in import_uniforms.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=25f7453c9e6dc7c947b936bdac86680c332362bf
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Oct 2 20:25:11 2017 -0700

    intel/fs: Mark 64-bit values as being contiguous
    
    This isn't often a problem.  However, when we're in a compute shader,
    we must push the thread local ID, so we decrement the amount of
    available push space by 1.  The push space is then no longer even and
    64-bit data can, in theory, span the boundary.  By marking those
    uniforms contiguous, we ensure that they never get split in half
    between push and pull constants.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org
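
A simplified model of the constraint, not the code in assign_constant_locations: values are described only by their size in dwords (1 or 2), and a value that does not fit entirely in the remaining push budget goes to pull constants as a whole. The function name and the flat array representation are hypothetical.

```c
#include <assert.h>

/* Return how many leading values fit entirely in push space.  A value
 * whose dwords would straddle the end of the budget is not split; it
 * and everything after it are left for pull constants. */
static unsigned count_pushed_values(const unsigned *dwords, unsigned count,
                                    unsigned push_budget)
{
    unsigned used = 0;
    unsigned pushed = 0;

    for (unsigned i = 0; i < count; i++) {
        assert(dwords[i] == 1 || dwords[i] == 2);
        if (used + dwords[i] > push_budget)
            break;
        used += dwords[i];
        pushed++;
    }
    return pushed;
}
```

With three 64-bit values and an odd budget of 5 dwords, only two values are pushed and the fifth dword stays unused, rather than holding half of the third value.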

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c4c8cba7059bdfa6f9e3924d2b5d9e2633147e58
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 21:27:42 2017 -0700

    intel/cs: Ignore runtime_check_aads_emit for CS
    
    It's only set on gen4-5 which clearly don't support compute shaders.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d4de813d86c38121649f11306080ebda193236a0
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 20:00:30 2017 -0700

    intel/cs: Stop setting dispatch_grf_start_reg
    
    Nothing ever reads it for compute shaders because it's always 1.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b1a9cdede4b500560ba6b44761296f09b4a7558f
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 19:30:24 2017 -0700

    intel/cs: Drop max_dispatch_width checks from compile_cs
    
    The only things that adjust fs_visitor::max_dispatch_width are render
    target writes which don't happen in compute shaders so they're
    pointless.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1077981eb56f63b595c3bd74ab8af2e11af2a8eb
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 19:16:45 2017 -0700

    intel/fs: Remove min_dispatch_width from fs_visitor
    
    It's 8 for everything except compute shaders.  For compute shaders,
    there's no need to duplicate the computation and it's just a possible
    source of error.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b299ded02eccfa94aede65086bd1ad254aaa5180
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Aug 21 18:42:41 2017 -0700

    intel/fs: use pull constant locations to check for first compile of a shader
    
    Before, we were bailing in assign_constant_locations based on the minimum
    dispatch size.  The more direct thing to do is simply to check for
    whether or not we have constant locations and bail if we do.  For
    nir_setup_uniforms, it's completely safe to do it multiple times because
    we just copy a value from the NIR shader.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=103081c9a9912a11b47077b8e25efdbbb3d65e10
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 22:37:42 2017 -0700

    intel/fs: Retype dest to match value in read[First]Invocation
    
    This is what we really wanted all along.  Always retyping to D works
    because that's what get_nir_src() always gives us, at least for 32-bit
    types.  The SPIR-V variants of these operations accept arbitrary types
    and we need this if we're going to handle 64 or 16-bit values.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ebaee9da4adaad10e1c46bdd2f5521175ea04044
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 22:35:43 2017 -0700

    intel/fs: Uniformize the index in readInvocation
    
    The index is any value provided by the shader and this can be called in
    non-uniform control flow so we can't just take component 0.  Found by
    inspection.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=b67230de635528bfc6d5e66b90f7406eb97eb1c0
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 22:30:53 2017 -0700

    intel/fs: Protect opt_algebraic from OOB BROADCAST indices
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa4ff4b98c93b6870fd3f4ae9dae8aae350b0476
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Aug 23 17:10:33 2017 -0700

    i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ec8c6649f17ab9790eaddf56a043abb3316aad48
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Sun Aug 27 21:48:03 2017 -0700

    i965/fs/nir: Minor refactor of store_output
    
    Stop retyping the output of shuffle_64bit_data_for_32bit_write.  It's
    always BRW_REGISTER_TYPE_D which is perfectly fine for writing out.
    Also, when we change get_nir_src to return something with a 64-bit type
    for 64-bit values, the retyping will not be at all what we want.  Also,
    retyping the output based on src.type before we whack it back to 32 bits
    is a problem because the output is always 32 bits.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=030d2b5016360caf44ebfa3f6951a6d676316a89
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Sat Aug 26 10:00:14 2017 -0700

    i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
    
    All callers of this function allocate a fs_reg expressly to pass into
    it.  It's much easier if we just let the helper allocate the register.
    While we're here, we switch it to doing the MOVs with an integer type so
    that we don't accidentally canonicalize floats on half of a double.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6197a6b7ac6ff03e87a939311329fa0cb4af7f4c
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Sat Aug 26 11:26:40 2017 -0700

    i965/fs/nir: Simplify 64-bit store_output
    
    The swizzles weren't doing any good because swiz is just XYZW.  Also, we
    were emitting an extra set of MOVs because shuffle_64bit_data_for_32bit
    already does a MOV for us.  Finally, the temporary was only ever used
    inside the inner loop so there's no need for it to actually be an array.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=18fde36ced4279f2577097a1a7d31b55f2f5f141
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 18:59:26 2017 -0700

    intel/fs: Use the original destination region for int MUL lowering
    
    Some hardware (CHV, BXT) has special restrictions on register regions
    when doing integer multiplication.  We want to respect those when we
    lower to DxW multiplication.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=d54f8ec744545673fd78f15ffce3cb4e47d4b5f1
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 18:56:29 2017 -0700

    intel/fs: Fix integer multiplication lowering for src/dst hazards
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fd1bcccc2de9ba6a1ad6171342a155091963c3b9
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 14:45:43 2017 -0700

    intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
    
    The same workaround we need for 64-bit values on little core also takes
    care of the Ivy Bridge problem and does so a bit more efficiently so we
    can drop that code while we're here.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6041a31e77680597614776e59edb12709ec2e019
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 14:45:12 2017 -0700

    intel/eu: Fix broadcast instruction for 64-bit values on little-core
    
    We're not using broadcast for any 64-bit types right now since we mostly
    use it for emit_uniformize on 32-bit buffer indices.  However, SPIR-V
    subgroups are going to need it for 64-bit so let's make it work.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=10e4feed39120072f38274b95e884422f72f360f
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 19:50:36 2017 -0700

    intel/eu/reg: Add a subscript() helper
    
    This is similar to the identically named fs_reg helper.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=068beb41d831919cb0fb82d01daaee1c57679803
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 14:16:31 2017 -0700

    intel/eu: Just modify the offset in brw_broadcast
    
    This means we have to drop const from a variable but it also means that
    100% of the code which deals with the offset limit is in one place.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e3bcc8613311dede33612b185b7e6e374c09570c
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Oct 17 11:57:48 2017 -0700

    intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
    
    These restrictions effectively already existed due to the way we use
    indirect sources but weren't being directly enforced.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1b8ef49f48ae3634e4903422a9d9c11864c03cb1
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Oct 12 16:17:03 2017 -0700

    intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
    
    For some reason, the any/all predicates don't work properly with SIMD32.
    In particular, it appears that a SEL with a QtrCtrl of 2H doesn't read
    the correct subset of the flag register and you end up getting garbage
    in the second half.  Work around this by using a pair of 1-wide MOVs and
    scattering the result.  This fixes the any/all instructions for SIMD32.
    
    Reviewed-by: Matt Turner <mattst88 at gmail.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=1f416630079f38110910ba796f70e2b81e9ddbf4
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Sep 6 20:32:30 2017 -0700

    intel/fs: Use an explicit D type for vote any/all/eq intrinsics
    
    The any/all intrinsics return a boolean value so D or UD is the correct
    type.  Unfortunately, get_nir_dest has the annoying behavior of
    returning a float type by default.  This causes format conversion which
    gives us -1.0f or 0.0f in the register.  If the consumer of the result
    does an integer comparison to zero, it will give you the right boolean
    value but if we do something more clever based on the 0/~0 assumption
    for booleans, this will give the wrong value.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org
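
The difference between the two register contents can be seen with a short C sketch. It assumes a 32-bit IEEE 754 float and models the format conversion as an ordinary int-to-float cast; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bits left behind when the integer boolean is converted to float
 * on the way into a float-typed destination. */
static uint32_t bits_after_float_conversion(int32_t value)
{
    float f = (float)value;
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Raw bits when the destination has an integer type: no conversion. */
static uint32_t bits_with_integer_type(int32_t value)
{
    return (uint32_t)value;
}
```

A true value of -1 stays 0xffffffff with an integer type but becomes 0xbf800000 (-1.0f) after conversion, which still compares unequal to zero but is no longer ~0.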

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6c00240bc650805e0b66aa6e17dbe69bbe41e446
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Sep 6 18:37:34 2017 -0700

    intel/fs: Don't stomp f0.1 in SIMD16 ballot
    
    In fragment shaders f0.1 is used for discards so doing ballot after a
    discard can potentially cause the discard to not happen.  However, we
    don't support SIMD32 fragment shaders yet so this isn't a problem.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=def013a863558a1f4735d82ef3dfa0f8261fa743
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Fri Sep 1 23:24:15 2017 -0700

    intel/fs: Use ANY/ALL32 predicates in SIMD32
    
    We have ANY/ALL32 predicates and, for the most part, they work just
    fine.  (See the next commit for more details.)  Also, due to the way
    that flag registers are handled in hardware, instruction splitting is
    able to split the CMP correctly.  Specifically, the hardware looks at
    the execution group and knows to shift its flag usage up correctly so a
    2H instruction will write to f0.1 instead of f0.0.
    
    Reviewed-by: Matt Turner <mattst88 at gmail.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0d905597fe2997c89022c76cdf84dc4fba5eb055
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Sep 6 18:31:11 2017 -0700

    intel/fs: Be more explicit about our placement of [un]zip
    
    Before, we were careful to place the zip after the last of the split
    instructions but did unzip on-demand.  This changes things so that the
    unzips go before all of the split instructions and the zip comes
    explicitly after all the split instructions.  As a side-effect of this
    change, we now emit the split instruction from highest SIMD group to
    lowest instead of low to high.  We could have kept the old behavior, but
    it shouldn't matter and this made the code easier.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=fcd4adb9d08094520fb8d118d3448b04c6ec1fd1
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Wed Sep 6 18:24:17 2017 -0700

    intel/fs: Pass builders instead of blocks into emit_[un]zip
    
    This makes it far more explicit where we're inserting the instructions
    rather than the magic "before and after" stuff that the emit_[un]zip
    helpers did based on block and inst.
    
    Reviewed-by: Iago Toral Quiroga <itoral at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=e8c9e65185de3e821e1e482e77906d1d51efa3ec
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Thu Nov 2 14:52:49 2017 -0700

    intel/fs: Use a pure vertical stride for large register strides
    
    Register strides higher than 4 are uncommon but they can happen.  For
    instance, if you have a 64-bit extract_u8 operation, we turn that into
    UB -> UQ MOV with a source stride of 8.  Our previous calculation would
    try to generate a stride of <32;8,8>:ub which is invalid because the
    maximum horizontal stride is 4.  To solve this problem, we instead use a
    stride of <8;1,0>.  As noted in the comment, this does not work as a
    destination but that's ok as very few things actually generate that
    stride.
    
    Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
    Cc: mesa-stable at lists.freedesktop.org
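
A hedged sketch of the region choice in plain C. The struct and function are hypothetical and only model the rule stated above: strides up to 4 use the usual <width*stride;width,stride> form, larger strides fall back to a pure vertical stride. Source operands only, with stride of at least 1 assumed.

```c
#include <assert.h>

/* Hypothetical model of a register region <vstride;width,hstride>. */
struct region {
    unsigned vstride;
    unsigned width;
    unsigned hstride;
};

/* Pick a source region for a given element stride and width. */
static struct region source_region(unsigned stride, unsigned width)
{
    struct region r;

    assert(stride >= 1 && width >= 1);
    if (stride > 4) {
        /* Horizontal stride cannot exceed 4: step vertically instead. */
        r.vstride = stride;
        r.width = 1;
        r.hstride = 0;
    } else {
        r.vstride = width * stride;
        r.width = width;
        r.hstride = stride;
    }
    return r;
}
```

For a source stride of 8 this produces <8;1,0>, while a stride of 2 at width 8 still gives <16;8,2>.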