Mesa (bug-109980): 23 new commits

Wed Mar 13 18:13:37 UTC 2019

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6002bb601a683febd718e8fae7d973f3687ae233
Author: Plamena Manolova <plamena.manolova at intel.com>
Date:   Tue Mar 12 21:25:36 2019 +0200

    i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9
    
    ARB_fragment_shader_interlock depends on memory fences to
    ensure fragment ordering and this ordering guarantee is
    only supported from GEN9 onwards.
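
A minimal sketch of the kind of generation check this implies. The `device_info` struct and the function name are hypothetical stand-ins for illustration, not the driver's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's device description; only the
 * hardware generation matters for this sketch. */
struct device_info {
    int gen;
};

/* Returns true when the extension may be advertised: the fragment
 * ordering guarantee described above only exists from GEN9 onwards. */
static bool
supports_fragment_shader_interlock(const struct device_info *devinfo)
{
    assert(devinfo != NULL);
    return devinfo->gen >= 9;
}
```

With this check, a GEN8 device reports the extension as unsupported, while GEN9 and later report it as supported.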
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980
    Fixes: 939312702e35 "i965: Add ARB_fragment_shader_interlock support."
    Signed-off-by: Plamena Manolova <plamena.n.manolova at gmail.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=97ad0efba08d336813366b9cab114c94c2ca61db
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Feb 22 20:53:41 2019 +0000

    iris: Use streaming loads to read from tiled surfaces
    
    Always use the streaming load (since we know we have Broadwell+, all of
    our target CPUs support SSE4.1) for reading back from the tiled surface
    for mapping the resource. This means we hit the fast WC handling paths
    on Atoms (without LLC), and for big Core (with LLC) using the streaming
    load is no less efficient as we do not require the tiled buffer to be
    pulled into the CPU cache.
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=797fb6c6ac96cb7d1d5f9a04dc4f22f350093a16
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Fri Feb 22 21:24:46 2019 +0000

    iris: Use coherent allocation for PIPE_RESOURCE_STAGING
    
    On !llc machines (Atoms), reading from a linear buffer is slow and so
    copying from one resource into the linear staging buffer is still slow.
    However, we can tell the GPU to snoop the CPU cache when reading from and
    writing to the staging buffer eliminating the slow uncached reads.
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=01b224047b0013380a5e8b709eaf2e3cd9976b39
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date:   Mon Feb 25 09:42:49 2019 +0000

    iris: Use PIPE_BUFFER_STAGING for the query objects
    
    We prefer fast CPU access to read back the query results.
    
    Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=65e8761474ca8c9c0cce167cb32b720c3cc25a90
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date:   Mon Mar 11 09:43:04 2019 -0700

    intel/nir: Combine store_derefs to improve code from SPIR-V
    
    Due to lack of write mask in SPIR-V store, generators may produce
    multiple stores to the same vector but using different array derefs.
    Use the combining store pass to clean this up.  For example,
    
        layout(binding = 3) buffer block {
            vec4 v;
        };
    
        void main() {
            v.x = 11;
            v.y = 22;
        }
    
    after going to SPIR-V and NIR, ends up with two store_derefs to
    v[0] and v[1]
    
        vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */
        vec2 32 ssa_6 = deref_array &(*ssa_4)[0] (ssbo float) /* &((block *)ssa_2)->field0[0] */
        intrinsic store_deref (ssa_6, ssa_7) (1, 0) /* wrmask=x */ /* access=0 */
        vec1 32 ssa_13 = load_const (0x00000001 /* 0.000000 */)
        vec2 32 ssa_14 = deref_array &(*ssa_4)[1] (ssbo float) /* &((block *)ssa_2)->field0[1] */
        intrinsic store_deref (ssa_14, ssa_15) (1, 0) /* wrmask=x */ /* access=0 */
    
    producing two different sends instructions on skl.  The combining pass
    transforms the snippet above into
    
        vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */
        vec4 32 ssa_18 = vec4 ssa_7, ssa_15, ssa_16, ssa_17
        intrinsic store_deref (ssa_4, ssa_18) (3, 0) /* wrmask=xy */ /* access=0 */
    
    producing a single sends instruction.
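
The idea behind the combination can be sketched in a few lines of C. This is an illustrative model only, not the NIR pass itself: the `vec_store` struct and `combine_scalar_store` are hypothetical names. Each scalar store is folded into one pending vector store whose write mask has one bit per component.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a pending vector store: up to four components
 * plus a write mask with one bit per component (bit 0 = x, bit 1 = y,
 * bit 2 = z, bit 3 = w). */
struct vec_store {
    uint32_t value[4];
    unsigned wrmask;
};

/* Fold a scalar store to element `component` into the pending vector
 * store. A later store to the same component replaces the earlier one. */
static void
combine_scalar_store(struct vec_store *pending, unsigned component,
                     uint32_t value)
{
    assert(component < 4);
    pending->value[component] = value;
    pending->wrmask |= 1u << component;
}
```

Folding the stores to `v[0]` and `v[1]` from the example gives a single store with write mask 3, which matches the `wrmask=xy` shown above.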
    
    v2: Move this from spirv_to_nir into the general optimization pass for
        intel compiler.  (Jason)
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=10dfb0011e7079e770184d252045c13c40e6b274
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date:   Fri Mar 8 11:50:47 2019 -0800

    intel/nir: Combine store_derefs after vectorizing IO
    
    Shader-db results for skl:
    
        total instructions in shared programs: 15232903 -> 15224781 (-0.05%)
        instructions in affected programs: 61246 -> 53124 (-13.26%)
        helped: 221
        HURT: 0
    
        total cycles in shared programs: 371440470 -> 371398018 (-0.01%)
        cycles in affected programs: 281363 -> 238911 (-15.09%)
        helped: 221
        HURT: 0
    
    Results for bdw are very similar.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=822a8865e4645ed7e1818568d1d0338b462c7748
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date:   Fri Mar 8 10:08:20 2019 -0800

    nir: Add a pass to combine store_derefs to same vector
    
    v2: (all from Jason)
        Reuse existing function for the end of the block combinations.
        Check the SSA values are coming from the right place in tests.
        Document the case when the store to array_deref is reused.
    
    Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=cbf022cb316f1224f9afcc12ca414fc2d7d778a8
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Wed Mar 13 14:04:14 2019 +0100

    ac: use the raw tbuffer version for 16-bit SSBO loads
    
    vindex is always 0.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=045fae0f734a39cd24e444ac05382545dc7fdd2e
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Wed Mar 13 14:04:13 2019 +0100

    ac: add ac_build_{struct,raw}_tbuffer_load() helpers
    
    The struct version sets IDXEN=1, while the raw version sets IDXEN=0.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=a66b186bebf9b63897199b9b6e26d40977417f74
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Tue Feb 26 13:42:28 2019 +0100

    radv: use typed buffer loads for vertex input fetches
    
    This drastically reduces the number of SGPRs because the driver
    now uses descriptors per vertex binding, instead of per vertex
    attribute format.
    
    29077 shaders in 15096 tests
    Totals:
    SGPRS: 1354285 -> 1282109 (-5.33 %)
    VGPRS: 909896 -> 908800 (-0.12 %)
    Spilled SGPRs: 24840 -> 24811 (-0.12 %)
    Code Size: 49221144 -> 48986628 (-0.48 %) bytes
    Max Waves: 243930 -> 244229 (0.12 %)
    
    Totals from affected shaders:
    SGPRS: 390648 -> 318472 (-18.48 %)
    VGPRS: 288432 -> 287336 (-0.38 %)
    Spilled SGPRs: 94 -> 65 (-30.85 %)
    Code Size: 11548412 -> 11313896 (-2.03 %) bytes
    Max Waves: 86460 -> 86759 (0.35 %)
    
    This gives a really tiny boost.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0b9a06a1a0e4f4b7130a5c372d13b586a8d66878
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Tue Feb 26 13:42:27 2019 +0100

    radv: store more vertex attribute infos as pipeline keys
    
    They are required for using typed buffer loads.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=489dac0d21baf069cf0045e785330eb1b16094a4
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Tue Feb 26 13:42:26 2019 +0100

    ac: rework typed buffers loads for LLVM 7
    
    Be more generic, this will be used by an upcoming series.
    
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=56e04f67f906aea6101ba6081c5b0efcc25999cc
Author: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Date:   Mon Mar 11 13:35:27 2019 +0100

    panfrost: Set bo->gem_handle when creating a linear BO
    
    So we can free it later.
    
    Signed-off-by: Tomeu Vizoso <tomeu.vizoso at collabora.com>
    Reviewed-by: Alyssa Rosenzweig <alyssa at rosenzweig.io>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=bfbad30543dd896459b09e0e05bc70ea1727e0b9
Author: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Date:   Mon Mar 11 13:34:53 2019 +0100

    panfrost: Set bo->size[0] in the DRM backend
    
    So we can unmap it later.
    
    Signed-off-by: Tomeu Vizoso <tomeu.vizoso at collabora.com>
    Reviewed-by: Alyssa Rosenzweig <alyssa at rosenzweig.io>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=3570d15b6d88bdcd353b31ffe5460d04a88b7b6f
Author: Kenneth Graunke <kenneth at whitecape.org>
Date:   Mon Mar 11 19:00:21 2019 -0700

    intel/fs: Fix opt_peephole_csel to not throw away saturates.
    
    We were not copying the saturate bit from the original instruction
    to the new replacement instruction.  This caused major misrendering
    in DiRT Rally on iris, where comparisons leading to discards failed
    due to the missing saturate, causing lots of extra garbage pixels to
    be drawn in text rendering, trees, and so on.
    
    This did not show up on i965 because st/nir performs a more aggressive
    version of nir_opt_peephole_select, yielding more b32csel operations.
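
As a rough illustration of this class of bug, here is a sketch with a hypothetical, heavily simplified instruction struct (not the real `fs_inst`). When a peephole pass builds a replacement instruction, any modifier on the original has to be carried over explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified instruction representation. */
enum opcode { OP_SEL, OP_CSEL };

struct inst {
    enum opcode op;
    bool saturate;   /* clamp the result to [0, 1] */
};

/* Build the CSEL that replaces a SEL. The replacement starts out with
 * default modifiers, so the saturate bit must be copied explicitly;
 * omitting that copy is the kind of mistake described above. */
static struct inst
replace_sel_with_csel(const struct inst *sel)
{
    assert(sel->op == OP_SEL);
    struct inst csel = { .op = OP_CSEL, .saturate = false };
    csel.saturate = sel->saturate;
    return csel;
}
```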
    
    Fixes: 52c7df1643e i965/fs: Merge CMP and SEL into CSEL on Gen8+
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=bd17bdc56b34a08c421172df27fe07294c7a7024
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Mar 11 20:43:15 2019 -0500

    glsl/lower_vector_derefs: Don't use a temporary for TCS outputs
    
    Tessellation control shader outputs act as if they have memory backing
    them and you can have multiple writes to different components of the
    same vector in-flight at the same time.  When this happens, the load vec
    store pattern that gets used by ir_triop_vector_insert doesn't yield the
    correct results.  Instead, just emit a sequence of conditional
    assignments.
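
The hazard can be modelled in plain C. This is a sequential simulation under stated assumptions, not GLSL IR: the `vec4` type and all function names are hypothetical, and the two "invocations" are interleaved by hand so that both loads happen before either store.

```c
#include <assert.h>

typedef struct { float c[4]; } vec4;

/* Load-vec-store pattern: copy the whole vector into a temporary,
 * insert one component, and return the temporary for a full store. */
static vec4
load_and_insert(const vec4 *mem, unsigned index, float value)
{
    assert(index < 4);
    vec4 tmp = *mem;
    tmp.c[index] = value;
    return tmp;
}

/* Conditional-assignment pattern: only the selected component of the
 * backing memory is written. */
static void
store_component(vec4 *mem, unsigned index, float value)
{
    assert(index < 4);
    for (unsigned i = 0; i < 4; i++) {
        if (index == i)
            mem->c[i] = value;
    }
}

/* Two writers using load-vec-store, with both loads before either store. */
static vec4
interleaved_load_vec_store(void)
{
    vec4 mem = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    vec4 a = load_and_insert(&mem, 0, 1.0f);
    vec4 b = load_and_insert(&mem, 1, 2.0f);
    mem = a;
    mem = b;   /* overwrites x with the stale value loaded earlier */
    return mem;
}

/* The same two writes as per-component conditional assignments. */
static vec4
interleaved_component_stores(void)
{
    vec4 mem = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    store_component(&mem, 0, 1.0f);
    store_component(&mem, 1, 2.0f);
    return mem;
}
```

In the first variant the write to `x` is lost because the second full-vector store carries a stale copy; in the second variant both components survive.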
    
    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
    Cc: mesa-stable at lists.freedesktop.org

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=20c4578c5539de909e94a6acc3ad680ab2ddeca6
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Mon Mar 11 21:01:34 2019 -0500

    glsl/list: Add a list variant of insert_after
    
    Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
    Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=83fdefc06287f6c8bbb3bb5bb4ccd36d653017a3
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Mar 12 16:25:39 2019 -0500

    nir/loop_unroll: Fix out-of-bounds access handling
    
    The previous code was completely broken when it came to constructing the
    undef values.  I'm not sure how it ever worked.  For the case of a copy
    that reads an undefined value, we can just delete the copy because the
    destination is a valid undefined value.  This saves us the effort of
    trying to construct a value for an arbitrary copy_deref intrinsic.
    
    Fixes: e8a8937a04 "nir: add partial loop unrolling support"
    Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=c056609c439da964db8344a8fde66aec4bd9c877
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date:   Tue Mar 12 18:18:58 2019 -0500

    anv: Ignore VkRenderPassInputAttachmentAspectCreateInfo
    
    We don't care about the information but there's no sense in throwing a
    debug warning about it.  It's harmless but annoying to users.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984
    Reviewed-by: Sagar Ghuge <sagar.ghuge at intel.com>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=486b181fd758c246c2d1eaa1975a858e84d64c32
Author: Eric Anholt <eric at anholt.net>
Date:   Tue Mar 12 14:59:21 2019 -0700

    v3d: Fix leak of the renderonly struct on screen destruction.
    
    This makes v3d match vc4's destroy path.
    
    Fixes: e113b21cb779 ("v3d: Add renderonly support.")

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=0c874c18cd07539f56fede272a24b76f2946716f
Author: Eric Anholt <eric at anholt.net>
Date:   Tue Mar 12 14:56:57 2019 -0700

    v3d: Fix leak of the mem_ctx after the DAG refactor.
    
    Noticed while trying to get a CTS run again.
    
    Fixes: 33886474d646 ("v3d: Use the DAG datastructure for QPU instruction scheduling.")

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=acfd88204e886e671da97b895fd2d1ee39b61256
Author: Grigori Goronzy <greg at chown.ath.cx>
Date:   Thu Aug 3 20:07:58 2017 +0200

    glx: add support for GLX_ARB_create_context_no_error (v3)
    
    v2: Only reject no-error contexts for too-old GL if we're actually
    trying to create a no-error context (Adam Jackson)
    v3: Fix share contexts (Adam Jackson)
    
    Reviewed-by: Adam Jackson <ajax at redhat.com>
    Reviewed-by: Eric Anholt <eric at anholt.net>

URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae77f1236862e73c1ac250898924c648d481bda4
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date:   Tue Mar 12 21:49:42 2019 +0100

    radv: set the maximum number of IBs per submit to 192
    
    This fixes random SteamVR corruption, see
    https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181
    
    Fixes: 4d30f2c6f42 ("radv/winsys: remove the max IBs per submit limit for the fallback path")
    Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>