Mesa (bug-109980): 23 new commits
GitLab Mirror
gitlab-mirror at kemper.freedesktop.org
Wed Mar 13 18:13:37 UTC 2019
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=6002bb601a683febd718e8fae7d973f3687ae233
Author: Plamena Manolova <plamena.manolova at intel.com>
Date: Tue Mar 12 21:25:36 2019 +0200
i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9
ARB_fragment_shader_interlock depends on memory fences to
ensure fragment ordering and this ordering guarantee is
only supported from GEN9 onwards.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980
Fixes: 939312702e35 "i965: Add ARB_fragment_shader_interlock support."
Signed-off-by: Plamena Manolova <plamena.n.manolova at gmail.com>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=97ad0efba08d336813366b9cab114c94c2ca61db
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date: Fri Feb 22 20:53:41 2019 +0000
iris: Use streaming loads to read from tiled surfaces
Always use streaming loads (since we know we have Broadwell+, all of
our target CPUs support SSE4.1) when reading back from the tiled surface
to map the resource. This means we hit the fast WC handling paths
on Atoms (without LLC), and on big Core (with LLC) the streaming
load is no less efficient, as we do not require the tiled buffer to be
pulled into the CPU cache.
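A minimal sketch of what a streaming-load copy can look like, assuming an x86 CPU with SSE4.1 and 16-byte-aligned buffers. The function name streaming_copy is hypothetical and this is not the iris code itself; it only illustrates the MOVNTDQA intrinsic (_mm_stream_load_si128) that the message refers to, with a plain memcpy fallback on other architectures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86_STREAMING_LOAD 1
#endif

/* Hypothetical helper: copy `size` bytes from `src` to `dst`.
 * Both pointers must be 16-byte aligned and `size` a multiple of 16. */
#ifdef HAVE_X86_STREAMING_LOAD
__attribute__((target("sse4.1")))
#endif
static void streaming_copy(void *dst, const void *src, size_t size)
{
    assert(((uintptr_t)dst & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);
    assert((size & 15) == 0);

#ifdef HAVE_X86_STREAMING_LOAD
    __m128i *d = (__m128i *)dst;
    /* The cast drops const because some headers declare the intrinsic
     * with a non-const pointer parameter. */
    __m128i *s = (__m128i *)src;

    for (size_t i = 0; i < size / 16; i++) {
        /* Non-temporal load: on write-combining memory this avoids slow
         * uncached reads; on ordinary cached memory it acts as a normal
         * load. */
        __m128i v = _mm_stream_load_si128(s + i);
        _mm_store_si128(d + i, v);
    }
#else
    /* Assumption: no streaming-load path on this architecture. */
    memcpy(dst, src, size);
#endif
}
```

A real tiled readback would also have to detile while copying; the sketch leaves that out and only shows the load instruction choice.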
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=797fb6c6ac96cb7d1d5f9a04dc4f22f350093a16
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date: Fri Feb 22 21:24:46 2019 +0000
iris: Use coherent allocation for PIPE_RESOURCE_STAGING
On !llc machines (Atoms), reading from a linear buffer is slow, and so
copying from one resource into the linear staging buffer is still slow.
However, we can tell the GPU to snoop the CPU cache when reading from and
writing to the staging buffer, eliminating the slow uncached reads.
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=01b224047b0013380a5e8b709eaf2e3cd9976b39
Author: Chris Wilson <chris at chris-wilson.co.uk>
Date: Mon Feb 25 09:42:49 2019 +0000
iris: Use PIPE_BUFFER_STAGING for the query objects
We prefer fast CPU access to read back the query results.
Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=65e8761474ca8c9c0cce167cb32b720c3cc25a90
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date: Mon Mar 11 09:43:04 2019 -0700
intel/nir: Combine store_derefs to improve code from SPIR-V
Due to the lack of a write mask in SPIR-V stores, generators may produce
multiple stores to the same vector, each using a different array deref.
Use the combining store pass to clean this up. For example,
    layout(binding = 3) buffer block {
        vec4 v;
    };
    void main() {
        v.x = 11;
        v.y = 22;
    }
after going to SPIR-V and NIR, ends up with two store_derefs, to
v[0] and v[1]
    vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */
    vec2 32 ssa_6 = deref_array &(*ssa_4)[0] (ssbo float) /* &((block *)ssa_2)->field0[0] */
    intrinsic store_deref (ssa_6, ssa_7) (1, 0) /* wrmask=x */ /* access=0 */
    vec1 32 ssa_13 = load_const (0x00000001 /* 0.000000 */)
    vec2 32 ssa_14 = deref_array &(*ssa_4)[1] (ssbo float) /* &((block *)ssa_2)->field0[1] */
    intrinsic store_deref (ssa_14, ssa_15) (1, 0) /* wrmask=x */ /* access=0 */
producing two different sends instructions on skl. The combining pass
transforms the snippet above into
    vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */
    vec4 32 ssa_18 = vec4 ssa_7, ssa_15, ssa_16, ssa_17
    intrinsic store_deref (ssa_4, ssa_18) (3, 0) /* wrmask=xy */ /* access=0 */
producing a single sends instruction.
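The write-mask bookkeeping behind this combination can be sketched in C as below. This is only an illustration under assumptions, not the NIR pass: struct pending_store and combine_component_store are hypothetical names, and the model ignores everything the real pass has to check (aliasing, barriers, intervening loads). It shows how stores to components 0 and 1 fold into one vector store with wrmask 3 (xy), matching the combined store_deref above.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_COMPONENTS 4

/* Hypothetical model of a pending vector store: one value per component
 * plus a write mask with one bit per component (bit 0 = x, bit 1 = y). */
struct pending_store {
    float   value[VEC_COMPONENTS];
    uint8_t wrmask;
};

/* Fold a scalar store to vec[index] into the pending vector store.
 * A later store to the same component replaces the earlier value,
 * which preserves program order. */
static void combine_component_store(struct pending_store *ps,
                                    unsigned index, float v)
{
    assert(index < VEC_COMPONENTS);
    ps->value[index] = v;
    ps->wrmask |= (uint8_t)(1u << index);
}
```

Applied to the GLSL example, folding v.x = 11 and v.y = 22 into an empty pending store leaves wrmask equal to 3, so a single store with that mask can replace the two scalar stores.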
v2: Move this from spirv_to_nir into the general optimization pass for
the Intel compiler. (Jason)
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=10dfb0011e7079e770184d252045c13c40e6b274
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date: Fri Mar 8 11:50:47 2019 -0800
intel/nir: Combine store_derefs after vectorizing IO
Shader-db results for skl:
total instructions in shared programs: 15232903 -> 15224781 (-0.05%)
instructions in affected programs: 61246 -> 53124 (-13.26%)
helped: 221
HURT: 0
total cycles in shared programs: 371440470 -> 371398018 (-0.01%)
cycles in affected programs: 281363 -> 238911 (-15.09%)
helped: 221
HURT: 0
Results for bdw are very similar.
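The percentages in the shader-db output are plain relative changes, and they can be rechecked with a few lines of C. The helper name percent_change is an assumption for illustration and is not part of shader-db.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Negative values mean the count went down. */
static double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

For the affected programs, percent_change(61246, 53124) is about -13.26 and percent_change(281363, 238911) is about -15.09, which agrees with the figures reported above.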
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=822a8865e4645ed7e1818568d1d0338b462c7748
Author: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
Date: Fri Mar 8 10:08:20 2019 -0800
nir: Add a pass to combine store_derefs to same vector
v2: (all from Jason)
Reuse existing function for the end of the block combinations.
Check the SSA values are coming from the right place in tests.
Document the case when the store to array_deref is reused.
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=cbf022cb316f1224f9afcc12ca414fc2d7d778a8
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Wed Mar 13 14:04:14 2019 +0100
ac: use the raw tbuffer version for 16-bit SSBO loads
vindex is always 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=045fae0f734a39cd24e444ac05382545dc7fdd2e
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Wed Mar 13 14:04:13 2019 +0100
ac: add ac_build_{struct,raw}_tbuffer_load() helpers
The struct version sets IDXEN=1, while the raw version sets IDXEN=0.
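As a rough illustration of what IDXEN changes, the sketch below models a linear buffer address computation. It is a simplified assumption, not the ac helper code: the struct and function names are hypothetical, and swizzling, bounds checking and the instruction immediate offset are ignored. With IDXEN=0 the index VGPR is not used, so the stride term drops out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of one typed buffer access. */
struct tbuffer_access {
    uint64_t base;     /* buffer base address from the descriptor */
    uint32_t stride;   /* record stride from the descriptor */
    uint32_t vindex;   /* per-lane record index, used only with idxen */
    uint32_t voffset;  /* per-lane byte offset */
    uint32_t soffset;  /* scalar byte offset */
    bool     idxen;    /* true: "struct" variant, false: "raw" variant */
};

/* Assumed linear addressing: base + offsets + stride * index. */
static uint64_t tbuffer_address(const struct tbuffer_access *a)
{
    uint64_t index = a->idxen ? a->vindex : 0;
    return a->base + a->soffset + a->voffset + (uint64_t)a->stride * index;
}
```

Under this model, an access whose vindex is always 0 yields the same address with either setting, which is consistent with the 16-bit SSBO load change using the raw version.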
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=a66b186bebf9b63897199b9b6e26d40977417f74
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Tue Feb 26 13:42:28 2019 +0100
radv: use typed buffer loads for vertex input fetches
This drastically reduces the number of SGPRs because the driver
now uses descriptors per vertex binding, instead of per vertex
attribute format.
29077 shaders in 15096 tests
Totals:
SGPRS: 1354285 -> 1282109 (-5.33 %)
VGPRS: 909896 -> 908800 (-0.12 %)
Spilled SGPRs: 24840 -> 24811 (-0.12 %)
Code Size: 49221144 -> 48986628 (-0.48 %) bytes
Max Waves: 243930 -> 244229 (0.12 %)
Totals from affected shaders:
SGPRS: 390648 -> 318472 (-18.48 %)
VGPRS: 288432 -> 287336 (-0.38 %)
Spilled SGPRs: 94 -> 65 (-30.85 %)
Code Size: 11548412 -> 11313896 (-2.03 %) bytes
Max Waves: 86460 -> 86759 (0.35 %)
This gives a really tiny boost.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=0b9a06a1a0e4f4b7130a5c372d13b586a8d66878
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Tue Feb 26 13:42:27 2019 +0100
radv: store more vertex attribute infos as pipeline keys
They are required for using typed buffer loads.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=489dac0d21baf069cf0045e785330eb1b16094a4
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Tue Feb 26 13:42:26 2019 +0100
ac: rework typed buffers loads for LLVM 7
Be more generic; this will be used by an upcoming series.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=56e04f67f906aea6101ba6081c5b0efcc25999cc
Author: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Date: Mon Mar 11 13:35:27 2019 +0100
panfrost: Set bo->gem_handle when creating a linear BO
So we can free it later.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa at rosenzweig.io>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=bfbad30543dd896459b09e0e05bc70ea1727e0b9
Author: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Date: Mon Mar 11 13:34:53 2019 +0100
panfrost: Set bo->size[0] in the DRM backend
So we can unmap it later.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso at collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa at rosenzweig.io>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=3570d15b6d88bdcd353b31ffe5460d04a88b7b6f
Author: Kenneth Graunke <kenneth at whitecape.org>
Date: Mon Mar 11 19:00:21 2019 -0700
intel/fs: Fix opt_peephole_csel to not throw away saturates.
We were not copying the saturate bit from the original instruction
to the new replacement instruction. This caused major misrendering
in DiRT Rally on iris, where comparisons leading to discards failed
due to the missing saturate, causing lots of extra garbage pixels to
be drawn in text rendering, trees, and so on.
This did not show up on i965 because st/nir performs a more aggressive
version of nir_opt_peephole_select, yielding more b32csel operations.
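The class of bug described here can be shown with a toy instruction model. This is a sketch under assumptions, not the intel/fs code: the types and the function replace_sel_with_csel are hypothetical. The point is only that a peephole which builds a replacement instruction has to carry over modifier bits such as saturate, otherwise the result is no longer clamped to [0, 1].

```c
#include <assert.h>
#include <stdbool.h>

enum opcode { OP_SEL, OP_CSEL };

/* Hypothetical, minimal instruction: an opcode plus the saturate bit. */
struct inst {
    enum opcode op;
    bool        saturate;
};

/* Build the CSEL that replaces a SEL. Copying the saturate bit is the
 * step that was missing according to the commit message. */
static struct inst replace_sel_with_csel(const struct inst *sel)
{
    assert(sel->op == OP_SEL);
    struct inst csel = { .op = OP_CSEL, .saturate = false };
    csel.saturate = sel->saturate;
    return csel;
}

/* Apply the instruction's saturate modifier to an already selected value. */
static float execute(const struct inst *i, float selected)
{
    if (i->saturate) {
        if (selected < 0.0f) return 0.0f;
        if (selected > 1.0f) return 1.0f;
    }
    return selected;
}
```

Without the copy, execute() would return 1.5 unchanged for a saturating select, and a later comparison against that value could take the wrong branch, which is the kind of failure the message attributes to the discards.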
Fixes: 52c7df1643e "i965/fs: Merge CMP and SEL into CSEL on Gen8+"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=bd17bdc56b34a08c421172df27fe07294c7a7024
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date: Mon Mar 11 20:43:15 2019 -0500
glsl/lower_vector_derefs: Don't use a temporary for TCS outputs
Tessellation control shader outputs act as if they have memory backing
them and you can have multiple writes to different components of the
same vector in flight at the same time. When this happens, the load/vec/store
pattern that gets used by ir_triop_vector_insert doesn't yield the
correct results. Instead, just emit a sequence of conditional
assignments.
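The hazard can be modelled sequentially in C. This is a sketch under assumptions, not the GLSL lowering code: the function names are hypothetical, and two overlapping writers are simulated by letting both load the vector before either stores. With the temporary-based pattern the second store writes back a stale copy of the first writer's component; with per-component conditional assignments each writer touches only its own component.

```c
#include <assert.h>

struct vec4 { float c[4]; };

/* Conditional-assignment form: only the selected component is written. */
static void insert_conditional(struct vec4 *out, unsigned index, float v)
{
    if (index == 0) out->c[0] = v;
    if (index == 1) out->c[1] = v;
    if (index == 2) out->c[2] = v;
    if (index == 3) out->c[3] = v;
}

/* Two writers using load/vec/store, interleaved so that both loads
 * happen before either store. */
static struct vec4 two_writers_with_temporaries(void)
{
    struct vec4 out = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    struct vec4 tmp_a = out;   /* writer A loads */
    struct vec4 tmp_b = out;   /* writer B loads */
    tmp_a.c[0] = 1.0f;         /* A inserts x */
    tmp_b.c[1] = 2.0f;         /* B inserts y */
    out = tmp_a;               /* A stores all four components */
    out = tmp_b;               /* B stores all four, clobbering A's x */
    return out;
}

/* The same two writes expressed as conditional assignments. */
static struct vec4 two_writers_with_conditionals(void)
{
    struct vec4 out = { { 0.0f, 0.0f, 0.0f, 0.0f } };
    insert_conditional(&out, 0, 1.0f);
    insert_conditional(&out, 1, 2.0f);
    return out;
}
```

In this model the temporary-based version ends with x still 0, while the conditional version ends with x = 1 and y = 2.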
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
Cc: mesa-stable at lists.freedesktop.org
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=20c4578c5539de909e94a6acc3ad680ab2ddeca6
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date: Mon Mar 11 21:01:34 2019 -0500
glsl/list: Add a list variant of insert_after
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira at intel.com>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=83fdefc06287f6c8bbb3bb5bb4ccd36d653017a3
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date: Tue Mar 12 16:25:39 2019 -0500
nir/loop_unroll: Fix out-of-bounds access handling
The previous code was completely broken when it came to constructing the
undef values. I'm not sure how it ever worked. For the case of a copy
that reads an undefined value, we can just delete the copy because the
destination is a valid undefined value. This saves us the effort of
trying to construct a value for an arbitrary copy_deref intrinsic.
Fixes: e8a8937a04 "nir: add partial loop unrolling support"
Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=c056609c439da964db8344a8fde66aec4bd9c877
Author: Jason Ekstrand <jason.ekstrand at intel.com>
Date: Tue Mar 12 18:18:58 2019 -0500
anv: Ignore VkRenderPassInputAttachmentAspectCreateInfo
We don't care about the information but there's no sense in throwing a
debug warning about it. It's harmless but annoying to users.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984
Reviewed-by: Sagar Ghuge <sagar.ghuge at intel.com>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=486b181fd758c246c2d1eaa1975a858e84d64c32
Author: Eric Anholt <eric at anholt.net>
Date: Tue Mar 12 14:59:21 2019 -0700
v3d: Fix leak of the renderonly struct on screen destruction.
This makes v3d match vc4's destroy path.
Fixes: e113b21cb779 ("v3d: Add renderonly support.")
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=0c874c18cd07539f56fede272a24b76f2946716f
Author: Eric Anholt <eric at anholt.net>
Date: Tue Mar 12 14:56:57 2019 -0700
v3d: Fix leak of the mem_ctx after the DAG refactor.
Noticed while trying to get a CTS run again.
Fixes: 33886474d646 ("v3d: Use the DAG datastructure for QPU instruction scheduling.")
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=acfd88204e886e671da97b895fd2d1ee39b61256
Author: Grigori Goronzy <greg at chown.ath.cx>
Date: Thu Aug 3 20:07:58 2017 +0200
glx: add support for GLX_ARB_create_context_no_error (v3)
v2: Only reject no-error contexts for too-old GL if we're actually
trying to create a no-error context (Adam Jackson)
v3: Fix share contexts (Adam Jackson)
Reviewed-by: Adam Jackson <ajax at redhat.com>
Reviewed-by: Eric Anholt <eric at anholt.net>
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae77f1236862e73c1ac250898924c648d481bda4
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Tue Mar 12 21:49:42 2019 +0100
radv: set the maximum number of IBs per submit to 192
This fixes random SteamVR corruption, see
https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181
Fixes: 4d30f2c6f42 ("radv/winsys: remove the max IBs per submit limit for the fallback path")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas at basnieuwenhuizen.nl>