[Mesa-dev] [PATCH] llvmpipe: use simple coeffs calc for 128bit vectors
Roland Scheidegger
sroland at vmware.com
Tue Nov 3 17:49:49 PST 2015
Ok I'm convinced enough it's not worth bothering about the (mostly
minimal) performance impact and pushed this (a slightly altered version).
Thanks!
Roland
On 03.11.2015 at 09:36, Oded Gabbay wrote:
> There are currently two methods in llvmpipe code to calculate coeffs to
> be used as inputs for the fragment shader. The two methods use slightly
> different ways to do the floating point calculations and thus produce
> slightly different results.
>
> The decision which method to use is determined by the size of the vector
> that is used by the platform.
>
> For vectors with size of more than 128bit, a single-step method is used,
> in which coeffs_init_simple() + attribs_update_simple() are called.
>
> For vectors with size of 128bit or less, a two-step method is used, in
> which coeffs_init() + attribs_update() are called.
>
> This causes some piglit tests (clip-distance-bulk-copy,
> interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with
> 128bit vectors (such as ppc64le or x86-64 without AVX).
>
> This patch makes platforms with 128bit vectors use the single-step
> method (aka "simple" method) instead of the two-step method.
> This would make the resulting coeffs identical between more platforms,
> make sure the piglit tests pass, and make debugging and maintainability
> a bit easier as the generated LLVM IR will be the same for more platforms.
>
> The performance impact is negligible for x86-64 without AVX, and
> basically non-existent for ppc64le, as can be seen from the following
> benchmarking results:
>
> - glxspheres, on ppc64le:
>
> - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec
> - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec
> - Additional 0.8% performance boost
>
> - glxspheres, on x86-64 without AVX:
>
> - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec
> - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec
> - Additional 0.74% performance boost
>
> - glmark2, on ppc64le:
>
> - original code: score of 58
> - with the patch: score of 57
>
> - glmark2, on x86-64 without AVX:
>
> - original code: score of 175
> - with the patch: score of 167
> - Impact of -4.5% on performance
>
> - OpenArena, on ppc64le:
>
> - original code: 3398 frames 1719.0 seconds 2.0 fps
> 255.0/505.9/2773.0/0.0 ms
>
> - with the patch: 3398 frames 1690.4 seconds 2.0 fps
> 241.0/497.5/2563.0/0.2 ms
>
> - 29 seconds faster with the patch, which is about 2%
>
> - OpenArena, on x86-64 without AVX:
>
> - original code: 3398 frames 239.6 seconds 14.2 fps
> 38.0/70.5/719.0/14.6 ms
>
> - with the patch: 3398 frames 244.4 seconds 13.9 fps
> 38.0/71.9/697.0/14.3 ms
>
> - 0.3 fps slower with the patch (about 2%)
>
> Additional details can be found at:
> http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html
>
> Signed-off-by: Oded Gabbay <oded.gabbay at gmail.com>
> ---
> src/gallium/drivers/llvmpipe/lp_bld_interp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/llvmpipe/lp_bld_interp.c b/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> index df262fa..a2055d2 100644
> --- a/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> +++ b/src/gallium/drivers/llvmpipe/lp_bld_interp.c
> @@ -746,7 +746,7 @@ lp_build_interp_soa_init(struct lp_build_interp_soa_context *bld,
>
> pos_init(bld, x0, y0);
>
> - if (coeff_type.length > 4) {
> + if (coeff_type.length >= 4) {
> bld->simple_interp = TRUE;
> {
> /* XXX this should use a global static table */
>