[Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan
Chris Forbes
chrisf at ijw.co.nz
Wed Sep 11 23:06:03 PDT 2013
Can we make this approximation conditional on an image-quality control
in driconf [or somewhere else]?
On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
> From: Chia-I Wu <olv at lunarg.com>
>
> Replicate the gradient of the top-left pixel to the other three pixels in the
> subspan, the same way DDY is implemented. Previously, different gradients were
> used for pixels in the top row and pixels in the bottom row.
>
> This change results in a less accurate approximation. However, it improves
> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
> 95.0% confidence) on Haswell. No noticeable image quality difference was
> observed.
>
> No piglit gpu.tests regressions.
>
> I have not been able to explain the performance difference, and the change
> makes no difference on Ivy Bridge. If anyone has insight, please enlighten
> me. Performance differences may also be observed in other games that call
> textureGrad and dFdx.
>
> Signed-off-by: Chia-I Wu <olv at lunarg.com>
> ---
> src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> index bfb3d33..c0d24a0 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
> void
> fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
> {
> + /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
> + * which gives much better performance when the result is used with
> + * sample_d
> + */
> + unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
> + BRW_VERTICAL_STRIDE_2;
> + unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
> + BRW_WIDTH_2;
> +
> struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
> BRW_REGISTER_TYPE_F,
> - BRW_VERTICAL_STRIDE_2,
> - BRW_WIDTH_2,
> + vstride,
> + width,
> BRW_HORIZONTAL_STRIDE_0,
> BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
> struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
> BRW_REGISTER_TYPE_F,
> - BRW_VERTICAL_STRIDE_2,
> - BRW_WIDTH_2,
> + vstride,
> + width,
> BRW_HORIZONTAL_STRIDE_0,
> BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
> brw_ADD(p, dst, src0, negate(src1));
> --
> 1.8.3.1
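
For readers following the region change in the patch, here is a minimal
standalone sketch of how the two source regions select pixels. It is not
Mesa code: it assumes a SIMD8 register holding two subspans laid out as
[tl, tr, bl, br], and read_region() and ddx() are hypothetical helpers that
only imitate <vstride; width, hstride> region addressing on the CPU.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed layout: one SIMD8 register holds two 2x2 subspans, each stored as
// [top-left, top-right, bottom-left, bottom-right].
using Reg = std::array<float, 8>;

// Hypothetical helper: read 8 channels through a <vstride; width, hstride>
// region that starts at subregister `start`.
Reg read_region(const Reg &r, std::size_t start, std::size_t vstride,
                std::size_t width, std::size_t hstride)
{
    Reg out{};
    for (std::size_t ch = 0; ch < out.size(); ++ch) {
        std::size_t idx = start + (ch / width) * vstride + (ch % width) * hstride;
        assert(idx < r.size());
        out[ch] = r[idx];
    }
    return out;
}

// Mirrors the ADD in the patch: src0 starts at subregister 1, src1 at
// subregister 0, and the result is src0 - src1.
// uniform == false uses <2;2,0> (one gradient per row of each subspan);
// uniform == true uses <4;4,0> (top-row gradient for the whole subspan).
Reg ddx(const Reg &src, bool uniform)
{
    const std::size_t vstride = uniform ? 4 : 2;
    const std::size_t width = uniform ? 4 : 2;
    const Reg right = read_region(src, 1, vstride, width, 0);
    const Reg left = read_region(src, 0, vstride, width, 0);

    Reg d{};
    for (std::size_t ch = 0; ch < d.size(); ++ch)
        d[ch] = right[ch] - left[ch];
    return d;
}
```

With <2;2,0> the bottom row of each subspan gets (br - bl); with <4;4,0> all
four pixels get (tr - tl), which matches the comment added by the patch.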