[Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

Chia-I Wu olvaffe at gmail.com
Thu Sep 12 01:41:33 PDT 2013


On Thu, Sep 12, 2013 at 2:06 PM, Chris Forbes <chrisf at ijw.co.nz> wrote:
> Can we make this approximation conditional on an image-quality control
> in driconf [or somewhere else]?
Sure.  What would be the default behavior?

> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
>> From: Chia-I Wu <olv at lunarg.com>
>>
>> Replicate the gradient of the top-left pixel to the other three pixels in the
>> subspan, as is already done for DDY.  Before, different gradients were used
>> for pixels in the top row and pixels in the bottom row.
>>
>> This change results in a less accurate approximation.  However, it improves
>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>> 95.0% confidence) on Haswell.  I observed no noticeable difference in image
>> quality.
>>
>> No piglit gpu.tests regressions.
>>
>> I have not been able to come up with an explanation for the performance
>> difference, and the change makes no difference on Ivy Bridge.  If anyone has
>> any insight, please kindly enlighten me.  Performance differences may also
>> be observed in other games that call textureGrad and dFdx.
>>
>> Signed-off-by: Chia-I Wu <olv at lunarg.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> index bfb3d33..c0d24a0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>>  void
>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
>>  {
>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>> +    * which gives much better performance when the result is used with
>> +    * sample_d
>> +    */
>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>> +                                          BRW_VERTICAL_STRIDE_2;
>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>> +                                        BRW_WIDTH_2;
>> +
>>     struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>                                  BRW_REGISTER_TYPE_F,
>> -                                BRW_VERTICAL_STRIDE_2,
>> -                                BRW_WIDTH_2,
>> +                                vstride,
>> +                                width,
>>                                  BRW_HORIZONTAL_STRIDE_0,
>>                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>     struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>                                  BRW_REGISTER_TYPE_F,
>> -                                BRW_VERTICAL_STRIDE_2,
>> -                                BRW_WIDTH_2,
>> +                                vstride,
>> +                                width,
>>                                  BRW_HORIZONTAL_STRIDE_0,
>>                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>     brw_ADD(p, dst, src0, negate(src1));
>> --
>> 1.8.3.1
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
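
For readers unfamiliar with register regions, here is a rough standalone
sketch of what the vstride/width change selects. It is not Mesa code:
region_element() and ddx_model() are hypothetical helpers, and the model
assumes a single SIMD8 register holding two subspans laid out as tl, tr, bl,
br, with the subregister number treated as a plain element offset.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not a Mesa function): element index read by channel
// `ch` for a source region <vstride; width, hstride> starting at `start`.
constexpr int region_element(int start, int vstride, int width, int hstride,
                             int ch)
{
    return start + (ch / width) * vstride + (ch % width) * hstride;
}

// Simplified model of the ADD emitted by generate_ddx(): dst = src0 - src1,
// where src0 starts at element 1, src1 starts at element 0, and both use a
// horizontal stride of 0.
std::array<float, 8> ddx_model(const std::array<float, 8> &src, int vstride,
                               int width)
{
    std::array<float, 8> dst{};
    for (int ch = 0; ch < 8; ch++) {
        const int i0 = region_element(1, vstride, width, 0, ch);
        const int i1 = region_element(0, vstride, width, 0, ch);
        assert(i0 < 8 && i1 < 8);
        dst[ch] = src[i0] - src[i1];
    }
    return dst;
}
```

Under these assumptions, ddx_model(src, 2, 2) corresponds to the old <2;2,0>
region and yields one gradient for the top row and another for the bottom row
of each subspan, while ddx_model(src, 4, 4) corresponds to the Haswell path
and replicates (tr - tl) to all four pixels of each subspan.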



-- 
olv at LunarG.com