[Mesa-dev] [PATCH] i965/hsw: approximate DDX with a uniform value across a subspan

Chris Forbes chrisf at ijw.co.nz
Thu Sep 12 20:13:37 PDT 2013


Sounds good to me.

On Fri, Sep 13, 2013 at 3:11 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
> On Thu, Sep 12, 2013 at 10:48 PM, Ian Romanick <idr at freedesktop.org> wrote:
>> On 09/12/2013 01:06 AM, Chris Forbes wrote:
>>> Can we make this approximation conditional on an image-quality control
>>> in driconf [or somewhere else]?
>>
>> There's already a control that applications can use:
>> GL_FRAGMENT_SHADER_DERIVATIVE_HINT.  I don't know whether or not /any/
>> app has ever used it.  The default setting is GL_DONT_CARE, so,
>> technically speaking, we could do this optimization whenever the hint
>> isn't GL_NICEST.  Though, we may want a driconf override anyway.  Hmm...
> How about, in generate_ddx():
>
>   if (brw->ctx.Hint.FragmentShaderDerivative == GL_NICEST ||
>       brw->accurate_ddx) {
>      // current code
>   }
>   else {
>      // new code
>   }
>
> That is, when the app doesn't care, we treat it as GL_FASTEST.  If the
> user cares, he can set the new drirc option, accurate_ddx, to true to
> override.  accurate_ddx is false by default.
>
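
A minimal sketch of just the selection rule proposed above, assuming the
hint is always one of the three values glHint accepts.  The names
DerivativeHint, use_accurate_ddx and accurate_ddx_option are hypothetical
stand-ins for illustration, not Mesa code; only the numeric enum values
are the real GL constants.

```cpp
#include <cassert>

// Hypothetical stand-in for the GL hint modes; the numeric values match
// GL_DONT_CARE, GL_FASTEST and GL_NICEST.
enum DerivativeHint : unsigned {
   HINT_DONT_CARE = 0x1100,
   HINT_FASTEST   = 0x1101,
   HINT_NICEST    = 0x1102,
};

// Returns true when the existing (per-row) DDX code should be used and
// false when the uniform-per-subspan approximation may be used instead.
// accurate_ddx_option models the proposed drirc override (assumed to
// default to false).
static bool use_accurate_ddx(DerivativeHint hint, bool accurate_ddx_option)
{
   assert(hint == HINT_DONT_CARE || hint == HINT_FASTEST ||
          hint == HINT_NICEST);
   return hint == HINT_NICEST || accurate_ddx_option;
}
```

With this rule GL_DONT_CARE behaves like GL_FASTEST unless the user sets
the override.
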
>>> On Thu, Sep 12, 2013 at 5:00 PM, Chia-I Wu <olvaffe at gmail.com> wrote:
>>>> From: Chia-I Wu <olv at lunarg.com>
>>>>
>>>> Replicate the gradient of the top-left pixel to the other three pixels in the
>>>> subspan, the same way DDY is implemented.  Before, different gradients were
>>>> used for pixels in the top row and pixels in the bottom row.
>>>>
>>>> This change results in a less accurate approximation.  However, it improves
>>>> the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at
>>>> 95.0% confidence) on Haswell.  No noticeable image quality difference
>>>> observed.
>>>>
>>>> No piglit gpu.tests regressions.
>>>>
>>>> I failed to come up with an explanation for the performance difference.  The
>>>> change makes no difference on Ivy Bridge.  If anyone has any insight,
>>>> please kindly enlighten me.  Performance differences may also be
>>>> observed on other games that call textureGrad and dFdx.
>>>>
>>>> Signed-off-by: Chia-I Wu <olv at lunarg.com>
>>>> ---
>>>>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 17 +++++++++++++----
>>>>  1 file changed, 13 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> index bfb3d33..c0d24a0 100644
>>>> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
>>>> @@ -564,16 +564,25 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
>>>>  void
>>>>  fs_generator::generate_ddx(fs_inst *inst, struct brw_reg dst, struct brw_reg src)
>>>>  {
>>>> +   /* approximate with ((ss0.tr - ss0.tl)x4 (ss1.tr - ss1.tl)x4) on Haswell,
>>>> +    * which gives much better performance when the result is used with
>>>> +    * sample_d
>>>> +    */
>>>> +   unsigned vstride = (brw->is_haswell) ? BRW_VERTICAL_STRIDE_4 :
>>>> +                                          BRW_VERTICAL_STRIDE_2;
>>>> +   unsigned width = (brw->is_haswell) ? BRW_WIDTH_4 :
>>>> +                                        BRW_WIDTH_2;
>>>> +
>>>>     struct brw_reg src0 = brw_reg(src.file, src.nr, 1,
>>>>                                  BRW_REGISTER_TYPE_F,
>>>> -                                BRW_VERTICAL_STRIDE_2,
>>>> -                                BRW_WIDTH_2,
>>>> +                                vstride,
>>>> +                                width,
>>>>                                  BRW_HORIZONTAL_STRIDE_0,
>>>>                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>>     struct brw_reg src1 = brw_reg(src.file, src.nr, 0,
>>>>                                  BRW_REGISTER_TYPE_F,
>>>> -                                BRW_VERTICAL_STRIDE_2,
>>>> -                                BRW_WIDTH_2,
>>>> +                                vstride,
>>>> +                                width,
>>>>                                  BRW_HORIZONTAL_STRIDE_0,
>>>>                                  BRW_SWIZZLE_XYZW, WRITEMASK_XYZW);
>>>>     brw_ADD(p, dst, src0, negate(src1));
>>>> --
>>>> 1.8.3.1
>>>>
>>>> _______________________________________________
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
>
>
>
> --
> olv at LunarG.com
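
For readers unfamiliar with register regions, the effect of the stride
change in the quoted patch can be modelled on the CPU.  This is a sketch
under stated assumptions, not driver code: it assumes a SIMD8 register
holds two subspans in channel order tl, tr, bl, br (as the patch comment
implies), and that a region <vstride; width, hstride> addresses channel i
at subnr + (i / width) * vstride + (i % width) * hstride.  The Region
struct and the function names are hypothetical, and strides are given as
element counts rather than the BRW_* hardware encodings.

```cpp
#include <array>
#include <cassert>

// Hypothetical model of a source region <vstride; width, hstride>
// starting at sub-register element subnr.
struct Region {
   unsigned subnr, vstride, width, hstride;
};

// Element of the source register that a given execution channel reads.
static unsigned region_element(const Region &r, unsigned channel)
{
   assert(r.width != 0);
   return r.subnr + (channel / r.width) * r.vstride +
          (channel % r.width) * r.hstride;
}

// Models ADD dst, src0, -src1 for one SIMD8 register.
// uniform == false: regions <2;2,0>, one gradient per row (old code).
// uniform == true:  regions <4;4,0>, one gradient per subspan (patch).
static std::array<float, 8> model_ddx(const std::array<float, 8> &src,
                                      bool uniform)
{
   const unsigned stride = uniform ? 4u : 2u;
   const Region right = {1u, stride, stride, 0u}; // src0: right-hand pixel
   const Region left  = {0u, stride, stride, 0u}; // src1: left-hand pixel
   std::array<float, 8> dst{};
   for (unsigned ch = 0; ch < 8; ch++) {
      const unsigned r = region_element(right, ch);
      const unsigned l = region_element(left, ch);
      assert(r < src.size() && l < src.size());
      dst[ch] = src[r] - src[l];
   }
   return dst;
}
```

In this model the <2;2,0> regions read elements 1,1,3,3,5,5,7,7 minus
0,0,2,2,4,4,6,6, while the <4;4,0> regions read 1,1,1,1,5,5,5,5 minus
0,0,0,0,4,4,4,4, so every pixel of a subspan receives the top-row
gradient.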

