[Mesa-dev] [PATCH] st/mesa: don't do L3 thread pinning for Blender

Marek Olšák maraeo at gmail.com
Thu Nov 8 05:23:29 UTC 2018


Thanks a lot man. I'll reconsider this depending on the results I receive.

I may also just pin the Mesa threads and keep the app thread intact. It
should perform OK with glthread, but not without glthread.

Another option is to have the gallium and winsys threads "chase" the main
thread within the CPU by changing the thread affinity based on getcpu().
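
A rough sketch of that last idea in C, assuming Linux with glibc: the application thread calls a hypothetical chase_app_thread() now and then (say, once per frame), reads its current CPU with sched_getcpu(), and re-pins a driver worker thread with pthread_setaffinity_np() only when the L3 group has changed. The helper names and the cpus_per_l3 parameter are made up for illustration and are not Mesa code. The sketch also assumes that each L3 group is a contiguous run of logical CPU numbers, which may not hold on every system.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: first logical CPU of the L3 group containing `cpu`.
 * Assumption: every group is a contiguous run of `cpus_per_l3` CPUs. */
int l3_group_first_cpu(int cpu, int cpus_per_l3)
{
   return (cpu / cpus_per_l3) * cpus_per_l3;
}

/* Fill `set` with every logical CPU in the L3 group containing `cpu`. */
void l3_group_mask(int cpu, int cpus_per_l3, cpu_set_t *set)
{
   int first = l3_group_first_cpu(cpu, cpus_per_l3);

   CPU_ZERO(set);
   for (int i = 0; i < cpus_per_l3; i++)
      CPU_SET(first + i, set);
}

/* Call from the application thread. If that thread now runs in a different
 * L3 group than last time, move `worker` to the same group. The caller's
 * own affinity is left alone. `*last_group` should start at -1.
 * Returns the group index in use, or -1 on failure. */
int chase_app_thread(pthread_t worker, int cpus_per_l3, int *last_group)
{
   int cpu = sched_getcpu();
   if (cpu < 0)
      return -1;

   int group = cpu / cpus_per_l3;
   if (group == *last_group)
      return group;   /* nothing moved, skip the syscall */

   cpu_set_t set;
   l3_group_mask(cpu, cpus_per_l3, &set);
   if (pthread_setaffinity_np(worker, sizeof(set), &set) != 0)
      return -1;

   *last_group = group;
   return group;
}
```

With cpus_per_l3 = 6, a main thread found on CPU 7 would move the worker to CPUs 6-11. Checking the cached group first keeps the common case (no migration) down to a single sched_getcpu() call.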

Marek

On Tue, Nov 6, 2018 at 4:50 PM Edmondo Tommasina <edmondo.tommasina at gmail.com> wrote:

> Hi Marek
>
> It would be nice to have the driconf part of this patch committed in
> master to make it easy to test with and without the L3 pinning, so this
> patch is:
>
> Reviewed-by: Edmondo Tommasina <edmondo.tommasina at gmail.com>
>
> Now with this patch in place I'm starting to collect some numbers with
> and without the CCX affinity on my setup:
>
> CPU: AMD Ryzen 5 2600 Six-Core Processor
> GFX: AMD Radeon (TM) RX 470 Graphics (POLARIS10, DRM 3.27.0, 4.19.0-rc4, LLVM 8.0.0)
> RAM: G.Skill Flare X 3200 CL14
>
> drawoverhead
> ------------
> As expected, great numbers with drawoverhead. For example:
>
> With L3 thread pinning:
>   29: DrawElements ( 1 VBO, 8 UBO,  8 Tex) w/ sample mask enable change:
>     6.91 million (99.5%)
>
> Without:
>   29: DrawElements ( 1 VBO, 8 UBO,  8 Tex) w/ sample mask enable change:
>     5.55 million (89.0%)
>
>
> Hitman Benchmark
> ----------------
> Here we have a performance loss.
>
> With L3 thread pinning:
>
> 5765 frames
>  50.21fps Average
>  10.16fps Min
> 137.31fps Max
>  19.92ms Average
>   7.28ms Min
>  98.42ms Max
>
>
> Without L3 thread pinning:
>
> 6024 frames
>  52.45fps Average
>  10.28fps Min
> 129.85fps Max
>  19.07ms Average
>   7.70ms Min
>  97.24ms Max
>
> With thread pinning I lose about 2 FPS on average.
>
> Looking at the CPU load of Hitman Benchmark:
>
> With thread pinning we see, as expected, the first 3 cores (SMT active)
> working and the cores on the other CCX doing nothing:
>
> 09:46:50 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 09:46:53 PM  all   33.43    0.00    1.85    0.00    0.00    0.03    0.00    0.00    0.00   64.70
> 09:46:53 PM    0   68.79    0.00    4.03    0.00    0.00    0.00    0.00    0.00    0.00   27.18
> 09:46:53 PM    1   64.63    0.00    3.40    0.00    0.00    0.00    0.00    0.00    0.00   31.97
> 09:46:53 PM    2   68.46    0.00    3.69    0.00    0.00    0.00    0.00    0.00    0.00   27.85
> 09:46:53 PM    3   66.67    0.00    2.69    0.00    0.00    0.00    0.00    0.00    0.00   30.64
> 09:46:53 PM    4   66.89    0.00    3.04    0.00    0.00    0.00    0.00    0.00    0.00   30.07
> 09:46:53 PM    5   64.07    0.00    3.73    0.00    0.00    0.00    0.00    0.00    0.00   32.20
> 09:46:53 PM    6    0.67    0.00    0.34    0.00    0.00    0.00    0.00    0.00    0.00   98.99
> 09:46:53 PM    7    0.66    0.00    0.00    0.00    0.00    0.33    0.00    0.00    0.00   99.01
> 09:46:53 PM    8    0.33    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.67
> 09:46:53 PM    9    1.33    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   97.67
> 09:46:53 PM   10    0.33    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00   99.67
> 09:46:53 PM   11    0.33    0.00    0.33    0.00    0.00    0.00    0.00    0.00    0.00   99.33
>
> Without pinning all cores are working:
>
> 09:32:07 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> 09:32:10 PM  all   42.77    0.00    3.48    0.03    0.00    0.03    0.00    0.00    0.00   53.70
> 09:32:10 PM    0   48.14    0.00    4.41    0.00    0.00    0.34    0.00    0.00    0.00   47.12
> 09:32:10 PM    1   37.71    0.00    3.37    0.00    0.00    0.00    0.00    0.00    0.00   58.92
> 09:32:10 PM    2   42.81    0.00    3.77    0.00    0.00    0.00    0.00    0.00    0.00   53.42
> 09:32:10 PM    3   44.63    0.00    3.02    0.00    0.00    0.00    0.00    0.00    0.00   52.35
> 09:32:10 PM    4   44.44    0.00    2.69    0.00    0.00    0.00    0.00    0.00    0.00   52.86
> 09:32:10 PM    5   43.48    0.00    3.34    0.00    0.00    0.00    0.00    0.00    0.00   53.18
> 09:32:10 PM    6   45.30    0.00    3.69    0.00    0.00    0.00    0.00    0.00    0.00   51.01
> 09:32:10 PM    7   46.31    0.00    3.02    0.00    0.00    0.00    0.00    0.00    0.00   50.67
> 09:32:10 PM    8   38.46    0.00    4.35    0.00    0.00    0.00    0.00    0.00    0.00   57.19
> 09:32:10 PM    9   35.35    0.00    4.04    0.34    0.00    0.00    0.00    0.00    0.00   60.27
> 09:32:10 PM   10   43.81    0.00    3.34    0.00    0.00    0.00    0.00    0.00    0.00   52.84
> 09:32:10 PM   11   42.81    0.00    2.68    0.00    0.00    0.00    0.00    0.00    0.00   54.52
>
> So it could be that if an application takes advantage of many cores, the
> L3 thread pinning negates the internal Mesa benefits on my CPU.
>
> I'll try to collect more numbers, but right now I have the feeling that it
> would be good to commit the driconf option, to make testing with and
> without thread pinning easier across different games and setups.
>
> Regards
> edmondo
>
>
>
> On Tue, Oct 30, 2018 at 11:39 PM Marek Olšák <maraeo at gmail.com> wrote:
>
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> so that all Blender threads are not forced to be on 1 CCX.
>>
>> Fixes: 8d473f555a0
>> ---
>>  src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 +
>>  src/gallium/include/state_tracker/st_api.h          | 1 +
>>  src/gallium/state_trackers/dri/dri_screen.c         | 2 ++
>>  src/mesa/state_tracker/st_context.c                 | 1 +
>>  src/mesa/state_tracker/st_context.h                 | 1 +
>>  src/mesa/state_tracker/st_manager.c                 | 8 +++++---
>>  src/util/00-mesa-defaults.conf                      | 4 ++++
>>  src/util/xmlpool/t_options.h                        | 5 +++++
>>  8 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
>> index 9db0dc01117..daa7ce7f6cc 100644
>> --- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
>> +++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
>> @@ -24,17 +24,18 @@ DRI_CONF_SECTION_DEBUG
>>     DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false")
>>     DRI_CONF_ALLOW_GLSL_BUILTIN_CONST_EXPRESSION("false")
>>     DRI_CONF_ALLOW_GLSL_RELAXED_ES("false")
>>     DRI_CONF_ALLOW_GLSL_BUILTIN_VARIABLE_REDECLARATION("false")
>>     DRI_CONF_ALLOW_GLSL_CROSS_STAGE_INTERPOLATION_MISMATCH("false")
>>     DRI_CONF_ALLOW_HIGHER_COMPAT_VERSION("false")
>>     DRI_CONF_FORCE_GLSL_ABS_SQRT("false")
>>     DRI_CONF_GLSL_CORRECT_DERIVATIVES_AFTER_DISCARD("false")
>>     DRI_CONF_ALLOW_GLSL_LAYOUT_QUALIFIER_ON_FUNCTION_PARAMETERS("false")
>>     DRI_CONF_FORCE_COMPAT_PROFILE("false")
>> +   DRI_CONF_DISABLE_L3_THREAD_PINNING("false")
>>  DRI_CONF_SECTION_END
>>
>>  DRI_CONF_SECTION_MISCELLANEOUS
>>     DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false")
>>     DRI_CONF_GLSL_ZERO_INIT("false")
>>     DRI_CONF_ALLOW_RGB10_CONFIGS("true")
>>  DRI_CONF_SECTION_END
>> diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h
>> index 2b63b8a3d2a..26b52f8dc51 100644
>> --- a/src/gallium/include/state_tracker/st_api.h
>> +++ b/src/gallium/include/state_tracker/st_api.h
>> @@ -224,20 +224,21 @@ struct st_config_options
>>     unsigned force_glsl_version;
>>     boolean allow_glsl_extension_directive_midshader;
>>     boolean allow_glsl_builtin_const_expression;
>>     boolean allow_glsl_relaxed_es;
>>     boolean allow_glsl_builtin_variable_redeclaration;
>>     boolean allow_higher_compat_version;
>>     boolean glsl_zero_init;
>>     boolean force_glsl_abs_sqrt;
>>     boolean allow_glsl_cross_stage_interpolation_mismatch;
>>     boolean allow_glsl_layout_qualifier_on_function_parameters;
>> +   boolean disable_L3_thread_pinning;
>>     unsigned char config_options_sha1[20];
>>  };
>>
>>  /**
>>   * Represent the attributes of a context.
>>   */
>>  struct st_context_attribs
>>  {
>>     /**
>>      * The profile and minimal version to support.
>> diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c
>> index 82a0988a634..b8bd92475cb 100644
>> --- a/src/gallium/state_trackers/dri/dri_screen.c
>> +++ b/src/gallium/state_trackers/dri/dri_screen.c
>> @@ -80,20 +80,22 @@ dri_fill_st_options(struct dri_screen *screen)
>>        driQueryOptionb(optionCache, "allow_glsl_builtin_variable_redeclaration");
>>     options->allow_higher_compat_version =
>>        driQueryOptionb(optionCache, "allow_higher_compat_version");
>>     options->glsl_zero_init = driQueryOptionb(optionCache, "glsl_zero_init");
>>     options->force_glsl_abs_sqrt =
>>        driQueryOptionb(optionCache, "force_glsl_abs_sqrt");
>>     options->allow_glsl_cross_stage_interpolation_mismatch =
>>        driQueryOptionb(optionCache, "allow_glsl_cross_stage_interpolation_mismatch");
>>     options->allow_glsl_layout_qualifier_on_function_parameters =
>>        driQueryOptionb(optionCache, "allow_glsl_layout_qualifier_on_function_parameters");
>> +   options->disable_L3_thread_pinning =
>> +      driQueryOptionb(optionCache, "disable_L3_thread_pinning");
>>
>>     driComputeOptionsSha1(optionCache, options->config_options_sha1);
>>  }
>>
>>  static unsigned
>>  dri_loader_get_cap(struct dri_screen *screen, enum dri_loader_cap cap)
>>  {
>>     const __DRIdri2LoaderExtension *dri2_loader = screen->sPriv->dri2.loader;
>>     const __DRIimageLoaderExtension *image_loader = screen->sPriv->image.loader;
>>
>> diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
>> index 354876746f4..4b19b140bcd 100644
>> --- a/src/mesa/state_tracker/st_context.c
>> +++ b/src/mesa/state_tracker/st_context.c
>> @@ -460,20 +460,21 @@ st_create_context_priv(struct gl_context *ctx, struct pipe_context *pipe,
>>        screen->get_param(screen, PIPE_CAP_QUERY_TIME_ELAPSED);
>>     st->has_half_float_packing =
>>        screen->get_param(screen, PIPE_CAP_TGSI_PACK_HALF_FLOAT);
>>     st->has_multi_draw_indirect =
>>        screen->get_param(screen, PIPE_CAP_MULTI_DRAW_INDIRECT);
>>
>>     st->has_hw_atomics =
>>        screen->get_shader_param(screen, PIPE_SHADER_FRAGMENT,
>>                                 PIPE_SHADER_CAP_MAX_HW_ATOMIC_COUNTERS)
>>        ? true : false;
>> +   st->disable_L3_thread_pinning = options->disable_L3_thread_pinning;
>>
>>     util_throttle_init(&st->throttle,
>>                        screen->get_param(screen,
>>                                          PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGET));
>>
>>     /* GL limits and extensions */
>>     st_init_limits(pipe->screen, &ctx->Const, &ctx->Extensions, ctx->API);
>>     st_init_extensions(pipe->screen, &ctx->Const,
>>                        &ctx->Extensions, &st->options, ctx->API);
>>
>> diff --git a/src/mesa/state_tracker/st_context.h b/src/mesa/state_tracker/st_context.h
>> index 14b9b018809..e57873dafe8 100644
>> --- a/src/mesa/state_tracker/st_context.h
>> +++ b/src/mesa/state_tracker/st_context.h
>> @@ -121,20 +121,21 @@ struct st_context
>>     boolean has_shader_model3;
>>     boolean has_etc1;
>>     boolean has_etc2;
>>     boolean has_astc_2d_ldr;
>>     boolean prefer_blit_based_texture_transfer;
>>     boolean force_persample_in_shader;
>>     boolean has_shareable_shaders;
>>     boolean has_half_float_packing;
>>     boolean has_multi_draw_indirect;
>>     boolean can_bind_const_buffer_as_vertex;
>> +   boolean disable_L3_thread_pinning;
>>
>>     /**
>>      * If a shader can be created when we get its source.
>>      * This means it has only 1 variant, not counting glBitmap and
>>      * glDrawPixels.
>>      */
>>     boolean shader_has_one_variant[MESA_SHADER_STAGES];
>>
>>     boolean needs_texcoord_semantic;
>>     boolean apply_texture_swizzle_to_border_color;
>> diff --git a/src/mesa/state_tracker/st_manager.c b/src/mesa/state_tracker/st_manager.c
>> index ceb48dd4903..eb0b88ef473 100644
>> --- a/src/mesa/state_tracker/st_manager.c
>> +++ b/src/mesa/state_tracker/st_manager.c
>> @@ -1067,24 +1067,26 @@ st_api_make_current(struct st_api *stapi, struct st_context_iface *stctxi,
>>
>>        /* Purge the context's winsys_buffers list in case any
>>         * of the referenced drawables no longer exist.
>>         */
>>        st_framebuffers_purge(st);
>>
>>        /* Notify the driver that the context thread may have been changed.
>>         * This should pin all driver threads to a specific L3 cache for optimal
>>         * performance on AMD Zen CPUs.
>>         */
>> -      struct glthread_state *glthread = st->ctx->GLThread;
>> -      thrd_t *upper_thread = glthread ? &glthread->queue.threads[0] : NULL;
>> +      if (!st->disable_L3_thread_pinning) {
>> +         struct glthread_state *glthread = st->ctx->GLThread;
>> +         thrd_t *upper_thread = glthread ? &glthread->queue.threads[0] : NULL;
>>
>> -      util_context_thread_changed(st->pipe, upper_thread);
>> +         util_context_thread_changed(st->pipe, upper_thread);
>> +      }
>>     }
>>     else {
>>        ret = _mesa_make_current(NULL, NULL, NULL);
>>     }
>>
>>     return ret;
>>  }
>>
>>
>>  static void
>> diff --git a/src/util/00-mesa-defaults.conf b/src/util/00-mesa-defaults.conf
>> index a937c46d052..e9a6b817d9a 100644
>> --- a/src/util/00-mesa-defaults.conf
>> +++ b/src/util/00-mesa-defaults.conf
>> @@ -199,20 +199,24 @@ TODO: document the other workarounds.
>>          </application>
>>
>>          <application name="Wolfenstein The Old Blood" executable="WolfOldBlood_x64.exe">
>>              <option name="force_compat_profile" value="true" />
>>          </application>
>>
>>          <application name="ARMA 3" executable="arma3.x86_64">
>>              <option name="glsl_correct_derivatives_after_discard" value="true"/>
>>          </application>
>>
>> +        <application name="Blender" executable="blender">
>> +            <option name="disable_L3_thread_pinning" value="true"/>
>> +        </application>
>> +
>>          <!-- The GL thread whitelist is below, workarounds are above.
>>               Keep it that way. -->
>>
>>          <application name="Alien Isolation" executable="AlienIsolation">
>>              <option name="mesa_glthread" value="true"/>
>>          </application>
>>
>>          <application name="BioShock Infinite" executable="bioshock.i386">
>>              <option name="mesa_glthread" value="true"/>
>>          </application>
>> diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h
>> index e0a30f5fd1d..5d916519794 100644
>> --- a/src/util/xmlpool/t_options.h
>> +++ b/src/util/xmlpool/t_options.h
>> @@ -138,20 +138,25 @@ DRI_CONF_OPT_END
>>  #define DRI_CONF_ALLOW_GLSL_LAYOUT_QUALIFIER_ON_FUNCTION_PARAMETERS(def) \
>>  DRI_CONF_OPT_BEGIN_B(allow_glsl_layout_qualifier_on_function_parameters, def) \
>>          DRI_CONF_DESC(en,gettext("Allow layout qualifiers on function parameters.")) \
>>  DRI_CONF_OPT_END
>>
>>  #define DRI_CONF_FORCE_COMPAT_PROFILE(def) \
>>  DRI_CONF_OPT_BEGIN_B(force_compat_profile, def) \
>>          DRI_CONF_DESC(en,gettext("Force an OpenGL compatibility context")) \
>>  DRI_CONF_OPT_END
>>
>> +#define DRI_CONF_DISABLE_L3_THREAD_PINNING(def) \
>> +DRI_CONF_OPT_BEGIN_B(disable_L3_thread_pinning, def) \
>> +        DRI_CONF_DESC(en,gettext("Disable L3 thread pinning.")) \
>> +DRI_CONF_OPT_END
>> +
>>  /**
>>   * \brief Image quality-related options
>>   */
>>  #define DRI_CONF_SECTION_QUALITY \
>>  DRI_CONF_SECTION_BEGIN \
>>          DRI_CONF_DESC(en,gettext("Image Quality"))
>>
>>  #define DRI_CONF_PRECISE_TRIG(def) \
>>  DRI_CONF_OPT_BEGIN_B(precise_trig, def) \
>>          DRI_CONF_DESC(en,gettext("Prefer accuracy over performance in trig functions")) \
>> --
>> 2.17.1
>>