[Intel-gfx] [PATCH 1/2] drm/i915: Enable THP on Icelake and beyond

Matthew Auld matthew.auld at intel.com
Mon May 9 10:49:20 UTC 2022


On 29/04/2022 11:04, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> We have a statement from HW designers that the GPU read regression when
> using 2M pages was fixed from Icelake onwards, which was also confirmed
> by benchmarking Eero did last year:
> 
> """
> When IOMMU is disabled, enabling THP causes following perf changes on
> TGL-H (GT1):
> 
>      10-15% SynMark Batch[0-3]
>      5-10% MemBW GPU texture, SynMark ShMapVsm
>      3-5% SynMark TerrainFly* + Geom* + Fill* + CSCloth + Batch4
>      1-3% GpuTest Triangle, SynMark TexMem* + DeferredAA + Batch[5-7]
>            + few others
>      -7% MemBW GPU blend
> 
> In the above 3D benchmark names, * means all the variants of tests with
> the same prefix. For example "SynMark TexMem*", means both TexMem128 &
> TexMem512 tests in the synthetic (Intel internal) SynMark test suite.
> 
> In the (public, but proprietary) GfxBench & GLB(enchmark) test suites,
> there are both onscreen and offscreen variants of each test. Unless
> explicitly stated otherwise, numbers are for both variants.
> 
> All tests are run with FullHD monitor. All tests are fullscreen except
> for GLB and GpuTest ones, which are run in 1/2 screen window (GpuTest
> triangle is run both in fullscreen and 1/2 screen window).
> """
> 
> Since the only regression is MemBW GPU blend, against many more gains,
> it sounds like it is time to enable THP on Gen11+.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/430
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Matthew Auld <matthew.auld at intel.com>
> Cc: Eero Tamminen <eero.t.tamminen at intel.com>

fwiw, for the series,
Reviewed-by: Matthew Auld <matthew.auld at intel.com>

> ---
>   drivers/gpu/drm/i915/gem/i915_gemfs.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gemfs.c b/drivers/gpu/drm/i915/gem/i915_gemfs.c
> index ee87874e59dc..c5a6bbc842fc 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gemfs.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gemfs.c
> @@ -28,12 +28,14 @@ int i915_gemfs_init(struct drm_i915_private *i915)
>   	 *
>   	 * One example, although it is probably better with a per-file
>   	 * control, is selecting huge page allocations ("huge=within_size").
> -	 * However, we only do so to offset the overhead of iommu lookups
> -	 * due to bandwidth issues (slow reads) on Broadwell+.
> +	 * However, we only do so on platforms which benefit from it, or to
> +	 * offset the overhead of iommu lookups, where with the latter it is a
> +	 * net win even on platforms which would otherwise see some performance
> +	 * regressions, such as the slow reads issue on Broadwell and Skylake.
>   	 */
>   
>   	opts = NULL;
> -	if (i915_vtd_active(i915)) {
> +	if (GRAPHICS_VER(i915) >= 11 || i915_vtd_active(i915)) {
>   		if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>   			opts = huge_opt;
>   			drm_info(&i915->drm,
> @@ -41,7 +43,10 @@ int i915_gemfs_init(struct drm_i915_private *i915)
>   				 opts);
>   		} else {
>   			drm_notice(&i915->drm,
> -				   "Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
> +				   "Transparent Hugepage support is recommended for optimal performance%s\n",
> +				   GRAPHICS_VER(i915) >= 11 ?
> +				   " on this platform!" :
> +				   " when IOMMU is enabled!");
>   		}
>   	}
>   
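
For readers skimming the diff, the decision it introduces can be sketched
as a standalone C model. This is only an illustration, not the driver
code: struct gemfs_inputs and both helper functions are hypothetical
stand-ins for GRAPHICS_VER(i915), i915_vtd_active(i915) and
IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE), and the option string is the
one named in the code comment above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the patch consults. */
struct gemfs_inputs {
	int graphics_ver;	/* models GRAPHICS_VER(i915) */
	bool vtd_active;	/* models i915_vtd_active(i915) */
	bool thp_enabled;	/* models IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) */
};

/* True when the patched condition would want huge pages at all. */
static bool gemfs_wants_thp(const struct gemfs_inputs *in)
{
	return in->graphics_ver >= 11 || in->vtd_active;
}

/* Mount options that would be chosen, or NULL for the default mount. */
static const char *gemfs_mount_opts(const struct gemfs_inputs *in)
{
	if (gemfs_wants_thp(in) && in->thp_enabled)
		return "huge=within_size";
	return NULL;
}

/*
 * Suffix appended to the "recommended for optimal performance" notice,
 * or NULL when no notice would be printed.
 */
static const char *gemfs_notice_suffix(const struct gemfs_inputs *in)
{
	if (!gemfs_wants_thp(in) || in->thp_enabled)
		return NULL;
	return in->graphics_ver >= 11 ? " on this platform!" :
					" when IOMMU is enabled!";
}
```

In this model a Gen11+ device with THP compiled in gets the huge page
mount option regardless of IOMMU state, while an older device only gets
it when the IOMMU is active, which matches the condition in the diff.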

