[Mesa-dev] [PATCH] amdgpu/winsys: adjust IB size based on buffer wait time
Marek Olšák
maraeo at gmail.com
Sat Apr 16 00:32:48 UTC 2016
On Fri, Apr 15, 2016 at 7:50 PM, Grigori Goronzy <greg at chown.ath.cx> wrote:
> Small IBs help to reduce stalls for workloads that require a lot of
> synchronization. On the other hand, if there is no notable
> synchronization, we can use a large IB size to slightly improve
> performance in some cases.
>
> This introduces tuning of the IB size based on feedback on the average
> buffer wait time. The average wait time is tracked with exponential
> smoothing.
> ---
> src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 2 ++
> src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 8 ++++++--
> src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h | 1 +
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> index 036301e..1e441e5 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
> @@ -195,6 +195,7 @@ static void *amdgpu_bo_map(struct pb_buffer *buf,
> return NULL;
> }
> }
> + bo->ws->buffer_wait_time_avg = (3 * bo->ws->buffer_wait_time_avg) / 4;
> } else {
> uint64_t time = os_time_get_nano();
>
> @@ -222,6 +223,7 @@ static void *amdgpu_bo_map(struct pb_buffer *buf,
> }
>
> bo->ws->buffer_wait_time += os_time_get_nano() - time;
> + bo->ws->buffer_wait_time_avg = (3 * bo->ws->buffer_wait_time_avg + os_time_get_nano() - time) / 4;
> }
> }
>
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> index 3ea0f3d..a9af0ce 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
> @@ -201,12 +201,16 @@ amdgpu_ctx_query_reset_status(struct radeon_winsys_ctx *rwctx)
> static bool amdgpu_get_new_ib(struct radeon_winsys *ws, struct amdgpu_ib *ib,
> struct amdgpu_cs_ib_info *info, unsigned ib_type)
> {
> + unsigned buffer_size = 128 * 1024 * 4;
> + unsigned ib_size = 32 * 1024 * 4;
> +
> /* Small IBs are better than big IBs, because the GPU goes idle quicker
> * and there is less waiting for buffers and fences. Proof:
> * http://www.phoronix.com/scan.php?page=article&item=mesa-111-si&num=1
> */
> - unsigned buffer_size = 128 * 1024 * 4;
> - unsigned ib_size = 20 * 1024 * 4;
> + uint64_t avg = ((struct amdgpu_winsys *)ws)->buffer_wait_time_avg;
> + if (avg > 1E4)
> + ib_size = 10 * 1024 * 4;
A comment here wouldn't hurt. Also, that comparison should use an
integer constant such as 10000. (1E4 is a double, I think.)
I like the idea.
Marek
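
For illustration, the smoothing update and the integer comparison suggested in the review could be sketched as below. This is a minimal standalone sketch, not the code that was committed: the helper names `smooth_wait_time` and `pick_ib_size` and the macro `WAIT_TIME_THRESHOLD_NS` are hypothetical, and the threshold is assumed to be in nanoseconds because the samples in the patch come from os_time_get_nano().

```c
#include <stdint.h>

/* Hypothetical name. Integer equivalent of the 1E4 in the patch,
 * assumed to be nanoseconds (10 microseconds). */
#define WAIT_TIME_THRESHOLD_NS UINT64_C(10000)

/* Exponential smoothing as in the patch: the new sample gets weight 1/4
 * and the previous average keeps weight 3/4. A map that did not have to
 * wait corresponds to sample_ns == 0, which only decays the average. */
static uint64_t smooth_wait_time(uint64_t avg_ns, uint64_t sample_ns)
{
    return (3 * avg_ns + sample_ns) / 4;
}

/* Pick the IB size in bytes from the smoothed wait time: a large IB by
 * default, a small IB once buffer waits become noticeable. The sizes are
 * the ones used in the patch. */
static unsigned pick_ib_size(uint64_t avg_wait_ns)
{
    unsigned ib_size = 32 * 1024 * 4;

    if (avg_wait_ns > WAIT_TIME_THRESHOLD_NS)
        ib_size = 10 * 1024 * 4;

    return ib_size;
}
```

Keeping the comparison in uint64_t avoids converting the average to double on every call, and a named constant gives the threshold a place to document its unit.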