[Mesa-dev] [PATCH 3/4] i965/drm: Searching for a cached buffer for reuse

Thu May 10 20:56:12 UTC 2018

On Friday, May 4, 2018 5:56:04 PM PDT James Xiong wrote:
> From: "Xiong, James" <james.xiong at intel.com>
> 
> Now that a bucket contains cached buffers with different sizes, go
> through its list and search for a cached buffer with enough size.
> 
> Signed-off-by: Xiong, James <james.xiong at intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_bufmgr.c | 21 +++++++++++++++------
>  src/util/list.h                        |  5 +++++
>  2 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> index 6a9b005..5235aa6 100644
> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
> @@ -281,7 +281,7 @@ cached_bo_for_size(struct brw_bufmgr *bufmgr,
>     assert(!(busy && zeroed));
>  
>     if(bucket != NULL && !list_empty(&bucket->head)) {
> -      struct brw_bo *bo;
> +      struct brw_bo *bo, *temp_bo;
>  retry:
>        bo = NULL;
>  
> @@ -292,8 +292,13 @@ retry:
>            * asked us to zero the buffer, we don't want this
>            * because we are going to mmap it.
>            */
> -         bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
> -         list_del(&bo->head);
> +         LIST_FOR_EACH_ENTRY_REV(temp_bo, &bucket->head, head) {
> +            if (temp_bo->size >= size) {
> +               bo = temp_bo;
> +               list_del(&bo->head);
> +               break;
> +            }
> +         }
>        } else {
>           /* For non-render-target BOs (where we're probably
>            * going to map it first thing in order to fill it
> @@ -302,9 +307,13 @@ retry:
>            * allocating a new buffer is probably faster than
>            * waiting for the GPU to finish.
>            */
> -         bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
> -         if (!brw_bo_busy(bo)) {
> -            list_del(&bo->head);
> +         LIST_FOR_EACH_ENTRY(temp_bo, &bucket->head, head) {
> +            if (temp_bo->size >= size &&
> +                !brw_bo_busy(temp_bo)) {
> +               bo = temp_bo;
> +               list_del(&bo->head);
> +               break;
> +            }
>           }
>        }
>  
> diff --git a/src/util/list.h b/src/util/list.h
> index 6edb750..9362072 100644
> --- a/src/util/list.h
> +++ b/src/util/list.h
> @@ -189,6 +189,11 @@ static inline void list_validate(struct list_head *list)
>  	&pos->member != (head);						\
>  	pos = container_of(pos->member.next, pos, member))
>  
> +#define LIST_FOR_EACH_ENTRY_REV(pos, head, member)                      \
> +   for (pos = NULL, pos = container_of((head)->prev, pos, member);      \
> +        &pos->member != (head);                                         \
> +        pos = container_of(pos->member.prev, pos, member))
> +
>  #define LIST_FOR_EACH_ENTRY_SAFE(pos, storage, head, member)	\
>     for (pos = NULL, pos = container_of((head)->next, pos, member),	\
>  	storage = container_of(pos->member.next, pos, member);	\
> 

Hi James,

I don't particularly like the idea of introducing linear linked list
walks to find a buffer.  It's a really nice property to have every
buffer in the bucket be suitable and be able to grab one immediately.
It sounds like purging things more often in patch 4 reduces the cost of
this new list walk, but...it'd be nice to not add it in the first place.

This also conflicts with my new VMA allocator approach a lot, which
assumes that all buffers in a bucket are the same size...

Still, you've shown some pretty nice memory savings.  Scott and I were
wondering if it would be possible to achieve a similar effect by tuning
the bucket sizes - either adding more, or adjusting some.  The sizes we
have are pretty arbitrary - we started with powers of two, and then
added more intermediate sizes back in 2010 to reduce memory waste.

For example, 1920x1080 RGBA8888 Y-tiled buffers are going to be really
common, and should be 8,355,840 bytes...but the nearest bucket size is
8,388,608, so we waste 32KB.  We might want to make sure there's a
bucket at that exact size.

I've wondered whether we should just use a geometric series for the
bucket sizes, rather than the complex bucket_for_size() calculation
we have now.  We might also want to drop some of the huge ones...

--Ken
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20180510/567e2a04/attachment.sig>