[Intel-gfx] [RFC] libdrm_intel: Rework BO allocs to avoid rounding up to bucket size

Chris Wilson chris at chris-wilson.co.uk
Fri Aug 29 13:18:45 CEST 2014


On Fri, Aug 29, 2014 at 11:45:13AM +0100, Siluvery, Arun wrote:
> On 29/08/2014 11:16, Chris Wilson wrote:
> >On Fri, Aug 29, 2014 at 11:02:01AM +0100, Arun Siluvery wrote:
> >As a corollary to exact allocations, you can then reduce the number of
> >buckets again (the number was increased to allow finer-grained
> >allocations). Again, it is hard to judge whether handing back larger
> >objects will lead to memory wastage. So yet another statistic to track
> >is requested versus allocated memory sizes.
> >
> Reducing number of buckets would lead to more wastage of memory right?

That depends upon the distribution of sizes and allocation patterns. But
given a uniformly random pattern, the average size in each bucket would
go up. On the other hand, it should improve hit rates, which is crucial
for keeping memory on the GPU.
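
To put a rough number on the wastage side of that trade-off, here is a
minimal sketch. It assumes every page-aligned request size up to 256K is
equally likely, and compares the bucket sizes listed in this thread with a
hypothetical coarser set of power-of-two buckets. The function names are
made up for illustration and are not libdrm API; hit rates are not modelled
at all.

```python
PAGE = 4096


def fine_buckets(max_size):
    """Sizes in the pattern listed in this thread:
    4K, 8K, 12K, then four steps per power of two from 16K up."""
    sizes = [1 * PAGE, 2 * PAGE, 3 * PAGE]
    size = 4 * PAGE
    while size <= max_size:
        for quarter in range(4):
            sizes.append(size + quarter * size // 4)
        size *= 2
    return [s for s in sizes if s <= max_size]


def coarse_buckets(max_size):
    """Hypothetical reduced set: one bucket per power of two."""
    sizes = []
    size = PAGE
    while size <= max_size:
        sizes.append(size)
        size *= 2
    return sizes


def average_waste(buckets, max_size):
    """Mean bytes lost to rounding up, over every page-aligned
    request size from one page to max_size (assumed equally likely)."""
    requests = range(PAGE, max_size + 1, PAGE)
    waste = 0
    for request in requests:
        bucket = min(b for b in buckets if b >= request)
        waste += bucket - request
    return waste / len(requests)


MAX_SIZE = 256 * 1024
fine_avg = average_waste(fine_buckets(MAX_SIZE), MAX_SIZE)
coarse_avg = average_waste(coarse_buckets(MAX_SIZE), MAX_SIZE)
print(f"fine:   {fine_avg / 1024:.2f}K wasted per allocation")
print(f"coarse: {coarse_avg / 1024:.2f}K wasted per allocation")
```

Real workloads are nowhere near uniform, so this only bounds the rounding
cost; it says nothing about how much the better hit rate buys back.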
 
> The current bucket sizes are,
> Bucket[0]: 4K
> Bucket[1]: 8K
> Bucket[2]: 12K
> Bucket[3]: 16K
> Bucket[4]: 20K
> Bucket[5]: 24K
> Bucket[6]: 28K
> Bucket[7]: 32K
> Bucket[8]: 40K
> Bucket[9]: 48K
> Bucket[10]: 56K
> Bucket[11]: 64K
> Bucket[12]: 80K
> Bucket[13]: 96K
> Bucket[14]: 112K
> Bucket[15]: 128K
> Bucket[16]: 160K
> Bucket[17]: 192K
> Bucket[18]: 224K
> Bucket[19]: 256K
> ...
> ...
> 
> If there are more objects with size 132K we would end up allocating
> 160K.

But that's what you are changing, so I was wondering how the two
combined. (a) reduce the size of each fresh allocation, (b) allow reuse
of larger buffers.
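
As a way of seeing how (a) and (b) interact, here is a toy model. It is
only a sketch under my own assumptions: buffers are represented by their
size alone, BoCache and its methods are hypothetical names rather than
libdrm functions, and the reuse rule (take the first cached buffer in the
bucket that is large enough) is one possible policy, not necessarily what
the patch does.

```python
import bisect

PAGE = 4096
KIB = 1024
# Bucket sizes as listed earlier in this thread, up to 256K.
BUCKETS = [k * KIB for k in (4, 8, 12, 16, 20, 24, 28, 32, 40, 48, 56, 64,
                             80, 96, 112, 128, 160, 192, 224, 256)]


def page_align(size):
    """Round a byte count up to a whole number of pages."""
    return (size + PAGE - 1) // PAGE * PAGE


def bucket_for(size):
    """Smallest listed bucket that can hold size, or None if too large."""
    i = bisect.bisect_left(BUCKETS, size)
    return BUCKETS[i] if i < len(BUCKETS) else None


class BoCache:
    """Toy cache: each bucket holds the sizes of the free buffers in it."""

    def __init__(self):
        self.free = {b: [] for b in BUCKETS}

    def alloc(self, request):
        """Return (buffer_size, reused)."""
        size = page_align(request)
        bucket = bucket_for(size)
        if bucket is not None:
            cached = self.free[bucket]
            for i, have in enumerate(cached):
                if have >= size:          # (b) reuse a larger cached buffer
                    return cached.pop(i), True
        return size, False                # (a) fresh, exact-size allocation

    def release(self, size):
        """Return a buffer to its bucket (uncacheable sizes are dropped)."""
        bucket = bucket_for(size)
        if bucket is not None:
            self.free[bucket].append(size)


cache = BoCache()
fresh = cache.alloc(132 * KIB)       # nothing cached: exact 132K, not 160K
cache.release(140 * KIB)             # a 140K buffer lands in the 160K bucket
reused = cache.alloc(132 * KIB)      # the 140K buffer satisfies a 132K request
cache.release(132 * KIB)
too_small = cache.alloc(140 * KIB)   # a cached 132K buffer cannot serve 140K
```

The last case is the interesting one: with exact sizes, a bucket can hold
buffers that are too small for a later request in the same bucket, so the
hit rate depends on the mix of sizes and not just on the bucket count.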

> We can track requested vs allocated but that depends on the
> application and usage, what would be the best measure to track this?
> I mean we measure over a given time or any other criteria?

We only keep freed buffers around in the cache for 1s. Hmm, knowing the
allocation patterns for both the frequently reallocated and the fairly
static sets would help. For short-lived allocations, overallocation is
fine, and vice versa. And that should essentially guide the assignment of
bucket sizes.
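
One possible shape for that statistic, as a minimal sketch: accumulate
requested and allocated bytes separately for short-lived and long-lived
buffers. The AllocStats and Totals names, the record() signature, and the
use of 1s as the short/long threshold are all assumptions for illustration;
nothing like this exists in libdrm as far as this thread goes.

```python
from dataclasses import dataclass


@dataclass
class Totals:
    requested: int = 0   # bytes the caller asked for
    allocated: int = 0   # bytes actually handed back

    def overallocation(self):
        """Fraction of extra memory relative to what was requested."""
        if self.requested == 0:
            return 0.0
        return (self.allocated - self.requested) / self.requested


class AllocStats:
    def __init__(self, short_lived_s=1.0):
        # Assumed threshold: buffers freed sooner than this count as
        # short-lived, everything else as the fairly static set.
        self.short_lived_s = short_lived_s
        self.short = Totals()
        self.long = Totals()

    def record(self, requested, allocated, lifetime_s):
        """Call once per buffer, when it is freed and its lifetime is known."""
        totals = self.short if lifetime_s < self.short_lived_s else self.long
        totals.requested += requested
        totals.allocated += allocated


KIB = 1024
stats = AllocStats()
stats.record(132 * KIB, 160 * KIB, lifetime_s=0.01)  # churned every frame
stats.record(132 * KIB, 160 * KIB, lifetime_s=30.0)  # long-lived
stats.record(64 * KIB, 64 * KIB, lifetime_s=30.0)    # long-lived, exact fit
print(f"short-lived overallocation: {stats.short.overallocation():.1%}")
print(f"long-lived overallocation:  {stats.long.overallocation():.1%}")
```

The long-lived figure is the one that matters for wastage; the short-lived
one mostly tells you how much slack the cache can afford to hand out.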

What I've done in the past is keep a global pool of used pages to avoid
clflushing fresh allocations in the kernel. That runs afoul of the ABI
requirement that we scrub new bos. With a create2 we could specify a pool
to use that would be filp-private and so leak no more information than the
current reuse does. It will probably still be frowned upon, but it is a fun
idea to pursue...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


