<div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Tue, Dec 11, 2018 at 5:31 PM Rafael Antognolli <<a href="mailto:rafael.antognolli@intel.com">rafael.antognolli@intel.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Dec 10, 2018 at 11:10:02PM -0600, Jason Ekstrand wrote:<br>
> <br>
> <br>
> On Mon, Dec 10, 2018 at 5:48 PM Rafael Antognolli <<a href="mailto:rafael.antognolli@intel.com" target="_blank">rafael.antognolli@intel.com</a>><br>
> wrote:<br>
> <br>
> On Mon, Dec 10, 2018 at 04:56:40PM -0600, Jason Ekstrand wrote:<br>
> > On Fri, Dec 7, 2018 at 6:06 PM Rafael Antognolli <<br>
> <a href="mailto:rafael.antognolli@intel.com" target="_blank">rafael.antognolli@intel.com</a>><br>
> > wrote:<br>
> ><br>
> > This commit tries to rework the code that split and returns chunks<br>
> back<br>
> > to the state pool, while still keeping the same logic.<br>
> ><br>
> > The original code would get a chunk larger than we need and split it<br>
> > into pool->block_size. Then it would return all but the first one,<br>
> and<br>
> > would split that first one into alloc_size chunks. Then it would keep<br>
> > the first one (for the allocation), and return the others back to the<br>
> > pool.<br>
> ><br>
> > The new anv_state_pool_return_chunk() function will take a chunk<br>
> (with<br>
> > the alloc_size part removed), and a small_size hint. It then splits<br>
> that<br>
> > chunk into pool->block_size'd chunks, and if there's some space still<br>
> > left, split that into small_size chunks. small_size in this case is<br>
> the<br>
> > same size as alloc_size.<br>
> ><br>
> > The idea is to keep the same logic, but make it in a way we can reuse<br>
> it<br>
> > to return other chunks to the pool when we are growing the buffer.<br>
> > ---<br>
> > src/intel/vulkan/anv_allocator.c | 147<br>
> +++++++++++++++++++++----------<br>
> > 1 file changed, 102 insertions(+), 45 deletions(-)<br>
> ><br>
> > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/<br>
> > anv_allocator.c<br>
> > index 31258e38635..bddeb4a0fbd 100644<br>
> > --- a/src/intel/vulkan/anv_allocator.c<br>
> > +++ b/src/intel/vulkan/anv_allocator.c<br>
> > @@ -994,6 +994,97 @@ anv_state_pool_get_bucket_size(uint32_t bucket)<br>
> > return 1 << size_log2;<br>
> > }<br>
> ><br>
> > +/** Helper to create a chunk into the state table.<br>
> > + *<br>
> > + * It just creates 'count' entries into the state table and update<br>
> their<br>
> > sizes,<br>
> > + * offsets and maps, also pushing them as "free" states.<br>
> > + */<br>
> > +static void<br>
> > +anv_state_pool_return_blocks(struct anv_state_pool *pool,<br>
> > + uint32_t chunk_offset, uint32_t count,<br>
> > + uint32_t block_size)<br>
> > +{<br>
> > + if (count == 0)<br>
> > + return;<br>
> > +<br>
> > + uint32_t st_idx = anv_state_table_add(&pool->table, count);<br>
> > + for (int i = 0; i < count; i++) {<br>
> > + /* update states that were added back to the state table */<br>
> > + struct anv_state *state_i = anv_state_table_get(&pool->table,<br>
> > + st_idx + i);<br>
> > + state_i->alloc_size = block_size;<br>
> > + state_i->offset = chunk_offset + block_size * i;<br>
> > + struct anv_pool_map pool_map = anv_block_pool_map(&pool-><br>
> block_pool,<br>
> > + state_i-><br>
> offset);<br>
> > + state_i->map = pool_map.map + pool_map.offset;<br>
> > + }<br>
> > +<br>
> > + uint32_t block_bucket = anv_state_pool_get_bucket(block_size);<br>
> > + anv_state_table_push(&pool->buckets[block_bucket].free_list,<br>
> > + &pool->table, st_idx, count);<br>
> > +}<br>
> > +<br>
> > +static uint32_t<br>
> > +calculate_divisor(uint32_t size)<br>
> > +{<br>
> > + uint32_t bucket = anv_state_pool_get_bucket(size);<br>
> > +<br>
> > + while (bucket >= 0) {<br>
> > + uint32_t bucket_size = anv_state_pool_get_bucket_size(bucket);<br>
> > + if (size % bucket_size == 0)<br>
> > + return bucket_size;<br>
> > + }<br>
> > +<br>
> > + return 0;<br>
> > +}<br>
> > +<br>
> > +/** Returns a chunk of memory back to the state pool.<br>
> > + *<br>
> > + * If small_size is zero, we split chunk_size into pool-><br>
> block_size'd<br>
> > pieces,<br>
> > + * and return those. If there's some remaining 'rest' space<br>
> (chunk_size is<br>
> > not<br>
> > + * divisble by pool->block_size), then we find a bucket size that is<br>
> a<br>
> > divisor<br>
> > + * of that rest, and split the 'rest' into that size, returning it<br>
> to the<br>
> > pool.<br>
> > + *<br>
> > + * If small_size is non-zero, we use it in two different ways:<br>
> > + * * if it is larger than pool->block_size, we split the chunk<br>
> into<br>
> > + * small_size'd pieces, instead of pool->block_size'd ones.<br>
> > + * * we also use it as the desired size to split the 'rest' after<br>
> we<br>
> > split<br>
> > + * the bigger size of the chunk into pool->block_size;<br>
> ><br>
> ><br>
> > This seems both overly complicated and not really what we want. If I<br>
> have a<br>
> > block size of 8k and allocate a single 64-byte state and then a 8k state,<br>
> it<br>
> > will break my almost 8k of padding into 511 64-byte states and return<br>
> those<br>
> > which may be very wasteful if the next thing I do is allocate a 1K state.<br>
> <br>
> Good, this would definitely be a waste.<br>
> <br>
> > It also doesn't provide the current alignment guarantees that all states<br>
> are<br>
> > aligned to their size. While the alignment guarantee doesn't matter for<br>
> most<br>
> > large states, it does matter for some of the smaller sizes. Now that I<br>
> look at<br>
> > it in detail, it appears that the highest alignment we ever require is<br>
> 64B and<br>
> > the smallest size we allow is 64B so maybe it just doesn't matter?<br>
> ><br>
> > Assuming we don't care about alignments, we could do something like this?<br>
> ><br>
> > if (small_size > 0) {<br>
> > assert(chunk_size % small_size == 0);<br>
> > anv_state_pool_return_blocks(pool, chunk_offset, chunk_size /<br>
> small_size,<br>
> > small_size);<br>
> > } else {<br>
> > uint32_t divisor = MAX_STATE_SIZE;<br>
> > while (divisor >= MIN_STATE_SIZE) {<br>
> > uint32_t nblocks = chunk_size / divisor;<br>
> > if (nblocks > 0) {<br>
> > anv_state_pool_return_blocs(pool, chunk_offset, nblocks,<br>
> divisor);<br>
> > chunk_offset += nblocks * divisor;<br>
> > chunk_size -= nblocks * divisor;<br>
> > }<br>
> > }<br>
> > }<br>
> <br>
> The code above is indeed way simpler and it does seem to achieve<br>
> what we want for the "return padding" case.<br>
> <br>
> However, we also use this helper for returning blocks that we got from<br>
> the freelist, but were only partially used. For instance if we need to<br>
> allocate a state of size 64 bytes, and we have a block of 8KB in the<br>
> freelist, due to the snippet above (small_size == 64) we will end up<br>
> splitting it into 511 64-byte states too.<br>
> <br>
> Maybe (if we want to keep the old semantics), in the case where<br>
> small_size > 0, we have to do something like:<br>
> <br>
> if (small_size > 0) {<br>
> assert(chunk_size % small_size == 0);<br>
> if (chunk_size > pool->block_size) {<br>
> assert(chunk_size % pool->block_size == 0);<br>
> uint32_t nblocks = chunk_size / pool->block_size - 1;<br>
> anv_state_pool_return_blocks(pool, offset, nblocks, pool-><br>
> block_size);<br>
> chunk_size -= nblocks * pool->block_size;<br>
> offset += nblocks * pool->block_size;<br>
> }<br>
> anv_state_pool_return_blocks(pool, chunk_offset, chunk_size /<br>
> small_size, small_size);<br>
> } else {<br>
> ... your else clause here<br>
> }<br>
> <br>
> Maybe it's over complicating things again... what do you think?<br>
> <br>
> <br>
> Maybe? Or maybe we want to just keep the block_size semantics for everything<br>
> and split padding into some small chunks as needed and then a bunch of<br>
> block_size chunks. If we did that, we'd have the same semantics for<br>
> everything.<br>
<br>
Just to be sure, how does something like this look:<br>
<br>
uint32_t divisor = MAX2(pool->block_size, small_size);<br>
uint32_t nblocks = chunk_size / divisor;<br>
uint32_t rest = chunk_size - nblocks * divisor;<br>
<br>
/* First return pool->block_size'd chunks.*/<br>
uint32_t offset = chunk_offset + rest;<br>
anv_state_pool_return_blocks(pool, offset, nblocks, divisor);<br></blockquote><div><br></div><div>oh.... That's clever. We can't assume that the start is aligned but we can always assume that the end is aligned. I like it! Probably deserves a comment and an assert though. :-)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
chunk_size = rest;<br>
<br>
uint32_t b = anv_state_pool_get_bucket(divisor);<br></blockquote><div><br></div><div>I think we want divisor to start off at small_size here if small_size > 0 && small_size < bucket_size.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
while (chunk_size > 0 && b >= ANV_MIN_STATE_SIZE_LOG2) {<br>
divisor = anv_state_pool_get_bucket_size(b);<br>
nblocks = chunk_size / divisor;<br>
rest = chunk_size - nblocks * divisor;<br>
if (nblocks > 0) {<br>
anv_state_pool_return_blocks(pool, chunk_offset + rest,<br>
nblocks, divisor);<br>
chunk_size = rest;<br>
}<br>
b--;<br>
}<br>
<br>
I'm also wondering if it's worth leaving the "small_size" portion of it,<br>
and if so I probably need a better name for it.<br></blockquote><div><br></div><div>Yeah, I think this is more-or-less what we want. Not sure if I like decrementing b or just doing "divisor /= 2" better; probably doesn't matter.<br></div><div><br></div><div>--Jason<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> <br>
<br>
<snip><br>
<br>
> <br>
> <br>
> I think what I'd recommend is that you pull the first helper in this patch into<br>
> it's own patch right before patch 2 and use it replace the two in-line copies<br>
> here. Then the API wouldn't look nearly as horrible and this could focus on<br>
> the heuristic for dealing with padding (and maybe even be rolled into the<br>
> padding patch).<br>
<br>
Cool, done locally already, thanks!<br>
<br>
> <br>
> > - chunk_size = pool->block_size;<br>
> > - }<br>
> > -<br>
> > - assert(chunk_size % alloc_size == 0);<br>
> > - uint32_t push_back = (chunk_size / alloc_size) - 1;<br>
> > - uint32_t st_idx = anv_state_table_add(&pool->table,<br>
> push_back);<br>
> > - for (int i = 0; i < push_back; i++) {<br>
> > - /* update states that were added back to the state table<br>
> */<br>
> > - struct anv_state *state_i = anv_state_table_get(&pool-><br>
> table,<br>
> > - st_idx +<br>
> i);<br>
> > - state_i->alloc_size = alloc_size;<br>
> > - state_i->offset = chunk_offset + alloc_size * (i + 1);<br>
> > - struct anv_pool_map pool_map = anv_block_pool_map(&<br>
> pool-><br>
> > block_pool,<br>
> > - <br>
> state_i-><br>
> > offset);<br>
> > - state_i->map = pool_map.map + pool_map.offset;<br>
> > - }<br>
> > - anv_state_table_push(&pool->buckets[bucket].free_list,<br>
> > - &pool->table, st_idx, push_back);<br>
> > -<br>
> > - offset = chunk_offset;<br>
> > + uint32_t return_offset = chunk_offset + alloc_size;<br>
> > + uint32_t return_size = chunk_size - alloc_size;<br>
> > + anv_state_pool_return_chunk(pool, return_offset,<br>
> > + return_size, alloc_size);<br>
> > goto done;<br>
> > }<br>
> > }<br>
> > --<br>
> > 2.17.1<br>
> ><br>
> > _______________________________________________<br>
> > mesa-dev mailing list<br>
> > <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
> > <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
> ><br>
> <br>
</blockquote></div></div>