[Mesa-stable] [Mesa-dev] [PATCH 1/1] i965: Make sure the shadow buffers have enough space
James Xiong
james.xiong at intel.com
Fri Apr 13 21:37:05 UTC 2018
On Fri, 13 Apr 2018 14:33:09 -0700
Kenneth Graunke <kenneth at whitecape.org> wrote:
> On Friday, April 13, 2018 2:08:40 PM PDT James Xiong wrote:
> > On Fri, 13 Apr 2018 13:51:02 -0700
> > Kenneth Graunke <kenneth at whitecape.org> wrote:
> >
> > > On Friday, April 13, 2018 1:35:45 PM PDT Kenneth Graunke wrote:
> > > > On Monday, April 9, 2018 4:06:16 PM PDT James Xiong wrote:
> > > > > From: "Xiong, James" <james.xiong at intel.com>
> > > > >
> > > > > On non-LLC platforms, we malloc shadow batch/state buffers
> > > > > of the same sizes as our batch/state buffers' GEM allocations.
> > > > > However the buffer allocator reuses similar-sized gem objects,
> > > > > it returns a buffer larger than we asked for in some cases
> > > > > and we end up with smaller shadow buffers. If we utilize the
> > > > > full-size of the over-allocated batch/state buffers, we may
> > > > > wind up accessing beyond the bounds of the shadow buffers and
> > > > > cause segmentation fault and/or memory corruption.
> > > >
> > > > Oh, good catch! We do indeed malloc too little if the bufmgr
> > Thank you for taking time to review, Kenneth.
> > > >
> > > > > A few examples:
> > > > > case batch state
> > > > > request bo shadow request bo shadow
> > > > > init 0 20K 20K 20K 16K 16K 16K
> > > > > grow_buffer 1 30K 32K 30K 24K 24K 24K
> > > > > grow_buffer 2 48K 48K 48K 36K 40K 36K
> > > > > grow_buffer 3 72K 80K 72K 60K 64K 60K
> > > > > grow_buffer 4 120K 128K 120K - - -
> > > > >
> > > > > batch #1, #3, #4; state #2 and #3 are problematic. We can
> > > > > change the order to allocate the bo first, then allocate the
> > > > > shadow buffer using the bo's size so that the shadow buffer
> > > > > have at least an equivalent size of the gem allocation.
> > > > >
> > > > > Another problem: even though the state/batch buffer could
> > > > > grow, when checking if it runs out space, we always compare
> > > > > with the initial batch/state sizes. To utilize the entire
> > > > > buffers, change to compare with the actual sizes.
> > > >
> > > > This is actually intentional. Our goal is to flush batches
> > > > when the amount of commands or state reaches those thresholds.
> > > > Occasionally, we'll be in the middle of emitting a draw, and
> > > > unable to stop. In that case, we grow the batch and keep
> > > > going. But after that, we're beyond our original target, so we
> > > > flush next time. We don't want to grow without bounds...it's
> > > > meant more for emergencies, or if we've badly estimated the
> > > > size of the draw call.
> > I am not sure I get it. Let me give an example: the state buffer
> > gets grown once from 16K to 24K in brw_state_batch(), the used_size
> > becomes 20K, then brw_require_statebuffer_space(1024) gets called to
> > ask for 1K space, with the original logical, it compares the used
> > size with 16K and flush the batch even though the state buffer
> > still has 4K space available?
>
> Yes, the idea is to flush at around 16K of state. If we happen to be
> in the middle of a draw and run out of space, we'll grow to 24K.
> Once it's over 16K, we flush as soon as we can.
>
> We'd like to be fairly consistent on our batch size. Running larger
> batches can lead to differences in performance, and large batches can
> lead to degradation in the interactivity of the system (especially on
> GPUs without preemption).
>
> The hope is to grow once, at most. If we check against the BO size,
> we might grow repeatedly, which would lead to really large batches and
> things would get out of hand.
I see, thanks for the explanation, Kenneth.
>
> > > >
> > > > I've sent a simpler patch which I think should hopefully fix
> > > > your bug: https://patchwork.freedesktop.org/patch/217107/
> > >
> > > Lionel noticed that I botched that patch. Here's an actual one:
> > >
> > > https://patchwork.freedesktop.org/patch/217108/
> > Yes it will fix the existing bug. However the assumption here is
> > that the init allocation size will NOT be rounded up as it happens
> > to be the bucket size.
> > I am working on an optimization to improve memory usage(that's how
> > I find out this bug), this assumption is no longer true.
> > Essentially the bufmgr could return a buffer with the same or
> > larger size whether it is same as the bucket's or not. Anyway I
> > guess I can send the fix later along with the optimization
> > patches.
>
> Ah, that's a good point. Your patch also tries to use the BO size
> for the initial malloc as well, which is a good idea...
So what do you want? you want me to change this patch and sent for
review or take yours for now.
More information about the mesa-stable
mailing list