[Glamor] glamor: Changes to 'master'

Thu Jul 12 03:52:01 PDT 2012

> -----Original Message-----
> From:
> glamor-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.org
> [mailto:glamor-bounces+zhigang.gong=linux.intel.com at lists.freedesktop.o
> rg] On Behalf Of Michel D?nzer
> Sent: Thursday, July 12, 2012 5:45 PM
> To: Zhigang Gong
> Cc: glamor at lists.freedesktop.org
> Subject: Re: [Glamor] glamor: Changes to 'master'
> 
> On Don, 2012-07-12 at 15:51 +0800, Zhigang Gong wrote:
> > On Wed, Jul 11, 2012 at 01:00:12PM +0200, Michel Dänzer wrote:
> > > On Mit, 2012-07-11 at 00:59 -0700, Zhigang Gong wrote:
> > > >
> > > > commit 593bd206bf2794dc1450e8e7d20142b8d0ca074b
> > > > Author: Zhigang Gong <zhigang.gong at linux.intel.com>
> > > > Date:   Tue Jun 26 15:39:24 2012 +0800
> > > >
> > > >     glamor_render: Don't allocate buffer for vbo each time.
> > > >
> > > >     We can reuse the last one if the last one is big enough
> > > >     to contain current vertext data. In the meantime, Use
> > > >     MapBufferRange instead of MapBuffer.
> > > >
> > > >     Testing shows, this patch brings some benefit for
> > > >     aa10text/rgb10text. Not too much, but indeed faster.
> > > >
> > > >     Signed-off-by: Zhigang Gong <zhigang.gong at linux.intel.com>
> > >
> > > This commit decreases glyph throughput by about a factor of 4 with
> the
> > > r600g driver on an RS880 laptop.
> > >
> > > AFAICT the new code always maps the VBO from offset 0, which I'm not
> > > sure can be expected to perform well. However, I've tried sliding the
> > > mapping range over a larger VBO, and it doesn't seem to help. Maybe
> the
> > > r600g driver doesn't handle this kind of usage very well yet.
> >
> > Without this commit, each time we start to flush the compositing, we
> > need to call glBufferData(GL_ARRAY_BUFFER,..) to allocate resources
> > on GPU each time. With this commit, we will allocate new buffer only
> > if current required buffer if larger than the allocated buffer. If
current
> > required buffer can be fit into the allocated buffer, we only need to
map
> > the needed range to vma.
> 
> The problem is that this commit always reuses the same range of the VBO
> that was previously used by glDrawElements. This forces the GL
> implementation to either copy the vertex data before submitting it to
> the GPU, or flush the previous rendering commands and wait for them to
> finish. Ideally, it should be possible to write the vertex data to the
> VBO once and have the GPU consume it without any intervening CPU
> copies
> or waits for GPU idle.
> 
> From my experience with EXA drivers, a good way to achieve that is to
> allocate a VBO large enough to keep the VBO allocation overhead low and
> to keep appending vertices to it until they no longer fit, at which
> point the next VBO needs to be allocated.
Agree, if two flushs are very close to each other, then reuse the same
vbo buffer may block the second flush and wait for the first to complete.

> 
> The patch below implements something like that. The GL API usage and
> VBO
> size could probably use some more tuning, but this already performs
> better for me than reverting commit
> 593bd206bf2794dc1450e8e7d20142b8d0ca074b or disabling VBO usage
> with
> your patch. How does it work for you?
It works fine. I will merge it latter, Thanks!
> 
> 
> > > Still, how would you feel about reverting this change for now?
> > You can see that this commit does save the time to create new vertex
> > buffers on GPU side each time. And on Intel's platform, this commit
> > can get about 3% gain on the x11perf -aa10text. So IMO, it's not a
> > good idea to revert it.
> 
> As a side note, a 3% gain on one platform vs. a 75% loss on another
> platform seems like a no-brainer to me. :)
> 
> 
> > I found an alternative way for yor, I added a new patch to disable the
vbo
> > usage at all. And we allocated vb array in system memory directly.
Please
> > check the below patch out and test on your platform. You just need to
> > change the use_vbo to 0 at glamor_priv.h. Does that work for you?
> 
> It does provide the same performance as reverting commit
> 593bd206bf2794dc1450e8e7d20142b8d0ca074b on this machine.
> However, again
> I'm not sure this allows the GL implementation to avoid CPU copies of
> the vertex data, which could be an issue with faster GPUs.
> 
> 
> commit c47545d0217b947155d90481e00f9fbbe3072bd4
> Author: Michel Dänzer <michel.daenzer at amd.com>
> Date:   Thu Jul 12 11:34:54 2012 +0200
> 
>     Stream vertex data to VBOs.
> 
> diff --git a/src/glamor_render.c b/src/glamor_render.c
> index d986e9a..966d7e4 100644
> --- a/src/glamor_render.c
> +++ b/src/glamor_render.c
> @@ -725,16 +725,6 @@ glamor_setup_composite_vbo(ScreenPtr screen,
> int n_verts)
>  	    glamor_get_screen_private(screen);
>  	glamor_gl_dispatch *dispatch;
>  	int vert_size;
> -	Bool need_new_buffer = FALSE;
> -
> -	glamor_priv->vbo_offset = 0;
> -	glamor_priv->render_nr_verts = 0;
> -	vert_size = n_verts * sizeof(float) * 2;
> -
> -	if (glamor_priv->vbo_size < vert_size) {
> -		glamor_priv->vbo_size = vert_size;
> -		need_new_buffer = TRUE;
> -	}
> 
>  	glamor_priv->vb_stride = 2 * sizeof(float);
>  	if (glamor_priv->has_source_coords)
> @@ -742,17 +732,26 @@ glamor_setup_composite_vbo(ScreenPtr
> screen, int n_verts)
>  	if (glamor_priv->has_mask_coords)
>  		glamor_priv->vb_stride += 2 * sizeof(float);
> 
> +	vert_size = n_verts * glamor_priv->vb_stride;
> +
>  	dispatch = glamor_get_dispatch(glamor_priv);
>  	dispatch->glBindBuffer(GL_ARRAY_BUFFER, glamor_priv->vbo);
>  	if (glamor_priv->gl_flavor == GLAMOR_GL_DESKTOP) {
> -		if (need_new_buffer)
> +		if (glamor_priv->vbo_size < (glamor_priv->vbo_offset +
> vert_size)) {
> +			glamor_priv->vbo_size =
> GLAMOR_COMPOSITE_VBO_VERT_CNT *
> +				glamor_priv->vb_stride;
> +			glamor_priv->vbo_offset = 0;
>  			dispatch->glBufferData(GL_ARRAY_BUFFER,
> -					       vert_size,
> -					       NULL, GL_DYNAMIC_DRAW);
> -		glamor_priv->vb =
> dispatch->glMapBufferRange(GL_ARRAY_BUFFER, 0,
> -						vert_size,
> -						GL_MAP_READ_BIT |
GL_MAP_WRITE_BIT);
> +					       glamor_priv->vbo_size,
> +					       NULL, GL_STREAM_DRAW);
> +		}
> +
> +		glamor_priv->vb =
> dispatch->glMapBufferRange(GL_ARRAY_BUFFER,
> +
glamor_priv->vbo_offset,
> +							     vert_size,
> +
GL_MAP_WRITE_BIT |
> GL_MAP_UNSYNCHRONIZED_BIT);
>  		assert(glamor_priv->vb != NULL);
> +		glamor_priv->vb -= glamor_priv->vbo_offset;
>  	}
>  	dispatch->glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,
> glamor_priv->ebo);
> 
> @@ -1428,6 +1427,7 @@ glamor_composite_with_shader(CARD8 op,
>  								&key,
shader,
>  								&op_info);
>  		}
> +		glamor_priv->render_nr_verts = 0;
>  	}
> 
>  	dispatch->glBindBuffer(GL_ARRAY_BUFFER, 0);
> 
> --
> Earthling Michel Dänzer           |
> http://www.amd.com
> Libre software enthusiast         |          Debian, X and DRI
> developer
> _______________________________________________
> Glamor mailing list
> Glamor at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/glamor