[Libva] performance question with libva on i5 Sandy Bridge

Gilles Chanteperdrix gilles.chanteperdrix at xenomai.org
Tue Jan 20 23:46:11 PST 2015


On Wed, Jan 21, 2015 at 02:32:34PM +0800, Xiang, Haihao wrote:
> On Mon, 2015-01-19 at 07:37 +0100, Gilles Chanteperdrix wrote:
> > On Mon, Jan 19, 2015 at 10:34:37AM +0800, Xiang, Haihao wrote:
> > > On Mon, 2015-01-19 at 00:20 +0100, Gilles Chanteperdrix wrote:
> > > > Hi,
> > > > 
> > > > I am testing libva with ffmpeg on Linux to decode h264 video. 
> > > > Linux version is 3.4.29 
> > > > FFmpeg version is 2.5.3 
> > > > Mesa version is 10.4.0 
> > > > libdrm version is 2.4.58 
> > > > libva version is 1.5.0
> > > > 
> > > > From what I could gather from the documentation and examples, using
> > > > vaDeriveImage should be preferred if it is available. However, I
> > > > have compared, with top, the CPU consumed, and I observe that the
> > > > following code:
> > > > 
> > > > #ifdef USE_VADERIVEIMAGE
> > > > 	vrc = vaDeriveImage(ctx->display, buf->surface_id, &va_image);
> > > > 	CHECK_VASTATUS(vrc, "vaDeriveImage");
> > > > #else
> > > > 	vrc = vaGetImage(ctx->display, buf->surface_id,
> > > > 			0, 0, cctx->coded_width, cctx->coded_height,
> > > > 			va_image.image_id);
> > > > 	CHECK_VASTATUS(vrc, "vaGetImage");
> > > > #endif
> > > > 
> > > > 	vrc = vaMapBuffer(ctx->display, va_image.buf, &data);
> > > > 	CHECK_VASTATUS(vrc, "vaMapBuffer");
> > > > 
> > > > 	memcpy(f->img[0], data + va_image.offsets[0],
> > > > 		va_image.pitches[0] * cctx->coded_height);
> > > > 	memcpy(f->img[1], data + va_image.offsets[1],
> > > > 		va_image.pitches[1] * cctx->coded_height / 2);
> > > > 
> > > > 	vrc = vaUnmapBuffer(ctx->display, va_image.buf);
> > > > 	CHECK_VASTATUS(vrc, "vaUnmapBuffer");
> > > > 
> > > > #ifdef USE_VADERIVEIMAGE
> > > > 	vrc = vaDestroyImage(ctx->display, va_image.image_id);
> > > > 	CHECK_VASTATUS(vrc, "vaDestroyImage");
> > > > #endif
> > > > 
> > > > Results in a higher cpu consumption if compiled with
> > > > USE_VADERIVEIMAGE. Is this normal, or is there something I am doing
> > > > wrong? I can provide the complete code if needed.
> > > 
> > > It depends on the underlying memory format. Most surfaces used in
> > > the driver are tiled, so the derived images are tiled too; the
> > > memory returned is uncached, and reading data from it is slow. If
> > > the image isn't tiled, the returned memory is cached.
> > 
> > Ok. Thanks for the explanation.
> > 
> > Is the result of vaMapBuffer always uncached, or only for
> > a VA image obtained with vaDeriveImage ? 
> 
> You could try the following patch if you want to check whether the
> result of vaMapBuffer() on an image is uncached.
> http://lists.freedesktop.org/archives/libva/attachments/20140617/d9cc4b3c/attachment.bin

The patch applies to the 1.5.0 release with some offset.
With this patch applied, I get the same (bad) performance with or
without using vaDeriveImage.

But after some tests, I have found that the solution that avoids
the most copies when displaying with OpenGL is to use vaCopySurfaceGLX.
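For reference, the GLX path boils down to a handful of calls from
<va/va_glx.h>; roughly (a fragment, not a complete program:
va_display, surface_id and texture are assumed to exist, and error
handling is elided):

```c
#include <va/va_glx.h>

void *gl_surface;

/* Bind an existing GL_TEXTURE_2D to a VA/GLX surface once... */
vaCreateSurfaceGLX(va_display, GL_TEXTURE_2D, texture, &gl_surface);

/* ...then, per frame, wait for decoding and blit into the texture. */
vaSyncSurface(va_display, surface_id);
vaCopySurfaceGLX(va_display, gl_surface, surface_id, 0);

/* On teardown. */
vaDestroySurfaceGLX(va_display, gl_surface);
```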

Unfortunately, it is specific to GLX: while it works fine with the
i965 driver, it does not seem to work with the vdpau-based VA
driver, and I have read yesterday that EGL is "the new thing", so I
am going to look into EGL. If someone has some example code for
running EGL on Linux with the X.Org server, I am interested.
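From what I have gathered so far, the EGL setup against an X11 window
looks roughly like the following. This is only a sketch, not a
complete program: `win` is an existing X window, and all error
checking is elided.

```c
#include <EGL/egl.h>
#include <X11/Xlib.h>

Display *xdpy = XOpenDisplay(NULL);
EGLDisplay dpy = eglGetDisplay((EGLNativeDisplayType)xdpy);
eglInitialize(dpy, NULL, NULL);

static const EGLint cfg_attrs[] = {
	EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
	EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
	EGL_NONE
};
EGLConfig cfg;
EGLint ncfg;
eglChooseConfig(dpy, cfg_attrs, &cfg, 1, &ncfg);

static const EGLint ctx_attrs[] = {
	EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE
};
EGLContext ctx = eglCreateContext(dpy, cfg, EGL_NO_CONTEXT, ctx_attrs);
EGLSurface surf = eglCreateWindowSurface(dpy, cfg,
					 (EGLNativeWindowType)win, NULL);
eglMakeCurrent(dpy, surf, surf, ctx);
```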

-- 
					    Gilles.

