[Intel-gfx] gem clflush optimization for media encoding

Jesse Barnes jbarnes at virtuousgeek.org
Thu Jun 23 19:20:28 CEST 2011


On Wed, 22 Jun 2011 12:29:21 +0800
"Zou, Nanhai" <nanhai.zou at intel.com> wrote:
> 	map_gtt in the current GEM implementation is very slow.
> 	I've tried map_gtt, but the speed is unacceptable.
> 
> >>> 	Since it is a CPU read-only surface, clflush is not needed at all.
> >>
> >>You'd still have to invalidate cache lines using clflush to avoid using
> >>stale data in the CPU cache.
> >>
> >>--
>   Yes, you are right, in this case clflush is still needed to invalidate the CPU cache. 
> 
>   The problem is that we do not know how large the coded output buffer is before we do the encoding.
>   So we have to allocate a large enough gem object before encoding. In most
> cases the encoding result will be less than 1/10 of the safe buffer size, so 9/10 of the buffer is clflushed unnecessarily.
> 
>   A fast map_gtt implementation could be the best choice here.

What's slow about it?  Are you sure you're getting a WC mapping?  If
your MTRRs or PAT are messed up you may be getting a regular UC
mapping, which would be slow.  Also, you need to write the data
sequentially to get the benefits of WC.  If you write every other byte
or jump around (or, of course, read from the mapping), you'll flush the
WC buffer and slow things down.
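
To make the access pattern concrete, here is a minimal sketch of a
WC-friendly write loop.  The helper name wc_copy_sequential is
hypothetical (it is not part of libdrm or the kernel), and dst is only
assumed to point at a write-combined GTT mapping; ordinary memory
behaves the same functionally, just without the WC performance effects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of libdrm or the kernel): copy len bytes
 * from src into dst in strictly ascending address order, writing each
 * byte exactly once and never reading from dst.  dst is assumed to point
 * at a write-combined GTT mapping; any writable memory works for testing.
 */
static void wc_copy_sequential(volatile uint8_t *dst, const uint8_t *src,
                               size_t len)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    for (i = 0; i < len; i++)
        dst[i] = src[i];  /* store only, no read-modify-write on dst */
}
```

If the data has to be built up piecemeal, one option is to assemble it
in a normal cacheable buffer first and then push it through a loop like
this in a single pass, so the mapping itself only ever sees sequential
stores.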

-- 
Jesse Barnes, Intel Open Source Technology Center


