[Intel-gfx] gem clflush optimization for media encoding

Zou, Nanhai nanhai.zou at intel.com
Fri Jun 24 03:41:42 CEST 2011



>>-----Original Message-----
>>From: Jesse Barnes [mailto:jbarnes at virtuousgeek.org]
>>Sent: June 24, 2011 1:20
>>To: Zou, Nanhai
>>Cc: Keith Packard; intel-gfx at lists.freedesktop.org; Anholt, Eric
>>Subject: Re: [Intel-gfx] gem clflush optimization for media encoding
>>
>>On Wed, 22 Jun 2011 12:29:21 +0800
>>"Zou, Nanhai" <nanhai.zou at intel.com> wrote:
>>> 	map_gtt in current gem is super slow.
>>> 	I've tried map_gtt but it seems that the speed is unacceptable.
>>>
>>> >>> 	Since it is a CPU read-only surface, clflush is not needed at all.
>>> >>
>>> >>You'd still have to invalidate cache lines using clflush to avoid using
>>> >>stale data in the CPU cache.
>>> >>
>>> >>--
>>>   Yes, you are right; in this case clflush is still needed to invalidate the
>>> CPU cache.
>>>
>>>   The problem is that we do not know how large the coded output buffer is before
>>> we do the encoding.
>>>   So we have to allocate a large enough gem object before encoding. In most
>>> cases the encoding result will be less than 1/10 of the safe buffer size, so 9/10
>>> of the buffer is unnecessarily clflushed.
>>>
>>>   A fast map_gtt implementation could be the best choice here.
>>
>>What's slow about it?  Are you sure you're getting a WC mapping?  If
>>your MTRRs or PAT are messed up you may be getting a regular UC
>>mapping, which would be slow.  Also you need to write the data
>>sequentially to get the benefits of WC.  If you write every other byte
>>or jump around (and of course read) you'll flush the WC buffer and slow
>>things down.
>>
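
To put a rough number on the clflush cost discussed in the quoted text, here is a minimal sketch that only counts how many cache lines a byte range covers. It does not issue any clflush itself. The function name cache_lines_spanned is hypothetical, and the 64-byte line size and the buffer sizes in the usage note are assumptions chosen for illustration, not values from this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of cache lines touched by the byte range
 * [offset, offset + len). line_size is an assumption supplied by the
 * caller (64 bytes is typical on x86, but it is not stated in the thread).
 */
size_t cache_lines_spanned(size_t offset, size_t len, size_t line_size)
{
    assert(line_size > 0);
    if (len == 0)
        return 0;

    size_t first = offset / line_size;             /* line holding the first byte */
    size_t last = (offset + len - 1) / line_size;  /* line holding the last byte */
    return last - first + 1;
}
```

For example, with an assumed 1 MiB safe buffer and 64-byte lines, cache_lines_spanned(0, 1 << 20, 64) is 16384 lines, while a 100 KiB result needs only cache_lines_spanned(0, 102400, 64) = 1600 lines. The sketch only quantifies the waste; it does not address how the real output size would be learned before the flush.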

Yes, I have noticed that. It seems the uploaded data was written through a UC mapping.
We are trying to fix this.
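
For reference, the access pattern Jesse describes (write sequentially, never read back) might look like the sketch below. It is plain C on an ordinary pointer, so it illustrates the pattern only: it does not create a WC mapping, and it will not show the speed difference on normal cached memory. The function name upload_sequential is hypothetical, and dst stands in for whatever pointer the GTT mapping returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy len bytes from src to dst in strictly
 * ascending address order, without ever reading from dst.
 * dst is assumed to point at a write-combined mapping; the volatile
 * qualifier keeps the compiler from reordering or merging the stores
 * into a different access pattern.
 */
void upload_sequential(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    for (size_t i = 0; i < len; i++)
        d[i] = s[i];  /* write-only, front to back */
}
```

A strided loop (for example, writing every other byte) or any read of dst in the middle of the copy would be the pattern to avoid according to the quoted advice.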

Thanks
Zou Nanhai
>>--
>>Jesse Barnes, Intel Open Source Technology Center

