[Intel-gfx] gem clflush optimization for media encoding
Zou, Nanhai
nanhai.zou at intel.com
Fri Jun 24 03:41:42 CEST 2011
>>-----Original Message-----
>>From: Jesse Barnes [mailto:jbarnes at virtuousgeek.org]
>>Sent: June 24, 2011 1:20
>>To: Zou, Nanhai
>>Cc: Keith Packard; intel-gfx at lists.freedesktop.org; Anholt, Eric
>>Subject: Re: [Intel-gfx] gem clflush optimization for media encoding
>>
>>On Wed, 22 Jun 2011 12:29:21 +0800
>>"Zou, Nanhai" <nanhai.zou at intel.com> wrote:
>>> map_gtt in current gem is super slow.
>>> I've tried map_gtt but it seems that the speed is unacceptable.
>>>
>>> >>> Since it is a CPU read-only surface, clflush is not needed at all.
>>> >>
>>> >>You'd still have to invalidate cache lines using clflush to avoid using
>>> >>stale data in the CPU cache.
>>> >>
>>> >>--
>>> Yes, you are right; in this case clflush is still needed to invalidate the
>>> CPU cache.
>>>
>>> The problem is that we do not know how large the coded output buffer is before
>>> we do the encoding.
>>> So we have to allocate a large enough gem object before encoding. In most
>>> cases the encoding result will be less than 1/10 of the safe buffer size, so 9/10
>>> of the buffer is clflushed unnecessarily.
>>>
>>> A fast map_gtt implementation could be the best choice here.
>>
>>What's slow about it? Are you sure you're getting a WC mapping? If
>>your MTRRs or PAT are messed up you may be getting a regular UC
>>mapping, which would be slow. Also you need to write the data
>>sequentially to get the benefits of WC. If you write every other byte
>>or jump around (and of course read) you'll flush the WC buffer and slow
>>things down.
>>
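The access pattern described above can be illustrated with a minimal sketch. It assumes the destination pointer refers to a write-combining mapping (for example one obtained through map_gtt); the function name `upload_sequential` and its parameters are hypothetical and not part of GEM or libdrm. The copy only writes, in strictly ascending address order, and never reads the destination back.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy len bytes from src into a destination that is
 * assumed to be a WC mapping. The destination is written in ascending
 * address order and is never read, so consecutive stores can be merged in
 * the CPU's write-combining buffers. A real implementation would likely
 * use wider stores; byte stores keep the sketch simple.
 */
static void upload_sequential(void *wc_dst, const void *src, size_t len)
{
    volatile uint8_t *d = wc_dst;
    const uint8_t *s = src;

    for (size_t i = 0; i < len; i++)
        d[i] = s[i]; /* write-only, ascending addresses */
}
```

The sketch only shows the ordering of the stores; whether the mapping is really WC rather than UC still depends on the MTRR/PAT setup mentioned above.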
Yes, I have noticed that, seems that the uploaded data was written through uc mapping.
We are trying to fix this.
Thanks
Zou Nanhai
>>--
>>Jesse Barnes, Intel Open Source Technology Center
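To put a number on the oversized coded buffer discussed in the thread, here is a rough sketch of the cache-line arithmetic involved. It assumes a 64-byte cache line, and the helper `lines_to_flush` is hypothetical; it is not a GEM interface and only counts how many lines a clflush loop over a byte range would have to touch.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u /* assumed line size; real code should query the CPU */

/*
 * Hypothetical helper: number of cache lines that overlap the byte range
 * [addr, addr + len). A clflush loop needs one clflush per such line.
 */
static size_t lines_to_flush(uintptr_t addr, size_t len)
{
    if (len == 0)
        return 0;

    uintptr_t mask  = ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t first = addr & mask;             /* line holding the first byte */
    uintptr_t last  = (addr + len - 1) & mask; /* line holding the last byte */

    return (size_t)((last - first) / CACHE_LINE) + 1;
}
```

Under these assumptions a 1 MiB buffer covers 16384 lines, while a 100 KiB encoded result at the start of it covers only 1600, which is the roughly 9/10 saving mentioned in the quoted message if only the used range were flushed.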