[Beignet] Combine Loads from __constant space

Song, Ruiling ruiling.song at intel.com
Tue Nov 25 19:04:03 PST 2014


I am not an expert on caches. Basically, the constant cache is part of the read-only cache, which lies in L3.
From the code in src/intel/intel_gpgpu.c, the logic below is for IvyBridge:
    if (slmMode)
        allocate 64KB constant cache
    else
        allocate 32KB constant cache

I am not sure whether there is any big performance difference between using less than or more than the real constant cache size in L3.
I simply picked an arbitrary number, 512KB, as the upper limit in the driver API.
But it would be worth investigating how performance changes with the amount of constant data used.
If we use more constant data than the constant cache allocated from L3,
I think it will definitely cause constant cache data to be swapped in and out frequently. Right?
If you would like to contribute any performance test to beignet, or any other open source test suite, it would be really appreciated!
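
As a rough illustration of the swap in/out concern above, here is a toy model in C. It is only a sketch under stated assumptions: a direct-mapped cache with 64-byte lines that is swept sequentially several times. The real L3 organization is very likely different, and count_misses and LINE_BYTES are hypothetical names, not part of Beignet.

```c
#include <stddef.h>
#include <stdlib.h>

#define LINE_BYTES 64 /* assumed cache line size for this toy model */

/* Count misses when `data_bytes` of constant data are read sequentially,
 * `passes` times, through a direct-mapped cache of `cache_bytes`.
 * Returns -1 if the cache has no lines or allocation fails. */
long count_misses(size_t cache_bytes, size_t data_bytes, int passes)
{
    size_t num_sets = cache_bytes / LINE_BYTES;
    size_t num_lines = (data_bytes + LINE_BYTES - 1) / LINE_BYTES;
    long misses = 0;
    long *tags;

    if (num_sets == 0)
        return -1;

    /* tags[s] is the data line currently held in set s, or -1 if empty */
    tags = malloc(num_sets * sizeof *tags);
    if (tags == NULL)
        return -1;
    for (size_t s = 0; s < num_sets; s++)
        tags[s] = -1;

    for (int p = 0; p < passes; p++) {
        for (size_t line = 0; line < num_lines; line++) {
            size_t set = line % num_sets;
            if (tags[set] != (long)line) {
                tags[set] = (long)line; /* evict whatever was there */
                misses++;
            }
        }
    }

    free(tags);
    return misses;
}
```

With a 32KB cache, count_misses(32 * 1024, 16 * 1024, 4) returns 256 (only the first touch of each line misses), while count_misses(32 * 1024, 64 * 1024, 4) returns 4096 (every access misses, because each line is evicted before it is read again). Whether the real hardware behaves this badly is exactly what a performance test would need to measure.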

Thanks!
Ruiling

From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of Tony Moore
Sent: Wednesday, November 26, 2014 6:45 AM
To: beignet at lists.freedesktop.org
Subject: Re: [Beignet] Combine Loads from __constant space

Another question I had about __constant: there seems to be no limit. I'm using __constant for every read-only parameter, now totalling 1500KB, and this test now runs in 32ms. So, is there a limit? Is this method reliable? Can the driver do this implicitly for all read-only buffers?
thanks

On Tue Nov 25 2014 at 2:11:26 PM Tony Moore <tonywmnix at gmail.com> wrote:
Hello,
I notice that reads are not being combined when I use __constant on a read-only kernel buffer. Is this something that can be improved?

In my kernel there are many loads from a read-only data structure. When I use the __global specifier for the memory space I see a total of 33 send instructions and a runtime of 81ms. When I use the __constant specifier, I see 43 send instructions and a runtime of 40ms. I'm hoping that combining the loads could improve performance further.

thanks!
tony