[Mesa-dev] [PATCH 0/8] Gallium & RadeonSI optimization for Ryzen CPUs
Marek Olšák
maraeo at gmail.com
Fri Sep 7 19:37:10 UTC 2018
I'm changing the initial L3 cache number to this:
+static unsigned L3_cache_number;
+static once_flag init_cache_number_flag = ONCE_FLAG_INIT;
+
+static void
+util_init_cache_number(void)
+{
+   /* Get a semi-random number. */
+   int64_t t = os_time_get_nano();
+   L3_cache_number = (t ^ (t >> 8) ^ (t >> 16));
+}
+
Then the next unassigned CPU thread group will be set to:
+ call_once(&init_cache_number_flag, util_init_cache_number);
+ cache = p_atomic_inc_return(&L3_cache_number) % num_L3_caches;
Marek
On Fri, Sep 7, 2018 at 3:01 PM, Marek Olšák <maraeo at gmail.com> wrote:
> On Fri, Sep 7, 2018 at 11:04 AM, Michel Dänzer <michel at daenzer.net> wrote:
>> On 2018-09-07 4:31 p.m., Marek Olšák wrote:
>>> On Fri, Sep 7, 2018, 4:34 AM Michel Dänzer <michel at daenzer.net> wrote:
>>>> On 2018-09-06 10:56 p.m., Axel Davy wrote:
>>>>
>>>>> I fear if we begin to do the work manually, there won't be interest to
>>>>> do that in the kernel,
>>>>> and thus all applications will need to include such core pinning code to
>>>>> have good performance when
>>>>> multithreaded.
>>>>
>>>> I'm also a bit worried that this solution could result in multiple
>>>> processes contending for the same set of CPU cores, while other cores
>>>> might be underused, which could result in worse overall system performance.
>>>
>>> Any suggestion how to choose the ccx such that processes end up on a
>>> different one?
>>
>> One thing you could do is use a random initial offset. That should at
>> least avoid e.g. most applications using the same toolkit (which may do
>> OpenGL calls, even if the application itself doesn't) choosing the same one.
>
> I'll update the helper function to choose the initial CCX with
> (os_time_get_nano() % num_L3_caches). That should be random.
>
>>
>>
>>> I don't think the performance can be worse than it is right now.
>>
>> In the worst case, all processes using OpenGL (or at least their OpenGL
>> related threads, but that usually includes the main thread) could end up
>> restricted to the same 4 cores, leaving up to 28 cores underused.
>
> 4C/4T used to be the standard and was certainly enough for gaming.
> 4C/8T, which is now a single CCX, used to be a luxury before Ryzen.
> We should be fine with 4 cores.
>
> Marek