[PATCH v4 00/15] CCS static load balance

Mehmood, Arshad arshad.mehmood at intel.com
Thu Mar 27 15:10:46 UTC 2025


I’d like to provide additional context regarding the necessity of these patches.
The shift from dynamic load balancing mode to fixed mode, with CCS usage restricted to a single unit, has led to a notable performance regression, with workloads experiencing an approximately 10% FPS drop.

For example, on DG2, the ResNet-50 inference benchmark previously achieved ~10,500 FPS in dynamic load balancing mode. After CCS was limited to 1 unit in fixed mode, performance dropped to ~9,200 FPS. With these patches, enabling all 4 CCS units via sysfs (still in fixed mode) restores performance to nearly 10,500 FPS, effectively matching the previous dynamic mode results.
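
As a quick sanity check on the ResNet-50 figures above, the relative drop can be computed directly. This is only a sketch using the approximate FPS values quoted in this message; the function and variable names are illustrative and not part of the patch series.

```python
def relative_drop(before_fps: float, after_fps: float) -> float:
    """Return the fractional drop going from before_fps to after_fps."""
    return (before_fps - after_fps) / before_fps


# Approximate figures quoted in this message (DG2, ResNet-50 inference).
dynamic_fps = 10_500     # dynamic load balancing mode
fixed_1ccs_fps = 9_200   # fixed mode, CCS limited to 1

drop = relative_drop(dynamic_fps, fixed_1ccs_fps)
print(f"{drop:.1%}")  # prints 12.4%
```

For this particular benchmark the drop works out to a little over 12%, which is in the same range as the approximately 10% regression mentioned above.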

Customers expect prior performance levels to be maintained, so these patches are essential to ensure that workloads using multiple CCS units do not degrade unnecessarily. The proposed sysfs interface makes the CCS count configurable, allowing controlled re-enablement of all 4 CCS units while keeping fixed mode intact. Since fixed mode is now in use, a configurable approach gives us the flexibility to handle other scenarios as they arise.

Let me know if you need further details.

Regards,
Arshad
________________________________
From: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
Sent: Thursday, March 27, 2025 2:49 PM
To: Andi Shyti <andi.shyti at linux.intel.com>
Cc: Andi Shyti <andi.shyti at linux.intel.com>; dri-devel <dri-devel at lists.freedesktop.org>; intel-gfx <intel-gfx at lists.freedesktop.org>; Tvrtko Ursulin <tursulin at ursulin.net>; Chris Wilson <chris.p.wilson at linux.intel.com>; Simona Vetter <simona.vetter at ffwll.ch>; Mehmood, Arshad <arshad.mehmood at intel.com>; Mrozek, Michal <michal.mrozek at intel.com>; Andi Shyti <andi.shyti at kernel.org>; Ayyalasomayajula, Usharani <usharani.ayyalasomayajula at intel.com>
Subject: Re: [PATCH v4 00/15] CCS static load balance

Quoting Andi Shyti (2025-03-25 12:52:58)
> On Tue, Mar 25, 2025 at 10:24:42AM +0200, Joonas Lahtinen wrote:

<SNIP>

> > Do you have a reference to some GitLab issues or maybe some external
> > project issues where regressions around here are discussed?
>
> AFAIK, there's no GitLab issue for this because we're not fixing
> a bug here; we're adding a new sysfs interface.

This sysfs interface was exactly designed to address performance
regressions coming from limiting the number of CCS to 1.

So unless we have a specific workload and end-user reporting a
regression on it, there's no incentive to spend any further time here.

<SNIP>

> Arshad and Usha can definitely help if there are any technical
> questions about how the application uses the interface.

I don't have any technical questions as I specified the interface
initially :)

This is not about open technical questions on how the interface works.
To recap, when we initially implemented the 1CCS mode, we got active
feedback from the community on regressions.

We were careful to verify that all userspace would cleanly fall back to
using 1CCS mode after it was implemented. And indeed, nobody has been
asking for the 4CCS mode back after the 1CCS mode bugs were fixed.

So as far as I see it, there are no users for this interface in
upstream, and thus we should not spend the time on it.

Regards, Joonas