[PATCH v4 0/2] Support/debug for slow GuC loads
John Harrison
john.c.harrison at intel.com
Tue Apr 9 00:09:11 UTC 2024
On 4/4/2024 11:25, Lucas De Marchi wrote:
> On Tue, Feb 27, 2024 at 05:09:54PM -0800, John.C.Harrison at Intel.com
> wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> Sometimes the GuC load is slower than it should be. For end users,
>> that usually means some kind of thermal throttling issue. Internally,
>> there can be any number of bugs that cause it. So don't completely
>> fail to load, just cope with it and report the problem.
>>
>> v2: Revert include order (review feedback from Lucas)
>> v3: Remove '_sysfs' from throttle file names and keep limit query in
>> the same file rather than moving elsewhere (review feedback from
>> Rodrigo). Fix the reporting of requested vs granted frequencies
>> (review feedback from Badal).
>> v4: Manually code the loop timeout/condition checking because helper
>> functions are not allowed (review feedback from Lucas/Rodrigo)
>
> Wrong reason. It's not that helper functions are not allowed. Rather
> *this* particular helper was considered bad and counterproductive.
>
> For similar reasons as e.g. Linus commented recently on bcachefs moving
> some functions to be shared:
>
> https://lore.kernel.org/all/CAHk-=wg3djFJMeN3L_zx3P-6eN978Y1JTssxy81RhAbxB==L8Q@mail.gmail.com/
>
Not seeing how this compares. Linus' complaint is about some algorithmic
decisions that he disagrees with. It sounds like quite a large chunk of
code that is doing fundamentally wrong (or at least unnecessary) things.
Whereas this is simply abstracting timeout functionality for a generic
wait. I have no problems with wanting to have a more specific helper for
99% of use cases that are a specific but common pattern. But for those
few cases that do not fit that specific pattern, having a more generic
wait helper is hardly creating 'disgusting and completely nonsensical
interfaces'. Certainly the comment 'But the main dealbreaker is the
insane math.' does not apply to a simple wait helper.
>
> We'd need to spend much more time cleaning it up and making it a good
> interface rather than copying what we have in i915 and stuffing it in a
Not exactly sure what needs large amounts of time to clean up? It would
simply be the existing xe_mmio_wait32 function but with the "read =
xe_mmio_read(reg); if(read == val) break;" replaced with a callback.
Indeed the xe_mmio_wait32 function itself would just be a wrapper around
the generic wait helper that passes in the read/if as the callback.
Everything else is identical to what we already have and apparently
consider clean and a good interface.
Apart from the atomic part, that is, which is apparently hideous and
broken according to earlier comments but still made it into the Xe
rewrite anyway. And that is the underlying wait helper part, not related
to any interfaces around the test itself.
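To make the shape of that concrete, here is a rough userspace sketch of a
callback-based wait with a register-compare wrapper on top. It is not the
Xe code: wait_for_cond(), mmio_wait32_sketch(), the argument struct and
the plain-memory "register" are all hypothetical names invented for
illustration, and the backoff values are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type: returns true once the awaited condition holds. */
typedef bool (*wait_cond_fn)(void *data);

static int64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

static void sleep_us(int64_t us)
{
	struct timespec ts = {
		.tv_sec = us / 1000000,
		.tv_nsec = (us % 1000000) * 1000,
	};

	nanosleep(&ts, NULL);
}

/*
 * Generic wait (sketch): poll cond() with exponential backoff until it
 * returns true or timeout_us elapses. Returns 0 on success, -1 on timeout.
 */
static int wait_for_cond(wait_cond_fn cond, void *data, int64_t timeout_us)
{
	int64_t end = now_us() + timeout_us;
	int64_t wait = 10; /* assumed initial backoff, in microseconds */

	assert(cond);

	for (;;) {
		if (cond(data))
			return 0;

		int64_t remaining = end - now_us();

		if (remaining <= 0)
			break;
		if (wait > remaining)
			wait = remaining;

		sleep_us(wait);
		wait <<= 1;
	}

	/* One last check in case the final sleep overshot the deadline. */
	return cond(data) ? 0 : -1;
}

/* Arguments for the register-compare callback (all names hypothetical). */
struct mmio_wait32_args {
	const volatile uint32_t *reg; /* stand-in for an MMIO register */
	uint32_t mask;
	uint32_t val;
	uint32_t last; /* last value read, handed back to the caller */
};

static bool mmio_match(void *data)
{
	struct mmio_wait32_args *a = data;

	a->last = *a->reg;
	return (a->last & a->mask) == a->val;
}

/* The specific helper becomes a thin wrapper around the generic one. */
static int mmio_wait32_sketch(const volatile uint32_t *reg, uint32_t mask,
			      uint32_t val, int64_t timeout_us,
			      uint32_t *out_val)
{
	struct mmio_wait32_args a = { .reg = reg, .mask = mask, .val = val };
	int ret = wait_for_cond(mmio_match, &a, timeout_us);

	if (out_val)
		*out_val = a.last;
	return ret;
}
```

The common case keeps its specific entry point, while a caller with an
unusual condition (e.g. the GuC load status decode) would pass its own
callback to the generic wait instead of open coding the timeout loop.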
> *utils.[hc]. In the past it turned out there were not real good reasons
> for abstracting it and making it generic for all the contexts the caller
> may be on.
That is a failing of the usage, not the helper.
With great power...
John.
>
> Lucas De Marchi
>
>>
>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>
>>
>> John Harrison (2):
>> drm/xe: Make read_perf_limit_reasons globally accessible
>> drm/xe/guc: Port over the slow GuC loading support from i915
>>
>> drivers/gpu/drm/xe/Makefile | 2 +-
>> drivers/gpu/drm/xe/abi/guc_errors_abi.h | 26 +-
>> drivers/gpu/drm/xe/regs/xe_guc_regs.h | 2 +
>> drivers/gpu/drm/xe/xe_gt_freq.c | 4 +-
>> ...e_gt_throttle_sysfs.c => xe_gt_throttle.c} | 26 +-
>> drivers/gpu/drm/xe/xe_gt_throttle.h | 17 ++
>> drivers/gpu/drm/xe/xe_gt_throttle_sysfs.h | 16 --
>> drivers/gpu/drm/xe/xe_guc.c | 226 ++++++++++++++----
>> drivers/gpu/drm/xe/xe_mmio.c | 61 +++++
>> drivers/gpu/drm/xe/xe_mmio.h | 2 +
>> 10 files changed, 307 insertions(+), 75 deletions(-)
>> rename drivers/gpu/drm/xe/{xe_gt_throttle_sysfs.c =>
>> xe_gt_throttle.c} (86%)
>> create mode 100644 drivers/gpu/drm/xe/xe_gt_throttle.h
>> delete mode 100644 drivers/gpu/drm/xe/xe_gt_throttle_sysfs.h
>>
>> --
>> 2.43.0
>>