[Intel-gfx] [PATCH 4/7] drm/i915/perf: lock powergating configuration to default when active
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Thu Sep 6 10:36:29 UTC 2018
On 06/09/2018 11:22, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2018-09-06 11:18:01)
>> On 06/09/2018 11:10, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2018-09-06 10:57:47)
>>>> On 05/09/2018 15:22, Tvrtko Ursulin wrote:
>>>>> From: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>>
>>>>> If some of the contexts submitting workloads to the GPU have been
>>>>> configured to shutdown slices/subslices, we might loose the NOA
>>>>> configurations written in the NOA muxes.
>>>>>
>>>>> One possible solution to this problem is to reprogram the NOA muxes
>>>>> when we switch to a new context. We initially tried this in the
>>>>> workaround batchbuffer but some concerns where raised about the cost
>>>>> of reprogramming at every context switch. This solution is also not
>>>>> without consequences from the userspace point of view. Reprogramming
>>>>> of the muxes can only happen once the powergating configuration has
>>>>> changed (which happens after context switch). This means for a window
>>>>> of time during the recording, counters recorded by the OA unit might
>>>>> be invalid. This requires userspace dealing with OA reports to discard
>>>>> the invalid values.
>>>>>
>>>>> Minimizing the reprogramming could be implemented by tracking of the
>>>>> last programmed configuration somewhere in GGTT and use MI_PREDICATE
>>>>> to discard some of the programming commands, but the command streamer
>>>>> would still have to parse all the MI_LRI instructions in the
>>>>> workaround batchbuffer.
>>>>>
>>>>> Another solution, which this change implements, is to simply disregard
>>>>> the user requested configuration for the period of time when i915/perf
>>>>> is active. There is no known issue with this apart from a performance
>>>>> penality for some media workloads that benefit from running on a
>>>>> partially powergated GPU. We already prevent RC6 from affecting the
>>>>> programming so it doesn't sound completely unreasonable to hold on
>>>>> powergating for the same reason.
>>>>>
>>>>> v2: Leave RPCS programming in intel_lrc.c (Lionel)
>>>>>
>>>>> v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel)
>>>>> More to_intel_context() (Tvrtko)
>>>>> s/dev_priv/i915/ (Tvrtko)
>>>>>
>>>>> Tvrtko Ursulin:
>>>>>
>>>>> v4:
>>>>> * Rebase for make_rpcs changes.
>>>>>
>>>>> v5:
>>>>> * Apply OA restriction from make_rpcs directly.
>>>>>
>>>>> v6:
>>>>> * Rebase for context image setup changes.
>>>>>
>>>>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_perf.c | 5 +++++
>>>>> drivers/gpu/drm/i915/intel_lrc.c | 30 ++++++++++++++++++++----------
>>>>> drivers/gpu/drm/i915/intel_lrc.h | 3 +++
>>>>> 3 files changed, 28 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>>>> index ccb20230df2c..dd65b72bddd4 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>>>> @@ -1677,6 +1677,11 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
>>>>>
>>>>> CTX_REG(reg_state, state_offset, flex_regs[i], value);
>>>>> }
>>>>> +
>>>>> + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
>>>>> + gen8_make_rpcs(dev_priv,
>>>>> + &to_intel_context(ctx,
>>>>> + dev_priv->engine[RCS])->sseu));
>>>> I think there is one issue I missed on the previous iterations of this
>>>> patch.
>>>>
>>>> This gen8_update_reg_state_unlocked() is called when the GPU is parked
>>>> on the kernel context.
>>>>
>>>> It's supposed to update all contexts, but I think we might not be able
>>>> to update the kernel context image while the GPU is using it.
>>> The kernel context is only ever taken in extremis (you are either
>>> parking or stalling userspace) so I don't care.
>>
>> The patch exposing the RPCS configuration to userspace will make use of
>> the kernel context while OA/perf is enabled. Even if it reprograms the
>> locked value that will break the power configuration stability on Gen11
>> (because the locked configuration will be different from the kernel
>> context configuration).
> Sure, but as you point out that's only on changing configuration.
>
> What's missing in the patch is that we only bail early if the new sseu
> matches the ce->sseu, but that doesn't necessarily match whats in the
> context due to OA. (Or maybe I missed the conversion to rpcs value and
> checking.)
> -Chris
>
Yep, because the gen8_make_rpcs() post processes the values store at the
gem context level, we risk rerunning the kernel context to write the
exiting value.
Sorry this is all so messy :(
-
Lionel
More information about the Intel-gfx
mailing list