[Intel-gfx] [PATCH] drm/i915/icl: Fix context slice count configuration
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Wed Aug 29 12:11:56 UTC 2018
On 29/08/2018 13:09, Tvrtko Ursulin wrote:
>
> On 29/08/2018 13:02, Tvrtko Ursulin wrote:
>>
>> On 29/08/2018 12:07, Lionel Landwerlin wrote:
>>> On 29/08/2018 11:54, Tvrtko Ursulin wrote:
>>>>
>>>> On 22/08/2018 17:33, Lionel Landwerlin wrote:
>>>>> On 22/08/2018 17:18, Tvrtko Ursulin wrote:
>>>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>>
>>>>>> Bitfield width for configuring the active slice count has grown in
>>>>>> Gen11
>>>>>> so we need to program the GEN8_R_PWR_CLK_STATE accordingly.
>>>>>>
>>>>>> Current code was always requesting eight times the number of
>>>>>> slices (due
>>>>>> writting to a bitfield starting three bits higher than it should).
>>>>>> These
>>>>>> requests were luckily a) capped by the hardware to the available
>>>>>> number of
>>>>>> slices, and b) we haven't yet exported the code to ask for reduced
>>>>>> slice
>>>>>> configurations.
>>>>>>
>>>>>> Due both of the above there was no impact from this incorrect
>>>>>> programming
>>>>>> but we should still fix it.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>>> Bspec: 12247
>>>>>> Reported-by: tony.ye at intel.com
>>>>>> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>>> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>>>>> Cc: tony.ye at intel.com
>>>>>> ---
>>>>>> drivers/gpu/drm/i915/i915_reg.h | 2 ++
>>>>>> drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++----
>>>>>> 2 files changed, 10 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>>>>>> b/drivers/gpu/drm/i915/i915_reg.h
>>>>>> index 59d06d0055bb..640f7b774a26 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>>>>> @@ -344,6 +344,8 @@ static inline bool
>>>>>> i915_mmio_reg_valid(i915_reg_t reg)
>>>>>> #define GEN8_RPCS_S_CNT_ENABLE (1 << 18)
>>>>>> #define GEN8_RPCS_S_CNT_SHIFT 15
>>>>>> #define GEN8_RPCS_S_CNT_MASK (0x7 <<
>>>>>> GEN8_RPCS_S_CNT_SHIFT)
>>>>>> +#define GEN11_RPCS_S_CNT_SHIFT 12
>>>>>> +#define GEN11_RPCS_S_CNT_MASK (0x3f <<
>>>>>> GEN11_RPCS_S_CNT_SHIFT)
>>>>>> #define GEN8_RPCS_SS_CNT_ENABLE (1 << 11)
>>>>>> #define GEN8_RPCS_SS_CNT_SHIFT 8
>>>>>> #define GEN8_RPCS_SS_CNT_MASK (0x7 <<
>>>>>> GEN8_RPCS_SS_CNT_SHIFT)
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>>>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>>>>> index 36050f085071..43b8b0675ba0 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>>>> @@ -2501,10 +2501,14 @@ make_rpcs(struct drm_i915_private *dev_priv)
>>>>>> * enablement.
>>>>>> */
>>>>>> if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
>>>>>> - rpcs |= GEN8_RPCS_S_CNT_ENABLE;
>>>>>> - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) <<
>>>>>> - GEN8_RPCS_S_CNT_SHIFT;
>>>>>> - rpcs |= GEN8_RPCS_ENABLE;
>>>>>> + rpcs = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
>>>>>> +
>>>>>> + if (INTEL_GEN(dev_priv) >= 11)
>>>>>> + rpcs <<= GEN11_RPCS_S_CNT_SHIFT;
>>>>>> + else
>>>>>> + rpcs <<= GEN8_RPCS_S_CNT_SHIFT;
>>>>>> +
>>>>>> + rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE;
>>>>>
>>>>>
>>>>> I don't know if you saw that wording in the documentation :
>>>>>
>>>>> "
>>>>>
>>>>> Note: In ICL, software programs this register as if GT consists of
>>>>> 2 slices with 4 subslices in each slice. Hardware maps this to the
>>>>> 1 slice/8-subslice physical layout.
>>>>>
>>>>> "
>>>>>
>>>>>
>>>>> My understanding is that it would make this function a bit more
>>>>> complicated ;)
>>>>>
>>>>> It also implies that on ICL you cannot select 3 subslices, which is
>>>>> unfortunately what Tony was trying to do.
>>>>>
>>>>> Maybe some opens need to be raised as to what's possible on ICL.
>>>>
>>>> Happy to r-b this one since we clarified it is fine?
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>>>
>>>>
>>> I think there is still an issue here with regard to the subslice
>>> programming.
>>> You can only program values in [1, 4].
>>> But because we expose up to 8 subslices, you need to take that mask
>>> and alter the slice mask based on its value.
>>
>> Hmm true, thanks. I think we need to somehow express this restriction
>> on the uAPI level rather than silently mask it. Shall we reject
>> attempts to configure subslice mask if hweight(mask) > max_subslices /
>> 2 for ICL-LP?
>
> No wait, I was in the dynamic sseu world, but this also applies to the
> current code base..
>
> When we program SScount in drm-tip with the hweight of 8 we overflow the
> bitfield already and splat over SScountEn. Luckily with no ill effect
> since we set it anyway.
>
> So the fix needs to be in make_rpcs today..
>
> if (GEN11_LP) {
> if (hweight(slice_mask) == 1)
&& hweight(subslice_mask) < max_subslices
So to leave the default full config alone.
> subslice_count /= 2;
> else
> subslice_en = 0;
> }
>
> I think..
>
> Regards,
>
> Tvrtko
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
More information about the Intel-gfx
mailing list