[Intel-gfx] [PATCH] drm/i915/icl: Fix context slice count configuration
Lionel Landwerlin
lionel.g.landwerlin at intel.com
Wed Aug 22 17:21:31 UTC 2018
On 22/08/2018 18:07, Tvrtko Ursulin wrote:
>
> On 22/08/2018 17:33, Lionel Landwerlin wrote:
>> On 22/08/2018 17:18, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> Bitfield width for configuring the active slice count has grown in
>>> Gen11
>>> so we need to program the GEN8_R_PWR_CLK_STATE accordingly.
>>>
>>> Current code was always requesting eight times the number of slices
>>> (due
>>> writting to a bitfield starting three bits higher than it should).
>>> These
>>> requests were luckily a) capped by the hardware to the available
>>> number of
>>> slices, and b) we haven't yet exported the code to ask for reduced
>>> slice
>>> configurations.
>>>
>>> Due both of the above there was no impact from this incorrect
>>> programming
>>> but we should still fix it.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Bspec: 12247
>>> Reported-by: tony.ye at intel.com
>>> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>> Cc: tony.ye at intel.com
>>> ---
>>> drivers/gpu/drm/i915/i915_reg.h | 2 ++
>>> drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++----
>>> 2 files changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>>> b/drivers/gpu/drm/i915/i915_reg.h
>>> index 59d06d0055bb..640f7b774a26 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -344,6 +344,8 @@ static inline bool
>>> i915_mmio_reg_valid(i915_reg_t reg)
>>> #define GEN8_RPCS_S_CNT_ENABLE (1 << 18)
>>> #define GEN8_RPCS_S_CNT_SHIFT 15
>>> #define GEN8_RPCS_S_CNT_MASK (0x7 << GEN8_RPCS_S_CNT_SHIFT)
>>> +#define GEN11_RPCS_S_CNT_SHIFT 12
>>> +#define GEN11_RPCS_S_CNT_MASK (0x3f <<
>>> GEN11_RPCS_S_CNT_SHIFT)
>>> #define GEN8_RPCS_SS_CNT_ENABLE (1 << 11)
>>> #define GEN8_RPCS_SS_CNT_SHIFT 8
>>> #define GEN8_RPCS_SS_CNT_MASK (0x7 <<
>>> GEN8_RPCS_SS_CNT_SHIFT)
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 36050f085071..43b8b0675ba0 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -2501,10 +2501,14 @@ make_rpcs(struct drm_i915_private *dev_priv)
>>> * enablement.
>>> */
>>> if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
>>> - rpcs |= GEN8_RPCS_S_CNT_ENABLE;
>>> - rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) <<
>>> - GEN8_RPCS_S_CNT_SHIFT;
>>> - rpcs |= GEN8_RPCS_ENABLE;
>>> + rpcs = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
>>> +
>>> + if (INTEL_GEN(dev_priv) >= 11)
>>> + rpcs <<= GEN11_RPCS_S_CNT_SHIFT;
>>> + else
>>> + rpcs <<= GEN8_RPCS_S_CNT_SHIFT;
>>> +
>>> + rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE;
>>
>>
>> I don't know if you saw that wording in the documentation :
>>
>> "
>>
>> Note: In ICL, software programs this register as if GT consists of 2
>> slices with 4 subslices in each slice. Hardware maps this to the 1
>> slice/8-subslice physical layout.
>>
>> "
>>
>>
>> My understanding is that it would make this function a bit more
>> complicated ;)
>>
>> It also implies that on ICL you cannot select 3 subslices, which is
>> unfortunately what Tony was trying to do.
>>
>> Maybe some opens need to be raised as to what's possible on ICL.
>
> I interpreted the note in my head as "In ICL, _if_ _the_ software
> programs.." so did not see a problem. Thought that would be just some
> hidden/under the covers remapping hw would do. But I see now that was
> wrong, and you are most likely right. I'll try to do some digging to
> understand this better.
>
> But for the second part of it, I don't see why 1x3 configuration would
> be illegal. If software must assume hw is 2x4, even if in reality it
> is 1x8, then 1x3 would still be legal, no?
We still seem to put a subslice number in the register for ICL (values
being 0b001, 0b010, 0b011 & 0b100).
My understanding is that the hardware will just multiply that value by 2
to map to the 1x8 underlying topology.
So if that's the case, you can't really do odd numbers... ¯\_(ツ)_/¯
>
> I thought the cause of the hang on ICL was that when Tony was asking
> for 1x3, the code was actually programming a request for 8x3 - which
> is illegal (as in slice count must be 1 to enable subslice pg) and so
> would fail to actually turn of the unwanted subslices.
>
> But then I also though on ICL we deal with masks and not counts when
> programming the hardware. Since apparently it is counts both for
> slices and subslices, I am mightily confused as to how media-driver
> would even theoretically be able to turn off a _specific_ (sub)slice?!
The fact that the feature needed isn't implemented at by the thread
dispatcher is really strange to me too.
It sounds like we're forced to use a bigger hammer than what we really need.
As to how that maps to the right subslices is also unknown to me.
The only explanation I have is that subslices with no VME samplers get
turn off first in the list of subslices to turn off.
Cheers,
-
Lionel
>
> Regards,
>
> Tvrtko
>
>>
>>> }
>>> if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
>>
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20180822/210fdd7e/attachment.html>
More information about the Intel-gfx
mailing list