[Intel-gfx] [PATCH] drm/i915/icl: Fix context slice count configuration

Lionel Landwerlin lionel.g.landwerlin at intel.com
Wed Aug 22 17:21:31 UTC 2018


On 22/08/2018 18:07, Tvrtko Ursulin wrote:
>
> On 22/08/2018 17:33, Lionel Landwerlin wrote:
>> On 22/08/2018 17:18, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>
>>> Bitfield width for configuring the active slice count has grown in 
>>> Gen11
>>> so we need to program the GEN8_R_PWR_CLK_STATE accordingly.
>>>
>>> Current code was always requesting eight times the number of slices 
>>> (due
>>> writting to a bitfield starting three bits higher than it should). 
>>> These
>>> requests were luckily a) capped by the hardware to the available 
>>> number of
>>> slices, and b) we haven't yet exported the code to ask for reduced 
>>> slice
>>> configurations.
>>>
>>> Due both of the above there was no impact from this incorrect 
>>> programming
>>> but we should still fix it.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Bspec: 12247
>>> Reported-by: tony.ye at intel.com
>>> Suggested-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>> Cc: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>>> Cc: tony.ye at intel.com
>>> ---
>>>   drivers/gpu/drm/i915/i915_reg.h  |  2 ++
>>>   drivers/gpu/drm/i915/intel_lrc.c | 12 ++++++++----
>>>   2 files changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h 
>>> b/drivers/gpu/drm/i915/i915_reg.h
>>> index 59d06d0055bb..640f7b774a26 100644
>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>> @@ -344,6 +344,8 @@ static inline bool 
>>> i915_mmio_reg_valid(i915_reg_t reg)
>>>   #define   GEN8_RPCS_S_CNT_ENABLE    (1 << 18)
>>>   #define   GEN8_RPCS_S_CNT_SHIFT        15
>>>   #define   GEN8_RPCS_S_CNT_MASK        (0x7 << GEN8_RPCS_S_CNT_SHIFT)
>>> +#define   GEN11_RPCS_S_CNT_SHIFT    12
>>> +#define   GEN11_RPCS_S_CNT_MASK        (0x3f << 
>>> GEN11_RPCS_S_CNT_SHIFT)
>>>   #define   GEN8_RPCS_SS_CNT_ENABLE    (1 << 11)
>>>   #define   GEN8_RPCS_SS_CNT_SHIFT    8
>>>   #define   GEN8_RPCS_SS_CNT_MASK        (0x7 << 
>>> GEN8_RPCS_SS_CNT_SHIFT)
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 36050f085071..43b8b0675ba0 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -2501,10 +2501,14 @@ make_rpcs(struct drm_i915_private *dev_priv)
>>>        * enablement.
>>>       */
>>>       if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
>>> -        rpcs |= GEN8_RPCS_S_CNT_ENABLE;
>>> -        rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) <<
>>> -            GEN8_RPCS_S_CNT_SHIFT;
>>> -        rpcs |= GEN8_RPCS_ENABLE;
>>> +        rpcs = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
>>> +
>>> +        if (INTEL_GEN(dev_priv) >= 11)
>>> +            rpcs <<= GEN11_RPCS_S_CNT_SHIFT;
>>> +        else
>>> +            rpcs <<= GEN8_RPCS_S_CNT_SHIFT;
>>> +
>>> +        rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_S_CNT_ENABLE;
>>
>>
>> I don't know if you saw that wording in the documentation :
>>
>>   "
>>
>> Note: In ICL, software programs this register as if GT consists of 2 
>> slices with 4 subslices in each slice. Hardware maps this to the 1 
>> slice/8-subslice physical layout.
>>
>> "
>>
>>
>> My understanding is that it would make this function a bit more 
>> complicated ;)
>>
>> It also implies that on ICL you cannot select 3 subslices, which is 
>> unfortunately what Tony was trying to do.
>>
>> Maybe some opens need to be raised as to what's possible on ICL.
>
> I interpreted the note in my head as "In ICL, _if_ _the_ software 
> programs.." so did not see a problem. Thought that would be just some 
> hidden/under the covers remapping hw would do. But I see now that was 
> wrong, and you are most likely right. I'll try to do some digging to 
> understand this better.
>
> But for the second part of it, I don't see why 1x3 configuration would 
> be illegal. If software must assume hw is 2x4, even if in reality it 
> is 1x8, then 1x3 would still be legal, no?


We still seem to put a subslice number in the register for ICL (values 
being 0b001, 0b010, 0b011 & 0b100).

My understanding is that the hardware will just multiply that value by 2 
to map to the 1x8 underlying topology.

So if that's the case, you can't really do odd numbers... ¯\_(ツ)_/¯


>
> I thought the cause of the hang on ICL was that when Tony was asking 
> for 1x3, the code was actually programming a request for 8x3 - which 
> is illegal (as in slice count must be 1 to enable subslice pg) and so 
> would fail to actually turn of the unwanted subslices.
>
> But then I also though on ICL we deal with masks and not counts when 
> programming the hardware. Since apparently it is counts both for 
> slices and subslices, I am mightily confused as to how media-driver 
> would even theoretically be able to turn off a _specific_ (sub)slice?!


The fact that the feature needed isn't implemented at by the thread 
dispatcher is really strange to me too.

It sounds like we're forced to use a bigger hammer than what we really need.

As to how that maps to the right subslices is also unknown to me.

The only explanation I have is that subslices with no VME samplers get 
turn off first in the list of subslices to turn off.


Cheers,


-

Lionel


>
> Regards,
>
> Tvrtko
>
>>
>>>       }
>>>       if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
>>
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20180822/210fdd7e/attachment.html>


More information about the Intel-gfx mailing list