[Intel-gfx] [PATCH 02/21] drm/i915/tgl: Fix macros for TGL SOC based WA

Jani Nikula jani.nikula at intel.com
Tue Nov 24 14:20:40 UTC 2020


On Tue, 24 Nov 2020, Lucas De Marchi <lucas.demarchi at intel.com> wrote:
> On Mon, Nov 23, 2020 at 05:32:22PM -0800, Aditya Swarup wrote:
>>On 11/18/20 1:18 AM, Jani Nikula wrote:
>>> On Tue, 17 Nov 2020, Lucas De Marchi <lucas.demarchi at intel.com> wrote:
>>>> On Tue, Nov 17, 2020 at 10:50:10AM -0800, Aditya Swarup wrote:
>>>>> @@ -1579,9 +1579,9 @@ static inline const struct i915_rev_steppings *
>>>>> tgl_revids_get(struct drm_i915_private *dev_priv)
>>>>> {
>>>>> 	if (IS_TGL_U(dev_priv) || IS_TGL_Y(dev_priv))
>>>>> -		return tgl_uy_revids;
>>>>> +		return tgl_uy_revids + INTEL_REVID(dev_priv);
>>>>
>>>> oohh, no. You have to at least check you are not accessing out of
>>>> bounds. New HW running on old kernel should not access create invalid
>>>> accesses like this.
>>>
>>> And this is just one reason why exposing arrays directly as an interface
>>> to the rest of the driver is a bad idea. Basically I look at *all*
>>> externs in the driver with suspicion, and they're all exceptions that
>>> should not be repeated. The revid arrays are a direct invitation to keep
>>> adding more and more extern arrays. And more ways to go out of bounds.
>>
>>We definitely need an array table for the SOC -> Display, GT stepping mapping.
>
> the mapping could be very well in the define iff you don't have
> different mappings per sku as is the case with TGL. Example:
>
> #define ADLS_REVID_A0		0
> #define ADLS_REVID_A1		5
>
> #define ADLS_DISP_REVID_A0	0
> #define ADLS_DISP_REVID_B0	5
>
> The actual value is actually the *SoC* revid, regardless the name of the
> macro. Since we already have to use a different macro -
> IS_DISP_REVID() - I don't think this is much worse and would allow us to
> get rid of the table *for ADL-S*, at the expense of having to pass as
> argument the ADLS_DISP_REVID_*.  However this doesn't apply to TGL as TGL
> has a different mapping per sku.
>
>
>>SOC steppings were usually the same as display steppings/GT steppings until TGL and therefore
>>didn't require special mapping cases. But from TGL onwards, we have different combinations of
>>Disp and GT steppings per SOC stepping. Alderlake-S makes this direct mapping even more difficult
>>without the array requiring more macros to deal with SOC -> DISP/GT stepping differences.
>>
>>Will fix the array bound checks but the possibility of SOC revision id from drm struct going
>>out of bounds is minimal. Can only happen if we don't have support for latest SOC -> Disp/GT table
>
> this is very common. It's just a matter of trying to run a slightly old
> kernel in a slightly newer rev of the hardware.

Indeed. All kernels released with the arrays are simply bust for any new
hardware revisions. They'll need a minimal Cc: stable fix.

Here's something I drafted [1] to fix the situation more
generally. There are still some issues to overcome, though they exist
already in the current code.

This could be followed up with converting *all* platforms to the scheme,
making it universal, regardless of whether the revids in the hardware
are consecutive or not.

BR,
Jani.


[1] https://cgit.freedesktop.org/~jani/drm/log/?h=revid-stepping-scheme




>
>>for TGL from Bspec and if we are picking up wrong revision id from drm struct that means the platform
>>information obtained itself is wrong which will be a general platform problem unrelated to Gfx driver.
>
> Nothing else should really be a problem. We don't really use the revid
> much, mostly for WAs. And if other parts of the kernel are trying to use
> the SoC revid, then they are reading that info themselves, not using
> something we read.
>
> We are simply reading the revid from hardware and using that value
> without checking and that needs to change.
>
>
>>
>>>
>>> I'd rather we seek for ways to either nuke the revid arrays altogether,
>>> or encapsulate them within a .c file with static scope.
>>
>>I don't think we should nuke the revid arrays but I agree with finding a more appropriate place to
>>parse the gt/display stepping info. This should be an exercise for a later patch that takes
>>care of kbl,tgl and adl-s mappings.
>>
>>>
>>> And for that .c file... the arrays are now in gt/intel_workarounds.c
>>> which is a really weird place for stuff that's used for generic stepping
>>> info, and particularly for *display* stepping info.
>>
>>I agree and we can change the approach with a different patch later.
>>
>>>
>>> BR,
>>> Jani.
>>>
>>>
>>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Graphics Center


More information about the Intel-gfx mailing list