[RFC] drm/xe/guc: Don't support GuC older GuC 70.x releases
Lucas De Marchi
lucas.demarchi at intel.com
Wed Feb 7 20:40:21 UTC 2024
On Wed, Feb 07, 2024 at 10:34:07AM -0800, Daniele Ceraolo Spurio wrote:
>
>
>On 2/7/2024 8:42 AM, Lucas De Marchi wrote:
>>+Gustavo who is dealing with DMC firmware lately
>>
>>On Wed, Feb 07, 2024 at 03:30:59AM +0000, Matthew Brost wrote:
>>>On Tue, Feb 06, 2024 at 05:18:50PM -0800, John Harrison wrote:
>>>>On 2/6/2024 15:41, Daniele Ceraolo Spurio wrote:
>>>>> Supporting older GuC versions comes with baggage, both on the coding
>>>>> side (due to interfaces only being available from a certain version
>>>>> onwards) and on the testing side (due to having to make sure
>>>>the driver
>>>>> works as expected with older GuCs).
>>>>> Since all of our Xe platform are still under force probe, we haven't
>>>>> committed to support any specific GuC version and we therefore don't
>>>>> need to support the older once, which means that we can force
>>>>a bottom
>>>>> limit to what GuC we accept. This allows us to remove any conditional
>>>>> statements based on older GuC versions and also to approach newer
>>>>> additions knowing that we'll never attempt to load something older
>>>>> than our minimum requirement.
>>>>>
>>>>> RFC: this patch sets the minimum to the current GuC version (70.19),
>>>>> but that can be moved one way or the other. The main aim here is
>>>>Ideally, this would be bumped every time we update Xe to a newer
>>>>firmware
>>>>version right up to the point when force probe is lifted. At
>>>>that point it
>>>>becomes fixed and we have to add the version check support back in for
>>>>future w/a's and features.
>>>>
>>>>Get's my vote :).
>>
>>Yeah, but see my other reply... I think we will have to wait the
>>firmware being available in linux-firmware for that.
>>
>>Also, let's kickstart a discussion on our process with some
>>possible changes so we can get it documented. I think we have a good
>>opportunity here to start adopting the
>>https://gitlab.freedesktop.org/drm/firmware repo.
>>
>>Rough idea:
>>
>>1) use intel-staging branch with tags for pull requests to
>> linux-firmware, just like documented in their readme.
>> IMO the naming is rather unfortunate since it would be
>> good to use it for (2) below.... but since it's already used
>> we can use something else.
>>
>> this would mainly replace the use we have today for
>> https://cgit.freedesktop.org/drm/drm-firmware/ , which could
>>be retired. From upstream linux-firmware pov the only
>> change would be the remote location and that we start using tags
>> for the pull requests, coming from a single branch regardless of
>> the firmware (guc, huc, dmc, gsc): intel-staging. Once accepted in
>> linux-firmware, the branch is fast-forwarded.
>
>I think this needs a bit more fleshing out, because before we do a
>pull request, we do want to run CI on the blobs. Also, in several
>occasions we went through a couple of versions before we closed on
>what to push to linux-firmware (e.g. in the latest push we started
>with 70.19.1 but then pushed 70.19.2), so we can't go to intel-staging
>until we're actually ready to push. I think the process you have below
>for mmp blobs should work for this early testing flow as well, but we
>might end up with a lot of noise in the staging-intel-for-CI branch.
that would be a throw away branch where we push stuff to be able to test
on CI. I don't think the commit history matters much there. The fact
that the firmware is available to match what is in the kernel and that
there's a documented process for using it in my view trumps the
this downside.
what I expect would be, considering the LNL case as example:
1) Start testing with the mmp version:
a) Add firmware to drm/firmware intel-staging-for-CI
b) Add commit in topic/xe-for-CI on the kernel side to make
use of that firmware
2) Ooops, that has bugs
a) add a second mmp firmware to drm/firmware intel-staging-for-CI
b) replace commit in topic/xe-for-CI on the kernel side
3) we think we are good, let's try for real
a) Add lnl_guc_70.bin to drm/firmware intel-staging-for-CI
b) replace commit in topic/xe-for-CI on the kernel side
4) yay, it worked
a) Add that lnl_gu_70.bin firmware to intel-staging branch and
prepare pull request to linux-firmware
b) move patch from topic/xe-for-CI to drm-xe-next: i.e., rebase
topic/xe-for-CI on top of drm-xe-next leaving that commit as
first one. git push topic/xe-for-CI, dim push drm-xe-next (or
implement the logic in dim to push 2 branches)
We may need some time between (a) and (b) depending on where we
are on the kernel release cycle: we don't want to submit a
kernel pull request before the firmware is available @
linux-firmware repo.
Note that the fact we are using mmp makes it more complex, although
explicit. Going direct with lnl_gu_70.bin would also work and avoid
updating the commits on the kernel side.
>
>We also need some rules to handle the case where there is already a PR
>in flight and we need to push some more blobs. This might be as easy
>as the committer seeing that there are commits on top of master,
>replying to the previous PR to deprecate it, and then generating a new
>PR with all the blobs.
the pull requests to linux-firmware would come from tags, not a branch.
So you have (tip of the branch is on top):
o <intel-staging> intel: Add lnl_guc_70.bin
o <refs/tags/intel-2024-01-30> intel: Update dg2_guc_70.bin <-- last in flight pull request
o intel: Add lnl_dmc.bin
o <origin/main> .... <-- where linux-firmware is at
Looking at amd-staging, it seems to match what they are doing:
https://gitlab.freedesktop.org/drm/firmware/-/commits/amd-staging?ref_type=heads
see the amd-$DATE tags
>
>
>>
>>2) mmp firmware versions are only ever pushed to a separate
>>staging-intel-for-CI
>> branch. There is no pull request in the mailing for this. We can
>>either
>> push directly to the branch or create MRs in gitlab. CI would start
>> using this branch for the extra firmware for platforms instead of
>> whatever it's using today to process the pull requests from the
>> mailing list. Or whatever it's using, because I don't know and don't
>> see it documented anywhere.
>
>As long as the CI team is ok with this, I'm all for it.
>
>>
>> The patch on the kernel side to use the mmp firmware is only ever
>> pushed to the topic/xe-for-CI branch since a) the firmware is
>>coming from
>> a non-official location and b) end users and distro packaging
>> shouldn't see a warning when building the kernel due to a possibly
>> missing firmware
>>3) Raising firmware version requirement for past platforms used as
>> SDV can be done **unless** it raises the major version. That's because
>> end users would start seeing the warning that we avoided in (2).
>
>Who are the end users here? If we're talking about older
>non-officially supported platforms, the only users should be
>developers and they should be able to handle having to update the
>firmwares to a newer major versions.
distros and any developer outside Intel. The kernel build system is
unaware of xe.force_probe. So if you have, after the several macros:
MODULE_FIRMWARE("xe/tgl_guc_71.bin")
It will show up in `modinfo -f firmware xe`. And it will show as a
warning when installing/packaging a kernel.
It doesn't matter for minor/patch updates because the file name is
major-only and **running** with that module is protected by the
force_probe. The major may be updated when it's available in
linux-firmware, which means i915 started using it (for i915 that would
be "as an option, with fallback to the previous major release" of
course).
Lucas De Marchi
>
>Daniele
>
>>
>>thoughts?
>>
>>Lucas De Marchi
>>
>>>>
>>>
>>>Mine too.
>>>
>>>With that:
>>>Acked-by: Matthew Brost <matthew.brost at intel.com>
>>>
>>>>John.
>>>>
>>>>> agreeing to stop supporting very old GuC releases on the newer
>>>>driver.
>>>>>
>>>>> Signed-off-by: Daniele Ceraolo Spurio
>>>><daniele.ceraolospurio at intel.com>
>>>>> Cc: John Harrison <John.C.Harrison at Intel.com>
>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>>> Cc: Matt Roper <matthew.d.roper at intel.com>
>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/xe/xe_guc.c | 14 ++------------
>>>>> drivers/gpu/drm/xe/xe_uc_fw.c | 36
>>>>++++++++++++++---------------------
>>>>> 2 files changed, 16 insertions(+), 34 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc.c
>>>>b/drivers/gpu/drm/xe/xe_guc.c
>>>>> index 868208a39829..5e6b27aac495 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_guc.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_guc.c
>>>>> @@ -132,15 +132,10 @@ static u32 guc_ctl_ads_flags(struct
>>>>xe_guc *guc)
>>>>> return flags;
>>>>> }
>>>>> -#define GUC_VER(maj, min, pat) (((maj) << 16) | ((min) <<
>>>>8) | (pat))
>>>>> -
>>>>> static u32 guc_ctl_wa_flags(struct xe_guc *guc)
>>>>> {
>>>>> struct xe_device *xe = guc_to_xe(guc);
>>>>> struct xe_gt *gt = guc_to_gt(guc);
>>>>> - struct xe_uc_fw *uc_fw = &guc->fw;
>>>>> - struct xe_uc_fw_version *version =
>>>>&uc_fw->versions.found[XE_UC_FW_VER_RELEASE];
>>>>> -
>>>>> u32 flags = 0;
>>>>> if (XE_WA(gt, 22012773006))
>>>>> @@ -170,13 +165,8 @@ static u32 guc_ctl_wa_flags(struct xe_guc *guc)
>>>>> if (XE_WA(gt, 1509372804))
>>>>> flags |= GUC_WA_RENDER_RST_RC6_EXIT;
>>>>> - if (XE_WA(gt, 14018913170)) {
>>>>> - if (GUC_VER(version->major, version->minor,
>>>>version->patch) >= GUC_VER(70, 7, 0))
>>>>> - flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;
>>>>> - else
>>>>> - drm_dbg(&xe->drm, "Skip WA 14018913170: GUC
>>>>version expected >= 70.7.0, found %u.%u.%u\n",
>>>>> - version->major, version->minor, version->patch);
>>>>> - }
>>>>> + if (XE_WA(gt, 14018913170))
>>>>> + flags |= GUC_WA_ENABLE_TSC_CHECK_ON_RC6;
>>>>> return flags;
>>>>> }
>>>>> diff --git a/drivers/gpu/drm/xe/xe_uc_fw.c
>>>>b/drivers/gpu/drm/xe/xe_uc_fw.c
>>>>> index 4714f2c8d2ba..e5bf59616f3d 100644
>>>>> --- a/drivers/gpu/drm/xe/xe_uc_fw.c
>>>>> +++ b/drivers/gpu/drm/xe/xe_uc_fw.c
>>>>> @@ -296,36 +296,28 @@ static void uc_fw_fini(struct drm_device
>>>>*drm, void *arg)
>>>>> xe_uc_fw_change_status(uc_fw, XE_UC_FIRMWARE_SELECTED);
>>>>> }
>>>>> -static void guc_read_css_info(struct xe_uc_fw *uc_fw, struct
>>>>uc_css_header *css)
>>>>> +static int guc_read_css_info(struct xe_uc_fw *uc_fw, struct
>>>>uc_css_header *css)
>>>>> {
>>>>> struct xe_gt *gt = uc_fw_to_gt(uc_fw);
>>>>> struct xe_uc_fw_version *release =
>>>>&uc_fw->versions.found[XE_UC_FW_VER_RELEASE];
>>>>> struct xe_uc_fw_version *compatibility =
>>>>&uc_fw->versions.found[XE_UC_FW_VER_COMPATIBILITY];
>>>>> xe_gt_assert(gt, uc_fw->type == XE_UC_FW_TYPE_GUC);
>>>>> - xe_gt_assert(gt, release->major >= 70);
>>>>> -
>>>>> - if (release->major > 70 || release->minor >= 6) {
>>>>> - /* v70.6.0 adds CSS header support */
>>>>> - compatibility->major = FIELD_GET(CSS_SW_VERSION_UC_MAJOR,
>>>>> - css->submission_version);
>>>>> - compatibility->minor = FIELD_GET(CSS_SW_VERSION_UC_MINOR,
>>>>> - css->submission_version);
>>>>> - compatibility->patch = FIELD_GET(CSS_SW_VERSION_UC_PATCH,
>>>>> - css->submission_version);
>>>>> - } else if (release->minor >= 3) {
>>>>> - /* v70.3.0 introduced v1.1.0 */
>>>>> - compatibility->major = 1;
>>>>> - compatibility->minor = 1;
>>>>> - compatibility->patch = 0;
>>>>> - } else {
>>>>> - /* v70.0.0 introduced v1.0.0 */
>>>>> - compatibility->major = 1;
>>>>> - compatibility->minor = 0;
>>>>> - compatibility->patch = 0;
>>>>> +
>>>>> + /* We don't support GuC releases older than 70.19 */
>>>>> + if (release->major < 70 || (release->major == 70 &&
>>>>release->minor < 19)) {
>>>>> + xe_gt_err(gt, "Unsupported GuC v%u.%u! v70.19 or
>>>>newer is required\n",
>>>>> + release->major, release->minor);
>>>>> + return -EINVAL;
>>>>> }
>>>>> + compatibility->major = FIELD_GET(CSS_SW_VERSION_UC_MAJOR,
>>>>css->submission_version);
>>>>> + compatibility->minor = FIELD_GET(CSS_SW_VERSION_UC_MINOR,
>>>>css->submission_version);
>>>>> + compatibility->patch = FIELD_GET(CSS_SW_VERSION_UC_PATCH,
>>>>css->submission_version);
>>>>> +
>>>>> uc_fw->private_data_size = css->private_data_size;
>>>>> +
>>>>> + return 0;
>>>>> }
>>>>> int xe_uc_fw_check_version_requirements(struct xe_uc_fw *uc_fw)
>>>>> @@ -424,7 +416,7 @@ static int parse_css_header(struct
>>>>xe_uc_fw *uc_fw, const void *fw_data, size_t
>>>>> release->patch = FIELD_GET(CSS_SW_VERSION_UC_PATCH,
>>>>css->sw_version);
>>>>> if (uc_fw->type == XE_UC_FW_TYPE_GUC)
>>>>> - guc_read_css_info(uc_fw, css);
>>>>> + return guc_read_css_info(uc_fw, css);
>>>>> return 0;
>>>>> }
>>>>
>
More information about the Intel-xe
mailing list