[Intel-gfx] [PATCH 00/15] HuC loading for DG2
Ceraolo Spurio, Daniele
daniele.ceraolospurio at intel.com
Mon Jun 13 17:06:33 UTC 2022
On 6/13/2022 9:56 AM, Tvrtko Ursulin wrote:
>
> On 13/06/2022 17:41, Ceraolo Spurio, Daniele wrote:
>> On 6/13/2022 9:31 AM, Tvrtko Ursulin wrote:
>>>
>>> On 13/06/2022 16:39, Ceraolo Spurio, Daniele wrote:
>>>> On 6/13/2022 1:16 AM, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/06/2022 00:19, Daniele Ceraolo Spurio wrote:
>>>>>> On DG2, HuC loading is performed by the GSC, via a PXP command.
>>>>>> The load
>>>>>> operation itself is relatively simple (just send a message to the
>>>>>> GSC
>>>>>> with the physical address of the HuC in LMEM), but there are timing
>>>>>> changes that require special attention. In particular, to send a
>>>>>> PXP
>>>>>> command we need to first export the GSC driver and then wait for the
>>>>>> mei-gsc and mei-pxp modules to start, which means that HuC load will
>>>>>> complete after i915 load is complete. This means that there is a
>>>>>> small
>>>>>> window of time after i915 is registered and before HuC is loaded
>>>>>> during which userspace could submit and/or check the HuC load
>>>>>> status,
>>>>>> although this is quite unlikely to happen (HuC is usually loaded
>>>>>> before
>>>>>> kernel init/resume completes).
>>>>>> We've consulted with the media team in regards to how to handle
>>>>>> this and
>>>>>> they've asked us to do the following:
>>>>>>
>>>>>> 1) Report HuC as loaded in the getparam IOCTL even if load is
>>>>>> still in
>>>>>> progress. The media driver uses the IOCTL as a way to check if
>>>>>> HuC is
>>>>>> enabled and then includes a secondary check in the batches to get
>>>>>> the
>>>>>> actual status, so doing it this way allows userspace to keep working
>>>>>> without changes.
>>>>>>
>>>>>> 2) Stall all userspace VCS submission until HuC is loaded. Stalls
>>>>>> are
>>>>>> expected to be very rare (if any), due to the fact that HuC is
>>>>>> usually
>>>>>> loaded before kernel init/resume is completed.
>>>>>
>>>>> Motivation to add these complications into i915 is not clear to
>>>>> me here. I mean there is no HuC on DG2 _yet_ is the premise of the
>>>>> series, right? So no backwards compatibility concerns. In this
>>>>> case why jump through the hoops and not let userspace handle all
>>>>> of this by just leaving the getparam return the true status?
>>>>
>>>> The main areas impacted by the fact that we can't guarantee that
>>>> HuC load is complete when i915 starts accepting submissions are
>>>> boot and suspend/resume, with the latter being the main problem; GT
>>>> reset is not a concern because HuC now survives it. A
>>>> suspend/resume can be transparent to userspace and therefore the
>>>> HuC status can temporarily flip from loaded to not without
>>>> userspace knowledge, especially if we start going into deeper
>>>> suspend states and start causing HuC resets when we go into runtime
>>>> suspend. Note that this is different from what happens during GT
>>>> reset for older platforms, because in that scenario we guarantee
>>>> that HuC reload is complete before we restart the submission
>>>> back-end, so userspace doesn't notice that the HuC status changes.
>>>> We had an internal discussion about this problem with both media
>>>> and i915 archs and the conclusion was that the best option is for
>>>> i915 to stall media submission while HuC (re-)load is in progress.
>>>
>>> Resume is potentially a good reason - I did not pick up on that from
>>> the cover letter. I read the statement about the unlikely and small
>>> window where HuC is not loaded during kernel init/resume and I guess
>>> did not pick up on the resume part.
>>>
>>> Waiting for GSC to load HuC from i915 resume is not an option?
>>
>> GSC is an aux device exported by i915, so AFAIU GSC resume can't
>> start until i915 resume completes.
>
> I'll dig into this in the next few days since I want to understand how
> exactly it works. Or someone can help explain.
>
> If in the end the conclusion is that i915 resume indeed cannot wait
> for GSC, then I think auto-blocking of queued up contexts on media
> engines indeed sounds unavoidable. Otherwise, as you explained, user
> experience post resume wouldn't be good.
Even if we could implement a wait, I'm not sure we should. GSC resume
and HuC reload take ~300ms in most cases, and I don't think we want to
block within the i915 resume path for that long.
>
> However, do we really need to lie in the getparam? How about extending
> it or adding a new one to separate the loading vs loaded states? Since
> userspace does not support DG2 HuC yet this should be doable.
I don't really have a preference here. The media team asked us to do it
this way because they wouldn't have a use for the different "in
progress" and "done" states. If they're ok with having separate flags
that's fine by me.
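
To make the two options concrete, here is a rough sketch of how the reported
value would differ. All names and values below are hypothetical and only
meant to illustrate the discussion; they are not the existing i915 uAPI and
not what the series implements. The stall helper assumes that the kernel
holds back VCS submissions only while a load is in flight.

```c
#include <stdbool.h>

/*
 * Hypothetical internal HuC states. Names and values are illustrative
 * only; they are NOT part of the i915 uAPI.
 */
enum huc_load_state {
	HUC_STATE_DISABLED = 0,	/* HuC not supported or not enabled */
	HUC_STATE_LOADED   = 1,	/* load and authentication complete */
	HUC_STATE_LOADING  = 2,	/* load via GSC still in progress */
};

/*
 * Option A (what the media team asked for): a single "enabled" answer,
 * so an in-progress load is reported the same way as a completed one.
 */
static int huc_status_merged(enum huc_load_state s)
{
	return s != HUC_STATE_DISABLED;
}

/*
 * Option B (separate states): userspace can tell "loading" apart from
 * "loaded". Here the internal state is simply passed through.
 */
static int huc_status_split(enum huc_load_state s)
{
	return (int)s;
}

/*
 * Kernel-side decision that is needed with either option: media (VCS)
 * submissions are held back only while the load is still in flight.
 */
static bool vcs_submission_must_wait(enum huc_load_state s)
{
	return s == HUC_STATE_LOADING;
}
```

With option A, huc_status_merged(HUC_STATE_LOADING) and
huc_status_merged(HUC_STATE_LOADED) both return 1, so existing userspace
checks keep working unchanged. With option B the two states return different
values, which only helps if userspace has a use for the distinction.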
Tony, any feedback here?
Thanks,
Daniele
>
>>> Will there be runtime suspend happening on the GSC device behind
>>> i915's back, or i915 and GSC will always be able to transition the
>>> states in tandem?
>>
>> They're always in sync. The GSC is part of the same HW PCI device as
>> the rest of the GPU, so they change HW state together.
>
> Okay thanks, I wasn't sure if it is the same or separate device.
>
> Regards,
>
> Tvrtko