[Intel-gfx] [PATCH 00/15] HuC loading for DG2

Ceraolo Spurio, Daniele daniele.ceraolospurio at intel.com
Mon Jun 13 17:06:33 UTC 2022



On 6/13/2022 9:56 AM, Tvrtko Ursulin wrote:
>
> On 13/06/2022 17:41, Ceraolo Spurio, Daniele wrote:
>> On 6/13/2022 9:31 AM, Tvrtko Ursulin wrote:
>>>
>>> On 13/06/2022 16:39, Ceraolo Spurio, Daniele wrote:
>>>> On 6/13/2022 1:16 AM, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 10/06/2022 00:19, Daniele Ceraolo Spurio wrote:
>>>>>> On DG2, HuC loading is performed by the GSC, via a PXP command. The
>>>>>> load operation itself is relatively simple (just send a message to
>>>>>> the GSC with the physical address of the HuC in LMEM), but there are
>>>>>> timing changes that require special attention. In particular, to
>>>>>> send a PXP command we need to first export the GSC driver and then
>>>>>> wait for the mei-gsc and mei-pxp modules to start, which means that
>>>>>> HuC load will complete after i915 load is complete. This means that
>>>>>> there is a small window of time after i915 is registered and before
>>>>>> HuC is loaded during which userspace could submit and/or check the
>>>>>> HuC load status, although this is quite unlikely to happen (HuC is
>>>>>> usually loaded before kernel init/resume completes).
>>>>>> We've consulted with the media team with regard to how to handle
>>>>>> this and they've asked us to do the following:
>>>>>>
>>>>>> 1) Report HuC as loaded in the getparam IOCTL even if load is still
>>>>>> in progress. The media driver uses the IOCTL as a way to check if
>>>>>> HuC is enabled and then includes a secondary check in the batches to
>>>>>> get the actual status, so doing it this way allows userspace to keep
>>>>>> working without changes.
>>>>>>
>>>>>> 2) Stall all userspace VCS submission until HuC is loaded. Stalls
>>>>>> are expected to be very rare (if any), due to the fact that HuC is
>>>>>> usually loaded before kernel init/resume is completed.
>>>>>
>>>>> The motivation for adding these complications to i915 is not clear
>>>>> to me here. I mean, the premise of the series is that there is no
>>>>> HuC on DG2 _yet_, right? So no backwards compatibility concerns. In
>>>>> this case why jump through the hoops and not let userspace handle
>>>>> all of this by just letting the getparam return the true status?
>>>>
>>>> The main areas impacted by the fact that we can't guarantee that 
>>>> HuC load is complete when i915 starts accepting submissions are 
>>>> boot and suspend/resume, with the latter being the main problem; GT 
>>>> reset is not a concern because HuC now survives it. A 
>>>> suspend/resume can be transparent to userspace and therefore the 
>>>> HuC status can temporarily flip from loaded to not without 
>>>> userspace knowledge, especially if we start going into deeper 
>>>> suspend states and start causing HuC resets when we go into runtime 
>>>> suspend. Note that this is different from what happens during GT 
>>>> reset for older platforms, because in that scenario we guarantee 
>>>> that HuC reload is complete before we restart the submission 
>>>> back-end, so userspace doesn't notice the HuC status change. 
>>>> We had an internal discussion about this problem with both media 
>>>> and i915 archs and the conclusion was that the best option is for 
>>>> i915 to stall media submission while HuC (re-)load is in progress.
>>>
>>> Resume is potentially a good reason - I did not pick up on that from 
>>> the cover letter. I read the statement about the unlikely and small 
>>> window where HuC is not loaded during kernel init/resume and I guess 
>>> did not pick up on the resume part.
>>>
>>> Waiting for GSC to load HuC from i915 resume is not an option?
>>
>> GSC is an aux device exported by i915, so AFAIU GSC resume can't 
>> start until i915 resume completes.
>
> I'll dig into this in the next few days since I want to understand how 
> exactly it works. Or someone can help explain.
>
> If the conclusion in the end is that i915 resume indeed cannot wait 
> for GSC, then I think auto-blocking of queued up contexts on media 
> engines indeed sounds unavoidable. Otherwise, as you explained, user 
> experience post resume wouldn't be good.

Even if we could implement a wait, I'm not sure we should. GSC resume 
and HuC reload take ~300ms in most cases; I don't think we want to 
block within the i915 resume path for that long.
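
To illustrate the alternative (resume returns right away and only media
submissions are held back until the HuC load completes), here is a toy
userspace model. It is only a sketch of the idea: struct media_gate and
the gate_*() functions are hypothetical names made up for this example,
not i915 code, and a real implementation would rely on the driver's
existing request/fence machinery rather than a fixed-size array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define GATE_CAPACITY 8

/* Hypothetical model of a media (VCS) engine gated on HuC load. */
struct media_gate {
	bool huc_loaded;              /* set once the (re-)load has completed */
	int pending[GATE_CAPACITY];   /* request ids held back during the load */
	size_t n_pending;
	int executed[GATE_CAPACITY];  /* request ids that ran, in order */
	size_t n_executed;
};

/* Record a request as executed (stands in for real HW submission). */
static void gate_run(struct media_gate *g, int request_id)
{
	if (g->n_executed < GATE_CAPACITY)
		g->executed[g->n_executed++] = request_id;
}

/* Resume returns immediately; the HuC load is still in flight. */
void gate_resume(struct media_gate *g)
{
	g->huc_loaded = false;
	g->n_pending = 0;
	g->n_executed = 0;
}

/* Run the request if HuC is loaded, otherwise hold it back. */
int gate_submit(struct media_gate *g, int request_id)
{
	if (g->huc_loaded) {
		gate_run(g, request_id);
		return 0;
	}
	if (g->n_pending == GATE_CAPACITY)
		return -1; /* toy model only: no room to queue more */
	g->pending[g->n_pending++] = request_id;
	return 0;
}

/* Called when the load completes: release held requests in order. */
void gate_huc_load_done(struct media_gate *g)
{
	size_t i;

	g->huc_loaded = true;
	for (i = 0; i < g->n_pending; i++)
		gate_run(g, g->pending[i]);
	g->n_pending = 0;
}
```

In this model the resume path never waits; only requests submitted
inside the load window are delayed, and they keep their original order.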

>
> However, do we really need to lie in the getparam? How about extending 
> it, or adding a new one, to separate the loading vs loaded states? 
> Since userspace does not support DG2 HuC yet this should be doable.

I don't really have a preference here. The media team asked us to do it 
this way because they wouldn't have a use for the different "in 
progress" and "done" states. If they're ok with having separate flags 
that's fine by me.
Tony, any feedback here?
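
For the sake of discussion, the two options could be sketched roughly as
below. This is only an illustration under my own assumptions: the enum,
its values and both function names are hypothetical and are not the
existing getparam uAPI.

```c
#include <assert.h>

/* Hypothetical HuC states; values chosen only for this example. */
enum huc_load_state {
	HUC_STATE_DISABLED = 0, /* HuC not supported or not enabled */
	HUC_STATE_LOADED = 1,   /* load has completed */
	HUC_STATE_LOADING = 2,  /* load requested via GSC, still in progress */
};

/*
 * Option from the cover letter: report "loaded" while the load is still
 * in progress, so userspace only ever sees 0 or 1.
 */
int huc_status_legacy(enum huc_load_state state)
{
	return state != HUC_STATE_DISABLED;
}

/*
 * Alternative: expose the real state so userspace can tell "loading"
 * apart from "loaded".
 */
int huc_status_extended(enum huc_load_state state)
{
	return (int)state;
}
```

With the first variant userspace keeps working unchanged; with the
second it would need to learn about the extra value.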

Thanks,
Daniele

>
>>> Will there be runtime suspend happening on the GSC device behind 
>>> i915's back, or will i915 and GSC always be able to transition 
>>> states in tandem?
>>
>> They're always in sync. The GSC is part of the same HW PCI device as 
>> the rest of the GPU, so they change HW state together.
>
> Okay thanks, I wasn't sure if it is the same or separate device.
>
> Regards,
>
> Tvrtko


