[Intel-gfx] [CI 11/15] drm/i915/huc: track delayed HuC load with a fence

Ceraolo Spurio, Daniele daniele.ceraolospurio at intel.com
Mon Nov 7 18:38:14 UTC 2022



On 11/4/2022 6:27 PM, Brian Norris wrote:
> Hi,
>
> On Fri, Nov 04, 2022 at 05:49:54PM -0700, Ceraolo Spurio, Daniele wrote:
>> On 11/4/2022 5:38 PM, Ceraolo Spurio, Daniele wrote:
>>> On 11/4/2022 4:26 PM, Brian Norris wrote:
>>>> Did you track this down? Or consider reverting? This is tripping me up
>>> No. I didn't manage to repro locally after Tvrtko reported it (I ran the
>>> full selftest suite twice on both ADL-S and DG2 with the debug config
>>> enabled), so I was keeping an eye out as suggested to see if it popped
>>> out again. If you can repro this consistently, can you share your setup
>>> info? What platform you're running on, if you're using the latest
>>> drm-tip, any non-default params you're using, etc. Dmesg would also be
>>> useful to see if there are other errors before this one.
>>>
>> Just to further clarify, this issue is also not showing up in our CI runs
>> (which do have both the DEBUG_OBJECTS kconfigs you pointed out enabled),
>> hence why I'm suspecting that this is only happening on specific setups,
>> potentially due to a different kconfig or modparam being involved.
> Huh, well join the crowd. I'm currently hunting through ways to
> reproduce the CI runs, which are complaining about a different patch of
> mine...
> ...and I can't reproduce :)
>
> Anyway, I'm running on a GLK Chromebook. I have to do some minimal
> tweaking to get the average ChromeOS setup to work (basically, neuter
> the display manager and boot splash, so DRM/drivers can release
> cleanly), but then the IGT tools run as normal. Attaching dmesg and
> .config.
>
> Test sequence:
>
>    igt-gpu-tools/i915_module_load --run-subtest reload  ## this first one is probably unnecessary
>    igt-gpu-tools/gem_exec_gttfill --run-subtest basic
>    igt-gpu-tools/i915_module_load --run-subtest reload
>
> I'm running drm-tip, at:
>
>    a397a9098fb3 drm-tip: 2022y-11m-04d-19h-23m-35s UTC integration manifest
>
> I doubt too much of the ChromeOS setup itself is uniquely interesting,
> other than perhaps that we run a simple 'frecon' console [1] that I had
> to kill first (so, it probably touched/released some buffers).
>
> Brian
>
> [1] https://chromium.googlesource.com/chromiumos/platform/frecon/+/HEAD/README.md

OK, I think I have an idea of what's happening: if HuC is not enabled,
intel_huc_fini() returns early and skips the call to
i915_sw_fence_fini(), so we leak the fence's debug object. Can you
check whether the diff below fixes the issue for you?

Thanks,
Daniele

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
index fbc8bae14f76..e3bbd174889d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c
@@ -300,13 +300,12 @@ int intel_huc_init(struct intel_huc *huc)

 void intel_huc_fini(struct intel_huc *huc)
 {
-       if (!intel_uc_fw_is_loadable(&huc->fw))
-               return;
-
        delayed_huc_load_complete(huc);

        i915_sw_fence_fini(&huc->delayed_load.fence);
-       intel_uc_fw_fini(&huc->fw);
+
+       if (intel_uc_fw_is_loadable(&huc->fw))
+               intel_uc_fw_fini(&huc->fw);
 }
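
For anyone who wants to see the suspected imbalance outside the driver,
here is a rough userspace sketch of it. Nothing in it is i915 code: the
fake_* names, the live_debug_objects counter (a stand-in for the
DEBUG_OBJECTS tracker) and leaked_after_cycle() are all made up for
illustration. It also assumes the fence is set up unconditionally at
init time, which is what the leak implies but is not shown in the diff
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the DEBUG_OBJECTS tracker: objects still registered. */
static int live_debug_objects;

struct fake_fence { bool tracked; };

struct fake_huc {
	bool fw_loadable;		/* models intel_uc_fw_is_loadable() */
	struct fake_fence fence;	/* models delayed_load.fence */
};

static void fake_fence_init(struct fake_fence *f)
{
	assert(!f->tracked);
	f->tracked = true;
	live_debug_objects++;
}

static void fake_fence_fini(struct fake_fence *f)
{
	assert(f->tracked);
	f->tracked = false;
	live_debug_objects--;
}

/* Assumption: the fence is initialised whether or not the firmware is loadable. */
static void fake_huc_init(struct fake_huc *huc, bool loadable)
{
	huc->fw_loadable = loadable;
	huc->fence.tracked = false;
	fake_fence_init(&huc->fence);
}

/* Shape of the old fini: the early return also skips the fence teardown. */
static void fake_huc_fini_before(struct fake_huc *huc)
{
	if (!huc->fw_loadable)
		return;

	fake_fence_fini(&huc->fence);
}

/* Shape of the proposed fini: the fence is always torn down. */
static void fake_huc_fini_after(struct fake_huc *huc)
{
	fake_fence_fini(&huc->fence);
	/* firmware teardown would go here, guarded by huc->fw_loadable */
}

/* Runs one init/fini cycle and returns how many objects are left tracked. */
static int leaked_after_cycle(bool loadable, bool patched)
{
	struct fake_huc huc;

	live_debug_objects = 0;
	fake_huc_init(&huc, loadable);
	if (patched)
		fake_huc_fini_after(&huc);
	else
		fake_huc_fini_before(&huc);

	return live_debug_objects;
}
```

In this model only the "not loadable, unpatched" cycle leaves an object
behind, which would match the report showing up on module reload only
on setups where HuC is not enabled.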
