[Intel-gfx] [PATCH v9 6/7] drm/i915/guc: Check the locking status of GuC WOPCM registers

Sun Feb 11 00:36:23 UTC 2018

On 02/09/2018 02:47 AM, Joonas Lahtinen wrote:
>> +++ b/drivers/gpu/drm/i915/intel_guc_wopcm.c
>> +static inline bool guc_wopcm_locked(struct intel_guc *guc)
>> +{
>> +       struct drm_i915_private *i915 = guc_to_i915(guc);
>> +       bool size_reg_locked = __reg_locked(i915, GUC_WOPCM_SIZE);
>> +       bool offset_reg_locked = __reg_locked(i915, DMA_GUC_WOPCM_OFFSET);
>> +
>> +       return size_reg_locked && offset_reg_locked;
> Hmm, isn't this supposed to be || instead, if either of the registers is
> locked, we can't really do much?
It is an issue!

The intention was letting following GEM_BUG_ON to fail the
"only one reg" locked case. However, after adding the validation
of the already locked reg values, I think we need update the logic
code a little bit (to use the locked reg if it contained the valid value).

Also, I think it makes sense to introduce some helper
functions as you mentioned below to make things clearer.
so I will rework this part of code to make it clearer and address
this issue at the same time.

Thanks & Regards,
-Jackie
>> +static inline void guc_wopcm_hw_update(struct intel_guc *guc)
>> +{
>> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +       u32 offset_reg_val;
>> +
>> +       /* GuC WOPCM registers should be unlocked at this point. */
>> +       GEM_BUG_ON(__reg_locked(dev_priv, GUC_WOPCM_SIZE));
>> +       GEM_BUG_ON(__reg_locked(dev_priv, DMA_GUC_WOPCM_OFFSET));
>> +
>> +       offset_reg_val = guc->wopcm.offset;
>> +       if (guc->wopcm.need_load_huc_fw)
> "load_huc_fw" would read better here.
>
>> +               offset_reg_val |= HUC_LOADING_AGENT_GUC;
>> +
>> +       I915_WRITE(GUC_WOPCM_SIZE, guc->wopcm.size);
>> +       I915_WRITE(DMA_GUC_WOPCM_OFFSET, offset_reg_val);
>> +
>> +       GEM_BUG_ON(!__reg_locked(dev_priv, GUC_WOPCM_SIZE));
>> +       GEM_BUG_ON(!__reg_locked(dev_priv, DMA_GUC_WOPCM_OFFSET));
> This could use static inline write_lockable_reg() helper with a return value.
> I think we need to propagate the error all the way up the chain, as it's
> not a driver programmer error if these registers are locked, instead it is
> likely to be a BIOS problem.
>
> In the helper we could also test if the value is locked but happens to
> match what we were intending to write, then we could call it success and
> maybe emit an INFO message.
>
>> +}
>> +
>> +static inline bool guc_wopcm_regs_valid(struct intel_guc *guc)
>> +{
>> +       struct drm_i915_private *dev_priv = guc_to_i915(guc);
>> +       u32 size, offset;
>> +       bool guc_loads_huc;
>> +       u32 reg_val;
>> +
>> +       reg_val = I915_READ(GUC_WOPCM_SIZE);
>> +       /* GuC WOPCM size should be always multiple of 4K pages. */
>> +       size = reg_val & PAGE_MASK;
>> +
>> +       reg_val = I915_READ(DMA_GUC_WOPCM_OFFSET);
>> +       guc_loads_huc = reg_val & HUC_LOADING_AGENT_GUC;
>> +       offset = reg_val & GUC_WOPCM_OFFSET_MASK;
>> +
>> +       if (guc->wopcm.need_load_huc_fw && !guc_loads_huc)
>> +               return false;
>> +
>> +       return (size == guc->wopcm.size) && (offset == guc->wopcm.offset);
>> +}
> This seems to have much of what I wrote above already. I'd still split
> the actual writes in a helper that will report the per-register a
> WARN about "mismatch between locked register value and computed value"
> or so. Dumping both the computed an actual register value.
>
>> @@ -147,6 +211,8 @@ int intel_guc_wopcm_init(struct intel_guc_wopcm *guc_wopcm, u32 guc_fw_size,
>>          guc->wopcm.offset = offset;
>>          guc->wopcm.size = size;
>>          guc->wopcm.top = top;
>> +       /* Use GuC to load HuC firmware if HuC firmware is present. */
> Comment is again more on the "what" side, so can be dropped.
>
>> +       guc->wopcm.need_load_huc_fw = huc_fw_size ? 1 : 0;
> 	... = huc_fw_size > 0;
>
> Will literally read as "set guc load_huc_fw if huc_fw_size is
> greater than zero."
>
>> +int intel_guc_wopcm_init_hw(struct intel_guc_wopcm *guc_wopcm)
>> +{
>> +       struct intel_guc *guc = guc_wopcm_to_guc(guc_wopcm);
>> +       bool locked = guc_wopcm_locked(guc);
>> +
>> +       GEM_BUG_ON(!guc->wopcm.valid);
>> +
>> +       /*
>> +        * Bug if driver hasn't updated the HW Registers and GuC WOPCM has been
>> +        * locked. Return directly if WOPCM was locked and we have updated
>> +        * the registers.
>> +        */
>> +       if (locked) {
>> +               if (guc->wopcm.hw_updated)
>> +                       return 0;
>> +
>> +               /*
>> +                * Mark as updated if registers contained correct values.
>> +                * This will happen while reloading the driver module without
>> +                * rebooting the system.
>> +                */
>> +               if (guc_wopcm_regs_valid(guc))
>> +                       goto out;
>> +
>> +               /* Register locked without valid values. Abort HW init. */
>> +               return -EINVAL;
>> +       }
>> +
>> +       /* Always update registers when GuC WOPCM is not locked. */
>> +       guc_wopcm_hw_update(guc);
> This flow will become more clear with the helpers, and we will get a
> splat in the kernel message with less code about which register actually
> was locked and how the value mismatched.
>
> I'm up for discussion, but to my eyes the code flow would become more clear if
> we start with the assumption "we can write the register values to what we have
> calculated", and then make it an error that is propagated back up if we can't.
>
> Regards, Joonas