[Intel-gfx] [PATCH v2] drm/i915/bxt: Broxton decoupled MMIO

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Mon Sep 26 20:23:06 UTC 2016


On 26/09/2016 12:08, Paneri, Praveen wrote:
>
> On 9/23/2016 3:19 PM, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 19/09/2016 18:15, Praveen Paneri wrote:
>>

[snip]

>>
>>> +
>>>   enum forcewake_domains
>>>   intel_uncore_forcewake_for_reg(struct drm_i915_private *dev_priv,
>>>                      i915_reg_t reg, unsigned int op);
>>> @@ -2854,6 +2864,7 @@ struct drm_i915_cmd_table {
>>>   #define GT_FREQUENCY_MULTIPLIER 50
>>>   #define GEN9_FREQ_SCALER 3
>>>   +#define HAS_DECOUPLED_MMIO(dev_priv) (IS_BROXTON(dev_priv) &&
>>> IS_BXT_REVID(dev_priv, BXT_REVID_C0, REVID_FOREVER))
>>
>> There is a recent patch series from Carlos Santa which moved all these
>> type of things to device info. So I think you have to make this
>> compliant with that new style.
> I looked into it. Could you suggest where can I add the check for BXT 
> C0 revision?
> Will this be okay
> #define HAS_DECOUPLED_MMIO(dev) (INTEL_INFO(dev)->has_decoupled_mmio 
> && IS_BXT_REVID(to_i915(dev), BXT_REVID_C0, REVID_FOREVER))

Good point. I suggest a quick chat with Carlos then to see what was the 
plan for situation like this one.

[snip]

>>
>>> +
>>> +    ctrl_reg_data |= reg;
>>> +    ctrl_reg_data |= (operation << GEN9_DECOUPLED_OP_SHIFT);
>>> +    ctrl_reg_data |= (pd_engine << GEN9_DECOUPLED_PD_SHIFT);
>>> +    __raw_i915_write32(dev_priv, GEN9_DECOUPLED_REG0_DW1,
>>> ctrl_reg_data);
>>> +
>>> +    ctrl_reg_data |= GEN9_DECOUPLED_DW1_GO;
>>> +    __raw_i915_write32(dev_priv, GEN9_DECOUPLED_REG0_DW1,
>>> ctrl_reg_data);
>>> +
>>> +    if (wait_for_atomic((__raw_i915_read32(dev_priv,
>>> +            GEN9_DECOUPLED_REG0_DW1) & GEN9_DECOUPLED_DW1_GO) == 0,
>>> +            FORCEWAKE_ACK_TIMEOUT_MS))
>>> +        DRM_ERROR("Decoupled MMIO wait timed out\n");
>>> +
>>
>> Is FORCEWAKE_ACK_TIMEOUT_MS correct/relevant for decoupled MMIO? It is
>> potentially quite a long atomic wait.
> This is max wait time. We might not wait for that long but I can still 
> check on it.

Cool, I do think we need to know and document this in the commit and/or 
code comments where a brief explanation of decoupled mmio will be.

>>
>> Would it be worth returning some fixed value in the case of a timeout?
>> Might be better than random stuff? (applies in the 64-bit read case)
> This is same as forcewake implementation. If we change it, what would 
> be the appropriate fixed value to be returned?

Another good point. In that case I suppose it doesn't matter so can 
leave it like it was. It can only theoretically affect 64-bit reads, yes?

>>
>>> +    if (operation == GEN9_DECOUPLED_OP_READ)
>>> +        *ptr_data = __raw_i915_read32(dev_priv,
>>> +                GEN9_DECOUPLED_REG0_DW0);
>>> +}
>>> +
>>>   #define GEN2_READ_HEADER(x) \
>>>       u##x val = 0; \
>>>       assert_rpm_wakelock_held(dev_priv);
>>> @@ -892,6 +928,20 @@ static inline void __force_wake_auto(struct
>>> drm_i915_private *dev_priv,
>>>           dev_priv->uncore.funcs.force_wake_get(dev_priv, fw_domains);
>>>   }
>>>   +static inline bool __is_forcewake_active(struct drm_i915_private
>>> *dev_priv,
>>> +                     enum forcewake_domains fw_domains)
>>> +{
>>> +    struct intel_uncore_forcewake_domain *domain;
>>> +
>>> +    /* Ideally GCC would be constant-fold and eliminate this loop */
>>> +    for_each_fw_domain_masked(domain, fw_domains, dev_priv) {
>>> +        if (domain->wake_count)
>>> +            fw_domains &= ~domain->mask;
>>> +    }
>>> +
>>> +    return fw_domains ? 0 : 1;
>>> +}
>>> +
>>>   #define __gen6_read(x) \
>>>   static u##x \
>>>   gen6_read##x(struct drm_i915_private *dev_priv, i915_reg_t reg, bool
>>> trace) { \
>>> @@ -940,6 +990,37 @@ gen9_read##x(struct drm_i915_private *dev_priv,
>>> i915_reg_t reg, bool trace) { \
>>>       GEN6_READ_FOOTER; \
>>>   }
>>>   +#define __gen9_decoupled_read(x) \
>>> +static u##x \
>>> +gen9_decoupled_read##x(struct drm_i915_private *dev_priv, i915_reg_t
>>> reg, bool trace) { \
>>> +    enum forcewake_domains fw_engine; \
>>
>> fw_engines
>>
>>> +    GEN6_READ_HEADER(x); \
>>> +    fw_engine = __gen9_reg_read_fw_domains(offset); \
>>> +    if (fw_engine && x%32 == 0) { \
>>> +        if (__is_forcewake_active(dev_priv, fw_engine)) \
>>> +            __raw_i915_write##x(dev_priv, reg, val); \
>>
>> Write in the read macro, I don't understand!?
> typo, I will fix it.
>>
>> Also, would a single mmio read call be possible, something like below?
>>
>> if (x % 32 || !fw_engines || __is_forcewake_active()) {
>>     if (fw_engines)
>>         __force_wake_auto
>>      __raw_i915_read
>> } else {
>>     ... decoupled mmio loop ...
>> }
>>
>> I might have made an oversight, no guarantees that I haven't. :)
> Looks fine. Will test this

Cool, the only thing which would still bug me here is the verboseness of 
__is_forcewake_active but I have no smart ideas for that at the moment. 
Perhaps keeping a bitmask of active domains in the core? Could be worth 
it if for nothing for elegance.

Regards,

Tvrtko



More information about the Intel-gfx mailing list