[Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
Goel, Akash
akash.goel at intel.com
Fri Aug 12 16:17:54 UTC 2016
On 8/12/2016 7:37 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 14:45, Goel, Akash wrote:
>>
>>
>> On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>>>
>>> On 12/08/16 07:25, akash.goel at intel.com wrote:
>>>> From: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>>
>>>> GuC ukernel sends an interrupt to Host to flush the log buffer
>>>> and expects Host to correspondingly update the read pointer
>>>> information in the state structure, once it has consumed the
>>>> log buffer contents by copying them to a file or buffer.
>>>> Even if Host couldn't copy the contents, it can still update the
>>>> read pointer so that logging state is not disturbed on GuC side.
>>>>
>>>> v2:
>>>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>>>> - Reduce the overall log buffer copying time by skipping the copy of
>>>> crash buffer area for regular cases and copying only the state
>>>> structure data in first page.
>>>>
>>>> v3:
>>>> - Create a vmalloc mapping of log buffer. (Chris)
>>>> - Cover the flush acknowledgment under rpm get & put.(Chris)
>>>> - Revert the change of skipping the copy of crash dump area, as
>>>> not really needed, will be covered by subsequent patch.
>>>>
>>>> v4:
>>>> - Destroy the wq under the same condition in which it was created,
>>>> pass dev_piv pointer instead of dev to newly added GuC function,
>>>> add more comments & rename variable for clarity. (Tvrtko)
>>>>
>>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_drv.c | 14 +++
>>>> drivers/gpu/drm/i915/i915_guc_submission.c | 150
>>>> +++++++++++++++++++++++++++++
>>>> drivers/gpu/drm/i915/i915_irq.c | 5 +-
>>>> drivers/gpu/drm/i915/intel_guc.h | 3 +
>>>> 4 files changed, 170 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>>> b/drivers/gpu/drm/i915/i915_drv.c
>>>> index 0fcd1c0..fc2da32 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
>>>> drm_i915_private *dev_priv)
>>>> if (dev_priv->hotplug.dp_wq == NULL)
>>>> goto out_free_wq;
>>>>
>>>> + if (HAS_GUC_SCHED(dev_priv)) {
>>>
>>> This just reminded me that a previous patch had:
>>>
>>> + if (HAS_GUC_UCODE(dev))
>>> + dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>>>
>>> In the interrupt setup. I don't think there is a bug right now, but
>>> there is a disagreement between the two which would be good to resolve.
>>>
>>> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
>>> for correctness. I think.
>>
>> Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch.
>>
>> As per Chris's comments will move the wq init/destroy to the GuC logging
>> setup/teardown routines (guc_create_log_extras, guc_log_cleanup)
>> You are fine with that ?.
>
> Yes thats OK I think.
>
>>>
>>>> + /* Need a dedicated wq to process log buffer flush interrupts
>>>> + * from GuC without much delay so as to avoid any loss of
>>>> logs.
>>>> + */
>>>> + dev_priv->guc.log.wq =
>>>> + alloc_ordered_workqueue("i915-guc_log", 0);
>>>> + if (dev_priv->guc.log.wq == NULL)
>>>> + goto out_free_hotplug_dp_wq;
>>>> + }
>>>> +
>>>> return 0;
>>>>
>>>> +out_free_hotplug_dp_wq:
>>>> + destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>> out_free_wq:
>>>> destroy_workqueue(dev_priv->wq);
>>>> out_err:
>>>> @@ -782,6 +794,8 @@ out_err:
>>>>
>>>> static void i915_workqueues_cleanup(struct drm_i915_private
>>>> *dev_priv)
>>>> {
>>>> + if (HAS_GUC_SCHED(dev_priv))
>>>> + destroy_workqueue(dev_priv->guc.log.wq);
>>>> destroy_workqueue(dev_priv->hotplug.dp_wq);
>>>> destroy_workqueue(dev_priv->wq);
>>>> }
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index c7c679f..2635b67 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
>>>> intel_guc *guc,
>>>> return host2guc_action(guc, data, ARRAY_SIZE(data));
>>>> }
>>>>
>>>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>>>> +{
>>>> + u32 data[1];
>>>> +
>>>> + data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>>>> +
>>>> + return host2guc_action(guc, data, 1);
>>>> +}
>>>> +
>>>> /*
>>>> * Initialise, update, or clear doorbell data shared with the GuC
>>>> *
>>>> @@ -840,6 +849,127 @@ err:
>>>> return NULL;
>>>> }
>>>>
>>>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>>>> +{
>>>> + return;
>>>> +}
>>>> +
>>>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>>>> +{
>>>> + return NULL;
>>>> +}
>>>> +
>>>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>>>> +{
>>>> + struct guc_log_buffer_state *log_buffer_state,
>>>> *log_buffer_snapshot_state;
>>>> + struct guc_log_buffer_state log_buffer_state_local;
>>>> + void *src_data_ptr, *dst_data_ptr;
>>>> + u32 i, buffer_size;
>>>
>>> unsigned int i if you can be bothered.
>>
>> Fine will do that for both i & buffer_size.
>
> buffer_size can match the type of log_buffer_state_local.size or use
> something else if more appropriate.
>
>> But I remember earlier in one of the patch, you suggested to use u32 as
>> a type for some variables.
>> Please could you share the guideline.
>> Should u32, u64 be used we are exactly sure of the range of the
>> variable, like for variables containing the register values ?
>
> Depends what the variable is. "i" in this case is just a standard
> counter so not appropriate to use u32. It is not that wrong on x86, just
> looks a bit out of place.
>
> grep "u32 i;" has no hits in i915. :)
>
> They can/should be used for variables tied with hardware or protocols.
> Or in some cases where you really want to save space by using a small type.
>
Thanks will be mindful of this.
>>
>>>
>>>> +
>>>> + if (!guc->log.buf_addr)
>>>> + return;
>>>
>>> Can it hit this? If yes, I think better disable GuC logging when pin map
>>> on the object fails rather than let it generate interrupts in vain.
>>>
Ideally it should not, as logging itself is disabled (interrupts also
disabled) on log buffer allocation failure.
>>>> +
>>>> + /* Get the pointer to shared GuC log buffer */
>>>> + log_buffer_state = src_data_ptr = guc->log.buf_addr;
>>>> +
>>>> + /* Get the pointer to local buffer to store the logs */
>>>> + dst_data_ptr = log_buffer_snapshot_state =
>>>> guc_get_write_buffer(guc);
>>>> +
>>>> + /* Actual logs are present from the 2nd page */
>>>> + src_data_ptr += PAGE_SIZE;
>>>> + dst_data_ptr += PAGE_SIZE;
>>>> +
>>>> + for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
>>>> + /* Make a copy of the state structure in GuC log buffer (which
>>>> + * is uncached mapped) on the stack to avoid reading from it
>>>> + * multiple times.
>>>> + */
>>>> + memcpy(&log_buffer_state_local, log_buffer_state,
>>>> + sizeof(struct guc_log_buffer_state));
>>>> + buffer_size = log_buffer_state_local.size;
>>>
>>> Needs checking (and logging) that a bad firmware or some other error
>>> does not report a dangerous size (too big) which would then overwrite
>>> memory pointed by dst_data_ptr memory. (And/or read from random memory.)
>>>
>> Have done the range checking for the read/write offset values, which are
>> updated repeatedly by GuC firmware.
>> The buffer size is updated only at init time by GuC firmware, hence less
>> vulnerable.
>>
>> But nevertheless will add the checks.
>
> Ok good.
>
>>>> +
>>>> + if (log_buffer_snapshot_state) {
>>>> + /* First copy the state structure in local buffer */
>>>
>>> Maybe "destination buffer" would be clearer?
>>
>> Sorry my bad, well spotted.
>>
>>>
>>>> + memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>>>> + sizeof(struct guc_log_buffer_state));
>>>> +
>>>> + /* The write pointer could have been updated by the GuC
>>>> + * firmware, after sending the flush interrupt to Host,
>>>> + * for consistency set the write pointer value to same
>>>> + * value of sampled_write_ptr in the snapshot buffer.
>>>> + */
>>>> + log_buffer_snapshot_state->write_ptr =
>>>> + log_buffer_snapshot_state->sampled_write_ptr;
>>>> +
>>>> + log_buffer_snapshot_state++;
>>>> +
>>>> + /* Now copy the actual logs */
>>>> + memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>>>> +
>>>> + src_data_ptr += buffer_size;
>>>> + dst_data_ptr += buffer_size;
>>>> + }
>>>> +
>>>> + /* FIXME: invalidate/flush for log buffer needed */
>>>
>>> Yes no maybe? :)
>>
>> Will like to keep it, if you allow.
>
> I think you need a really good justification to get r-b on patches with
> FIXMEs, especially like this one. Do you maybe handle it or remove it in
> a following patch or something?
>
Have removed this FIXME in the later patch
"[PATCH 17/20] drm/i915: Use uncached(WC) mapping for acessing the GuC
log buffer"
Best regards
Akash
> Regards,
>
> Tvrtko
More information about the Intel-gfx
mailing list