[Intel-gfx] [PATCH 06/20] drm/i915: Handle log buffer flush interrupt event from GuC
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Aug 12 14:07:19 UTC 2016
On 12/08/16 14:45, Goel, Akash wrote:
>
>
> On 8/12/2016 6:47 PM, Tvrtko Ursulin wrote:
>>
>> On 12/08/16 07:25, akash.goel at intel.com wrote:
>>> From: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>
>>> GuC ukernel sends an interrupt to Host to flush the log buffer
>>> and expects Host to correspondingly update the read pointer
>>> information in the state structure, once it has consumed the
>>> log buffer contents by copying them to a file or buffer.
>>> Even if Host couldn't copy the contents, it can still update the
>>> read pointer so that logging state is not disturbed on GuC side.
>>>
>>> v2:
>>> - Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
>>> - Reduce the overall log buffer copying time by skipping the copy of
>>> crash buffer area for regular cases and copying only the state
>>> structure data in first page.
>>>
>>> v3:
>>> - Create a vmalloc mapping of log buffer. (Chris)
>>> - Cover the flush acknowledgment under rpm get & put.(Chris)
>>> - Revert the change of skipping the copy of crash dump area, as
>>> not really needed, will be covered by subsequent patch.
>>>
>>> v4:
>>> - Destroy the wq under the same condition in which it was created,
>>> pass dev_piv pointer instead of dev to newly added GuC function,
>>> add more comments & rename variable for clarity. (Tvrtko)
>>>
>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_drv.c | 14 +++
>>> drivers/gpu/drm/i915/i915_guc_submission.c | 150
>>> +++++++++++++++++++++++++++++
>>> drivers/gpu/drm/i915/i915_irq.c | 5 +-
>>> drivers/gpu/drm/i915/intel_guc.h | 3 +
>>> 4 files changed, 170 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>>> b/drivers/gpu/drm/i915/i915_drv.c
>>> index 0fcd1c0..fc2da32 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -770,8 +770,20 @@ static int i915_workqueues_init(struct
>>> drm_i915_private *dev_priv)
>>> if (dev_priv->hotplug.dp_wq == NULL)
>>> goto out_free_wq;
>>>
>>> + if (HAS_GUC_SCHED(dev_priv)) {
>>
>> This just reminded me that a previous patch had:
>>
>> + if (HAS_GUC_UCODE(dev))
>> + dev_priv->pm_guc_events = GEN9_GUC_TO_HOST_INT_EVENT;
>>
>> In the interrupt setup. I don't think there is a bug right now, but
>> there is a disagreement between the two which would be good to resolve.
>>
>> This HAS_GUC_UCODE in the other patch should probably be HAS_GUC_SCHED
>> for correctness. I think.
>
> Sorry for inconsistency, Will use HAS_GUC_SCHED in the previous patch.
>
> As per Chris's comments will move the wq init/destroy to the GuC logging
> setup/teardown routines (guc_create_log_extras, guc_log_cleanup)
> You are fine with that ?.
Yes thats OK I think.
>>
>>> + /* Need a dedicated wq to process log buffer flush interrupts
>>> + * from GuC without much delay so as to avoid any loss of logs.
>>> + */
>>> + dev_priv->guc.log.wq =
>>> + alloc_ordered_workqueue("i915-guc_log", 0);
>>> + if (dev_priv->guc.log.wq == NULL)
>>> + goto out_free_hotplug_dp_wq;
>>> + }
>>> +
>>> return 0;
>>>
>>> +out_free_hotplug_dp_wq:
>>> + destroy_workqueue(dev_priv->hotplug.dp_wq);
>>> out_free_wq:
>>> destroy_workqueue(dev_priv->wq);
>>> out_err:
>>> @@ -782,6 +794,8 @@ out_err:
>>>
>>> static void i915_workqueues_cleanup(struct drm_i915_private *dev_priv)
>>> {
>>> + if (HAS_GUC_SCHED(dev_priv))
>>> + destroy_workqueue(dev_priv->guc.log.wq);
>>> destroy_workqueue(dev_priv->hotplug.dp_wq);
>>> destroy_workqueue(dev_priv->wq);
>>> }
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index c7c679f..2635b67 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -172,6 +172,15 @@ static int host2guc_sample_forcewake(struct
>>> intel_guc *guc,
>>> return host2guc_action(guc, data, ARRAY_SIZE(data));
>>> }
>>>
>>> +static int host2guc_logbuffer_flush_complete(struct intel_guc *guc)
>>> +{
>>> + u32 data[1];
>>> +
>>> + data[0] = HOST2GUC_ACTION_LOG_BUFFER_FILE_FLUSH_COMPLETE;
>>> +
>>> + return host2guc_action(guc, data, 1);
>>> +}
>>> +
>>> /*
>>> * Initialise, update, or clear doorbell data shared with the GuC
>>> *
>>> @@ -840,6 +849,127 @@ err:
>>> return NULL;
>>> }
>>>
>>> +static void guc_move_to_next_buf(struct intel_guc *guc)
>>> +{
>>> + return;
>>> +}
>>> +
>>> +static void* guc_get_write_buffer(struct intel_guc *guc)
>>> +{
>>> + return NULL;
>>> +}
>>> +
>>> +static void guc_read_update_log_buffer(struct intel_guc *guc)
>>> +{
>>> + struct guc_log_buffer_state *log_buffer_state,
>>> *log_buffer_snapshot_state;
>>> + struct guc_log_buffer_state log_buffer_state_local;
>>> + void *src_data_ptr, *dst_data_ptr;
>>> + u32 i, buffer_size;
>>
>> unsigned int i if you can be bothered.
>
> Fine will do that for both i & buffer_size.
buffer_size can match the type of log_buffer_state_local.size or use
something else if more appropriate.
> But I remember earlier in one of the patch, you suggested to use u32 as
> a type for some variables.
> Please could you share the guideline.
> Should u32, u64 be used we are exactly sure of the range of the
> variable, like for variables containing the register values ?
Depends what the variable is. "i" in this case is just a standard
counter so not appropriate to use u32. It is not that wrong on x86, just
looks a bit out of place.
grep "u32 i;" has no hits in i915. :)
They can/should be used for variables tied with hardware or protocols.
Or in some cases where you really want to save space by using a small type.
>
>>
>>> +
>>> + if (!guc->log.buf_addr)
>>> + return;
>>
>> Can it hit this? If yes, I think better disable GuC logging when pin map
>> on the object fails rather than let it generate interrupts in vain.
>>
>>> +
>>> + /* Get the pointer to shared GuC log buffer */
>>> + log_buffer_state = src_data_ptr = guc->log.buf_addr;
>>> +
>>> + /* Get the pointer to local buffer to store the logs */
>>> + dst_data_ptr = log_buffer_snapshot_state =
>>> guc_get_write_buffer(guc);
>>> +
>>> + /* Actual logs are present from the 2nd page */
>>> + src_data_ptr += PAGE_SIZE;
>>> + dst_data_ptr += PAGE_SIZE;
>>> +
>>> + for (i = 0; i < GUC_MAX_LOG_BUFFER; i++) {
>>> + /* Make a copy of the state structure in GuC log buffer (which
>>> + * is uncached mapped) on the stack to avoid reading from it
>>> + * multiple times.
>>> + */
>>> + memcpy(&log_buffer_state_local, log_buffer_state,
>>> + sizeof(struct guc_log_buffer_state));
>>> + buffer_size = log_buffer_state_local.size;
>>
>> Needs checking (and logging) that a bad firmware or some other error
>> does not report a dangerous size (too big) which would then overwrite
>> memory pointed by dst_data_ptr memory. (And/or read from random memory.)
>>
> Have done the range checking for the read/write offset values, which are
> updated repeatedly by GuC firmware.
> The buffer size is updated only at init time by GuC firmware, hence less
> vulnerable.
>
> But nevertheless will add the checks.
Ok good.
>>> +
>>> + if (log_buffer_snapshot_state) {
>>> + /* First copy the state structure in local buffer */
>>
>> Maybe "destination buffer" would be clearer?
>
> Sorry my bad, well spotted.
>
>>
>>> + memcpy(log_buffer_snapshot_state, &log_buffer_state_local,
>>> + sizeof(struct guc_log_buffer_state));
>>> +
>>> + /* The write pointer could have been updated by the GuC
>>> + * firmware, after sending the flush interrupt to Host,
>>> + * for consistency set the write pointer value to same
>>> + * value of sampled_write_ptr in the snapshot buffer.
>>> + */
>>> + log_buffer_snapshot_state->write_ptr =
>>> + log_buffer_snapshot_state->sampled_write_ptr;
>>> +
>>> + log_buffer_snapshot_state++;
>>> +
>>> + /* Now copy the actual logs */
>>> + memcpy(dst_data_ptr, src_data_ptr, buffer_size);
>>> +
>>> + src_data_ptr += buffer_size;
>>> + dst_data_ptr += buffer_size;
>>> + }
>>> +
>>> + /* FIXME: invalidate/flush for log buffer needed */
>>
>> Yes no maybe? :)
>
> Will like to keep it, if you allow.
I think you need a really good justification to get r-b on patches with
FIXMEs, especially like this one. Do you maybe handle it or remove it in
a following patch or something?
Regards,
Tvrtko
More information about the Intel-gfx
mailing list