[PATCH v2] drm/xe: flush gtt before signalling user fence on all engines

Andrzej Hajda andrzej.hajda at intel.com
Mon Jun 3 10:19:49 UTC 2024



On 03.06.2024 11:31, Thomas Hellström wrote:
> On Mon, 2024-06-03 at 10:47 +0200, Thomas Hellström wrote:
>> Hi,
>>
>> On Mon, 2024-06-03 at 10:11 +0200, Andrzej Hajda wrote:
>>>
>>> On 03.06.2024 09:35, Thomas Hellström wrote:
>>>> On Thu, 2024-05-30 at 20:45 +0000, Matthew Brost wrote:
>>>>>> On Thu, May 30, 2024 at 01:17:32PM +0200, Thomas Hellström wrote:
>>>>>> Hi, All.
>>>>>>
>>>>>> I was looking at this patch for drm-xe-fixes but it doesn't look
>>>>>> correct to me.
>>>>>>
>>>>>> First, AFAICT, the "emit flush imm ggtt" means that we're flushing
>>>>>> outstanding / posted writes, and then writing a DW to a ggtt
>>>>>> address, so we're not really "flushing gtt".
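For context, emit_flush_imm_ggtt() boils down to an MI_FLUSH_DW with a
post-sync immediate write, roughly like this (a sketch only, exact flag
names approximate):

	dw[i++] = MI_FLUSH_DW | MI_FLUSH_DW_OP_STOREDW | flags;
	dw[i++] = addr | MI_FLUSH_DW_USE_GTT;	/* GGTT destination */
	dw[i++] = 0;				/* upper address bits */
	dw[i++] = value;	/* immediate DW posted after the flush */

So it flushes the preceding posted writes and then posts the seqno
write; "flush writes" indeed describes it better than "flush gtt".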
>>>>>>
>>>>> So, is this a bad name? I think I agree. It could have been a
>>>>> holdover from the i915 names. Maybe we should do a cleanup in
>>>>> xe_ring_ops soon?
>>>>>
>>>>> Or are you saying that the existing emit_flush_imm_ggtt is not
>>>>> sufficient to ensure all writes from batches are visible? If this
>>>>> were true, I would think we'd have all sorts of problems popping up.
>>>> It was more the title of the patch that says "flush gtt" when I
>>>> think it should say "flush writes" or something similar.
>>>>
>>>>
>>>>>> Second, I don't think we have anything left that explicitly
>>>>>> flushes the posted write of the user-fence value?
>>>>>>
>>>>> I think this might be true. So there could be a case where we get
>>>>> an IRQ and the user fence value is not yet visible?
>>>> Yes, exactly.
>>>>
>>>>> I'm not an expert in ring programming, but are there instructions
>>>>> to store a dword which make it immediately visible? If so, I think
>>>>> that is what should be used.
>>>> There are various options here, using various variants of MI_FLUSH_DW
>>>> and pipe_control, and I'm not sure what would be the most performant,
>>>> but I think the simplest solution would be to revert the patch and
>>>> just emit an additional MI_FLUSH_DW as a write barrier before
>>>> emitting the posted user-fence value.
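If I understand the suggestion correctly, the emission order would then
become something like the below (a sketch only; emit_flush_dw() here is
a hypothetical barrier-only helper, not an existing xe_ring_ops
function):

	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);

	if (job->user_fence.used) {
		/* write barrier: flush the workload's posted writes */
		i = emit_flush_dw(dw, i);
		i = emit_store_imm_ppgtt_posted(job->user_fence.addr,
						job->user_fence.value,
						dw, i);
	}

	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno,
				false, dw, i);

	i = emit_user_interrupt(dw, i);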
>>> As the patch already landed I have posted fix for it:
>>> https://patchwork.freedesktop.org/series/134354/
>>>
>>> Regards
>>> Andrzej
>> I'm still concerned about the userptr write happening after the
> s/userptr/user-fence/
>
> /Thomas
>
>> regular seqno write. Let's say the user requests a userptr write to
>> a bo.
>>
>> 1) The LRC fence signals.
>> 2) Bo is evicted / userptr invalidated. Pages returned to system.
>> 3) The user-fence writes to pages that we no longer own, or causes
>> a fault.

I am not a user-fence expert, but shouldn't it be the user's
responsibility to keep the user-fence memory valid for as long as the
kernel may use it? Assuming that this responsibility goes away just
because the fence is located in some special area seems quite strange
to me.

Regards
Andrzej

>>
>> /Thomas
>>
>>>>>
>>>>> We should probably also check how downstream i915 did this.
>>>>>
>>>>>> and finally, the seqno fence now gets flushed before the
>>>>>> user-fence. Perhaps that's not a bad thing, though.
>>>>>>
>>>>> I don't think this is an issue; I can't think of a case where this
>>>>> reordering would create a problem.
>>>>>
>>>>> Matt
>>>> /Thomas
>>>>
>>>>>    
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>> On Wed, 2024-05-22 at 09:27 +0200, Andrzej Hajda wrote:
>>>>>>> Tests show that user fence signalling requires a kind of write
>>>>>>> barrier, otherwise not all writes performed by the workload will
>>>>>>> be available to userspace. It is already done for render and
>>>>>>> compute; we need it also for the rest: video, gsc, copy.
>>>>>>>
>>>>>>> v2: added gsc and copy engines, added fixes and r-b tags
>>>>>>>
>>>>>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1488
>>>>>>> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
>>>>>>> Signed-off-by: Andrzej Hajda <andrzej.hajda at intel.com>
>>>>>>> Reviewed-by: Matthew Brost <matthew.brost at intel.com>
>>>>>>> ---
>>>>>>> Changes in v2:
>>>>>>> - Added fixes and r-b tags
>>>>>>> - Link to v1:
>>>>>>> https://lore.kernel.org/r/20240521-xu_flush_vcs_before_ufence-v1-1-ded38b56c8c9@intel.com
>>>>>>> ---
>>>>>>> Matthew,
>>>>>>>
>>>>>>> I have extended the patch to the copy and gsc engines. I have
>>>>>>> kept your r-b since the change is similar; I hope that is OK.
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/xe/xe_ring_ops.c | 8 ++++----
>>>>>>>    1 file changed, 4 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>> index a3ca718456f6..a46a1257a24f 100644
>>>>>>> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>> @@ -234,13 +234,13 @@ static void __emit_job_gen12_simple(struct xe_sched_job *job, struct xe_lrc *lrc
>>>>>>>  
>>>>>>>  	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);
>>>>>>>  
>>>>>>> +	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno, false, dw, i);
>>>>>>> +
>>>>>>>  	if (job->user_fence.used)
>>>>>>>  		i = emit_store_imm_ppgtt_posted(job->user_fence.addr,
>>>>>>>  						job->user_fence.value,
>>>>>>>  						dw, i);
>>>>>>>  
>>>>>>> -	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno, false, dw, i);
>>>>>>> -
>>>>>>>  	i = emit_user_interrupt(dw, i);
>>>>>>>  
>>>>>>>  	xe_gt_assert(gt, i <= MAX_JOB_SIZE_DW);
>>>>>>> @@ -293,13 +293,13 @@ static void __emit_job_gen12_video(struct xe_sched_job *job, struct xe_lrc *lrc,
>>>>>>>  
>>>>>>>  	i = emit_bb_start(batch_addr, ppgtt_flag, dw, i);
>>>>>>>  
>>>>>>> +	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno, false, dw, i);
>>>>>>> +
>>>>>>>  	if (job->user_fence.used)
>>>>>>>  		i = emit_store_imm_ppgtt_posted(job->user_fence.addr,
>>>>>>>  						job->user_fence.value,
>>>>>>>  						dw, i);
>>>>>>>  
>>>>>>> -	i = emit_flush_imm_ggtt(xe_lrc_seqno_ggtt_addr(lrc), seqno, false, dw, i);
>>>>>>> -
>>>>>>>  	i = emit_user_interrupt(dw, i);
>>>>>>>  
>>>>>>>  	xe_gt_assert(gt, i <= MAX_JOB_SIZE_DW);
>>>>>>>
>>>>>>> ---
>>>>>>> base-commit: 188ced1e0ff892f0948f20480e2e0122380ae46d
>>>>>>> change-id: 20240521-xu_flush_vcs_before_ufence-a7b45d94cf33
>>>>>>>
>>>>>>> Best regards,


