[PATCH] drm/amdgpu: fix ocl test performance drop

Christian König deathsimple at vodafone.de
Mon May 22 06:47:33 UTC 2017


See my other comment about simplifying the patch in the other mail:
> [flora]: for compute ring & amdgpu_vm_ring_has_compute_vm_bug(), a
> vm_flush is inserted. This might cause performance drop.
>
> Ah, I see. We only need the pipeline sync, but not the vm flush.
>
> In this case I suggest just removing the following lines from
> amdgpu_vm_flush():
>> -    bool vm_flush_needed = job->vm_needs_flush ||
>> -        amdgpu_vm_ring_has_compute_vm_bug(ring);
>
> We can keep it in amdgpu_vm_need_pipeline_sync().
With that done we can easily push it.

Christian.

Am 22.05.2017 um 05:42 schrieb Flora Cui:
> Could anyone give me an RB?
>
> On Fri, May 19, 2017 at 01:59:16PM +0200, Christian König wrote:
>> I'm pretty sure that bug is fixed in GFX9. So we don't need the check there
>> any more.
>>
>> Regards,
>> Christian.
>>
>> Am 19.05.2017 um 07:16 schrieb zhoucm1:
>>> I also noticed this problem when I did the fix for the sched fence. I
>>> thought that if it caused no issue, it was safer to leave the code as it
>>> was. But as Flora points out, it results in a performance drop, so we
>>> need to correct it.
>>> The pipeline_sync in the vm flush should be inserted only when
>>> amdgpu_vm_ring_has_compute_vm_bug() is true, not every time for every vm
>>> flush on every engine. The bug has been fixed in the MEC firmware since
>>> version 673 on gfx8.
>>> Regarding gfx9 and amdgpu_vm_ring_has_compute_vm_bug(): the current
>>> implementation assumes by default that gfx9 does not have the compute vm
>>> bug. I'm not sure whether the bug is present there. If we see no vm
>>> faults related to this bug, we can assume gfx9 doesn't have it. I believe
>>> the firmware team fixed it based on the previous generation; otherwise we
>>> would need to verify every generation, which would be crazy.
>>>
>>> Regards,
>>> David Zhou
>>> On 2017-05-19 11:36, Flora Cui wrote:
>>>> btw, what about gfx9 for amdgpu_vm_ring_has_compute_vm_bug()? Is the
>>>> workaround still needed?
>>>>
>>>> On Fri, May 19, 2017 at 10:25:19AM +0800, Flora Cui wrote:
>>>>> On Thu, May 18, 2017 at 01:38:15PM +0200, Christian König wrote:
>>>>>> Am 18.05.2017 um 09:45 schrieb Flora Cui:
>>>>>>> Partially revert commit 6971d3d ("drm/amdgpu: cleanup logic in
>>>>>>> amdgpu_vm_flush")
>>>>>>>
>>>>>>> Change-Id: Iadce9d613dfe9a739643a74050cea55854832adb
>>>>>>> Signed-off-by: Flora Cui <Flora.Cui at amd.com>
>>>>>> I don't see how the revert should be faster than the original.
>>>>>>
>>>>>> In particular, amdgpu_vm_had_gpu_reset() is now called twice, which
>>>>>> sounds like more overhead than necessary.
>>>>>>
>>>>>> Please explain further.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> ---
>>>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 +++++---------
>>>>>>>   1 file changed, 5 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> index 88420dc..a96bad6 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>> @@ -743,23 +743,19 @@ int amdgpu_vm_flush(struct amdgpu_ring
>>>>>>> *ring, struct amdgpu_job *job)
>>>>>>>           id->gws_size != job->gws_size ||
>>>>>>>           id->oa_base != job->oa_base ||
>>>>>>>           id->oa_size != job->oa_size);
>>>>>>> -    bool vm_flush_needed = job->vm_needs_flush ||
>>>>>>> -        amdgpu_vm_ring_has_compute_vm_bug(ring);
>>>>>>>       unsigned patch_offset = 0;
>>>>>>>       int r;
>>>>>>> -    if (amdgpu_vm_had_gpu_reset(adev, id)) {
>>>>>>> -        gds_switch_needed = true;
>>>>>>> -        vm_flush_needed = true;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    if (!vm_flush_needed && !gds_switch_needed)
>>>>>>> +    if (!job->vm_needs_flush && !gds_switch_needed &&
>>>>>>> +        !amdgpu_vm_had_gpu_reset(adev, id) &&
>>>>>>> +        !amdgpu_vm_ring_has_compute_vm_bug(ring))
>>>>>>>           return 0;
>>>>>>>       if (ring->funcs->init_cond_exec)
>>>>>>>           patch_offset = amdgpu_ring_init_cond_exec(ring);
>>>>>>> -    if (ring->funcs->emit_vm_flush && vm_flush_needed) {
>>>>> [flora]: for compute ring & amdgpu_vm_ring_has_compute_vm_bug(), a
>>>>> vm_flush is inserted. This might cause performance drop.
>>>>>>> +    if (ring->funcs->emit_vm_flush &&
>>>>>>> +        (job->vm_needs_flush || amdgpu_vm_had_gpu_reset(adev,
>>>>>>> id))) {
>>>>>>>           struct fence *fence;
>>>>>>>           trace_amdgpu_vm_flush(ring, job->vm_id, job->vm_pd_addr);
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
