[PATCH 19/22] drm/amdgpu: adjust timeout for ib_ring_tests
Alex Deucher
alexdeucher at gmail.com
Mon Feb 26 15:55:50 UTC 2018
On Mon, Feb 26, 2018 at 12:18 AM, Monk Liu <Monk.Liu at amd.com> wrote:
> issue:
> sometimes the GFX/MM IB test hits a timeout under SR-IOV. The root cause
> is that the engine doesn't come back soon enough, so the current
> IB test is considered timed out.
>
> fix:
> for the SR-IOV GFX IB test, the wait time needs to be expanded a lot in
> SR-IOV runtime mode, since the test can't really begin before the GFX
> engine comes back.
>
> for the SR-IOV MM IB test, more time is always needed: MM scheduling
> does not go together with the GFX engine but is controlled by the h/w MM
> scheduler, so in both runtime and exclusive mode the MM IB test
> always needs more time.
>
> Change-Id: I0342371bc073656476ad850e1f5d9a021846dc8c
> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 30 +++++++++++++++++++++++++++++-
> 1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 4709d13..d6776286 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -316,14 +316,42 @@ int amdgpu_ib_ring_tests(struct amdgpu_device *adev)
>  {
>          unsigned i;
>          int r, ret = 0;
> +        long tmo_gfx, tmo_mm;
> +
> +        tmo_mm = tmo_gfx = AMDGPU_IB_TEST_TIMEOUT;
> +        if (amdgpu_sriov_vf(adev)) {
> +                /* On the hypervisor side, MM engines are not scheduled together
> +                 * with the CP and SDMA engines, so even in exclusive mode an MM
> +                 * engine could still be running on another VF. The IB test timeout
> +                 * for MM engines under SR-IOV should therefore be set to a long time.
> +                 */
> +                tmo_mm = 8 * AMDGPU_IB_TEST_TIMEOUT; /* 8 sec should be enough for MM to come back to this VF */
> +        }
> +
> +        if (amdgpu_sriov_runtime(adev)) {
> +                /* The CP and SDMA engines are scheduled together, so in
> +                 * RUNTIME mode only, make the timeout wide enough to cover
> +                 * the time spent waiting for them to come back.
> +                 */
> +                tmo_gfx = 8 * AMDGPU_IB_TEST_TIMEOUT;
> +        }
> +
> +        adev->accel_working = true;
This change seems unrelated.
>
>          for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>                  struct amdgpu_ring *ring = adev->rings[i];
> +                long tmo;
>
>                  if (!ring || !ring->ready)
>                          continue;
>
> -                r = amdgpu_ring_test_ib(ring, AMDGPU_IB_TEST_TIMEOUT);
> +                /* MM engines need more time */
> +                if (ring->idx > 11)
Please check ring type here rather than the idx since the idx may vary
based on the number of IPs on the SOC.
Alex
> +                        tmo = tmo_mm;
> +                else
> +                        tmo = tmo_gfx;
> +
> +                r = amdgpu_ring_test_ib(ring, tmo);
>                  if (r) {
>                          ring->ready = false;
>
> --
> 2.7.4
>
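
The suggestion to check the ring type rather than the index could look roughly like the sketch below. It is only an illustration, not the revised patch: the enum and structs are stand-ins defined here so the snippet compiles outside the kernel tree, the enum names and the `ring->funcs->type` field are assumed to mirror what the driver exposes (verify against amdgpu_ring.h), and `ring_is_mm()` and `ring_ib_test_timeout()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in definitions so the sketch builds on its own. The names are
 * assumed to mirror amdgpu's ring types; this is not the driver header. */
enum amdgpu_ring_type {
	AMDGPU_RING_TYPE_GFX,
	AMDGPU_RING_TYPE_COMPUTE,
	AMDGPU_RING_TYPE_SDMA,
	AMDGPU_RING_TYPE_UVD,
	AMDGPU_RING_TYPE_VCE,
	AMDGPU_RING_TYPE_KIQ,
	AMDGPU_RING_TYPE_UVD_ENC,
	AMDGPU_RING_TYPE_VCN_DEC,
	AMDGPU_RING_TYPE_VCN_ENC
};

struct amdgpu_ring_funcs {
	enum amdgpu_ring_type type;
};

struct amdgpu_ring {
	const struct amdgpu_ring_funcs *funcs;
	unsigned idx;
};

/* Hypothetical helper: returns 1 for multimedia (MM) rings, 0 otherwise.
 * The decision depends only on the ring type, never on ring->idx. */
static inline int ring_is_mm(const struct amdgpu_ring *ring)
{
	assert(ring != NULL && ring->funcs != NULL);

	switch (ring->funcs->type) {
	case AMDGPU_RING_TYPE_UVD:
	case AMDGPU_RING_TYPE_VCE:
	case AMDGPU_RING_TYPE_UVD_ENC:
	case AMDGPU_RING_TYPE_VCN_DEC:
	case AMDGPU_RING_TYPE_VCN_ENC:
		return 1;
	default:
		return 0;
	}
}

/* Hypothetical helper: picks the IB test timeout for one ring, given the
 * two timeouts computed earlier in amdgpu_ib_ring_tests(). */
static inline long ring_ib_test_timeout(const struct amdgpu_ring *ring,
					long tmo_gfx, long tmo_mm)
{
	return ring_is_mm(ring) ? tmo_mm : tmo_gfx;
}
```

With helpers like these, the loop body would reduce to something like `tmo = ring_ib_test_timeout(ring, tmo_gfx, tmo_mm);`, and the result no longer depends on how many IP blocks (and therefore rings) a given SoC has.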